boostorg / beast

HTTP and WebSocket built on Boost.Asio in C++11
http://www.boost.org/libs/beast
Boost Software License 1.0
4.32k stars 635 forks source link

websocket handshake differs from curl output, fails with obscure server #548

Closed larsonmpdx closed 7 years ago

larsonmpdx commented 7 years ago

beast version 67

note I have replaced the host with [host] throughout, I can privately share the real host to help testing

I'm using this curl command to handshake with a server based on a pascal websocket server (http://www.esegece.com/websockets/)

curl -v -i -N -H "Connection: Upgrade" -H "Upgrade: websocket" -H "Host: [host]" -H "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==" -H "Sec-WebSocket-Version: 13" https://[host]

I am trying to do the same handshake with beast, even copying the curl headers:

        beast::websocket::response_type res;
        ws.handshake_ex(res, host, "/",
                        [](beast::websocket::request_type &m)
                        {
                            m.insert(beast::http::field::accept, "*/*");
                            m.insert(beast::http::field::user_agent, "curl/7.47.0");
                            std::cout << m << std::endl;
                        },
                        ec);

        std::cout << res << std::endl;

the server accepts the curl connection but rejects the beast connection. the server's logs show that the headers are received with extra newlines or something like newlines in the case of beast:

server logs for curl connection:

Recv: GET / HTTP/1.1
Host: [host]
User-Agent: curl/7.47.0
Accept: */*
Connection: Upgrade
Upgrade: websocket
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

Sent: HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Server: sgcWebSockets 3.5

server logs for beast:

Recv: GET
Recv:  /
Recv:  HTTP/1.1

Recv: Host: [host]

Recv: Upgrade: websocket

Recv: Connection: upgrade

Recv: Sec-WebSocket-Key: lD7oPd7/ryIbKDcgimpDAA==

Recv: Sec-WebSocket-Version: 13

Recv: Accept: */*

Recv: User-Agent: curl/7.47.0

Recv: 

Sent: HTTP/1.1 302 Moved Temporarily
Connection: close
Content-Length: 0
Date: Tue, 27 Jun 2017 17:01:11 GMT
Server: sgcWebSockets 3.5
Location: login
Set-Cookie: IDHTTPSESSIONID=ANv2nTA6vZ8PuVx; Path=/

I also tested beast using mitmproxy (https://mitmproxy.org/) instead of connecting to the server directly, and it works correctly in this case, as I assume mitmproxy is putting the headers together. I assume my problem has to do with the extra newlines in the beast output. This may be standards-compliant (I don't know), but it's making my server unhappy. Is there a way to change this?

vinniefalco commented 7 years ago

https://github.com/vinniefalco/Beast/compare/sgc...larsonmpdx:sgc?expand=1

This link doesn't work for me (its local to your GitHub login session), you need to use something like this https://github.com/larsonmpdx/Beast/commits/sgc

the template complains:

When I say 'remove the multi buffer' I mean remove it from the call site but also remove the DynamicBuffer template parameter from the class template and everywhere its used in the implementation!

do I need to change the read_some_op() stuff or delete it?

You will need a write_some_op so you can take the read_some_op and rename it and empty out the implementation of operator(), remove whatever you need to in order to make it compile.

Probably put your name at the top, Copyright 2017 and your email if you want, or your GitHub profile link (or nothing)

The include guard is named BEAST_EXAMPLE_COMMON/FLAT_WRITE_STREAM_HPP

Don't include multi_buffer.hpp

Most everything in there is indented one unnecessary level to the right

You don't need the buffer or capacity member functions

The read functions look okay.

The write functions you will need to define outside the class since they will be many lines (and we don't want them to be implicitly marked inline).

Add the signature for the function I gave you find_single and implement it. Now we will edit the synchronous write function to have this logic:

  1. If buffer_size(buffers)>0 call find_single on the ConstBufferSequence

  2. If buffer_size(buffers) == 0 or find_single returned an iterator just call next_layer_.write with a value of buffers or boost::asio::const_buffers_1{*iter} (the iterator you found) and return the result. This is the efficient/transparent path.

  3. Otherwise, if buffer_size(buffers) <= max_stack_size where max_stack_size is a private class constant set to something like, idk. 4KB, call a private function which declares a stack variable char buf[max_stack_size] and uses boost::asio::buffer_copy to flatten the input into the variable. Then call next_layer_.write() on that variable and return the result.

  4. If none of the above then use std::unique_ptr<char[]> buf{new char[buffer_size(buffers)]} to get a temporary piece of memory and boost::asio::buffer_copy on the input into this temporary memory. Then call next_layer_.write() on that temp memory and return the result. You will want to wrap the call to new in a try/catch and convert the exception into an error_code.

Once you have this compiling and working, you should be able to connect to the sgc server with no trouble. Publish the work, and I will help you write the asynchronous version which lives in the write_some_op. It uses the same logic as the synchronous but we have to allocate the memory using the allocation hook and also manage the final completion handler. And of course there is the "inversion of control" that comes with callback style programming.

We can add a test for the both the synchronous and asynchronous versions which uses the test::pipe class which simulates a socket. And then this will be part of Beast! How exciting!

larsonmpdx commented 7 years ago

ok, makes sense. I rarely write templates so this is a bit of a, uh, growth opportunity.

yeah, I largely ignore formatting these days so I've gotten bad at it. I just define a .clang-format file and run clang-format before checkin. It should be easy enough to copy the existing formatting

the rest of that is detailed enough I think I can paint-by-numbers. thanks for letting me work through it

vinniefalco commented 7 years ago

thanks for letting me work through it

Yep! I thought this would be good. The composed operation is the tricky part, there's a tutorial in the documentation although this composed operation is simple enough that it does not need the handler_ptr. I will help on any trouble spots. Good luck!

larsonmpdx commented 7 years ago

I put together enough to test a sync call below max stack size: https://github.com/larsonmpdx/Beast/commit/f71467173a4b840875275067b48320d0242d827f

I left some debugging printlines in there, the output is:

find_single working on 1 buffers
found a single nonzero buffer
find_single working on 1 buffers
found a single nonzero buffer
find_single working on 1 buffers
found a single nonzero buffer
find_single working on 1 buffers
found a single nonzero buffer

so it's being called with the individual pieces and the output is unchanged. Can you take a look?

vinniefalco commented 7 years ago
vinniefalco commented 7 years ago

oops... found the problem. I got the order of the wrappers wrong. The flat_write_stream goes on the outside and the ssl::stream goes on the inside. Here's a branch that works: https://github.com/vinniefalco/Beast/commits/sgc2

larsonmpdx commented 7 years ago

awesome, thanks

larsonmpdx commented 7 years ago

hey, it works now! I'll implement the rest of your suggestions and do some cleanup and try tests/async

vinniefalco commented 7 years ago

Sounds like this is resolved, thanks! If the problem resurfaces feel free to open a new issue (or re-open this one)

larsonmpdx commented 7 years ago

yes, thanks! working sync functions are enough for me for now. I do plan to finish this but my schedule is full till August 23

vinniefalco commented 7 years ago

Could I talk you into offering a quick Boost formal review? It would take less than an hour, all you have to do is subscribe to the mailing list and make a post with your experiences with the library. The questions to answer are:

It would mean a lot. Sign up here: http://www.boost.org/community/groups.html#main

Use the same email to send the message that you use to sign up

larsonmpdx commented 7 years ago

I am back. sgcwebsockets was updated and 4.1.1 has a fix note - http://www.esegece.com/history/sgcWebSockets.txt but when I tested this with a branch with and without flat_write_buffer, it appears to not be fixed

I updated flat_write_stream.hpp with the new APIs and put it here with sync and coro examples:

https://github.com/larsonmpdx/Beast/commit/3efcbbd155b2f7e4dd5f7cf9ed88f102f2ca8a4b

can you help me understand what's needed for the async bits? here is your old comment:

It uses the same logic as the synchronous but we have to allocate the memory using the allocation hook and also manage the final completion handler. And of course there is the "inversion of control" that comes with callback style programming.

P.S. I had trouble getting things building using boost 1.62 without adding some things to cmakelists (see that commit). am I misunderstanding something now that it's slated for boost?

vinniefalco commented 7 years ago

You shouldn't need your own class anymore if they fixed the bug.

What version of Beast are you using?

larsonmpdx commented 7 years ago

I think they didn't fix the bug, or fixed it incorrectly

I rebased to 108

vinniefalco commented 7 years ago

CMakeLists.txt in the Boost version of Beast only works with Visual Studio, and you have to put beast "in-tree". That means, you have to put beast at $BOOST_ROOT/libs/beast

larsonmpdx commented 7 years ago

ah, ok. the hacked-in bits are working OK so I might just wait till the next boost comes out

vinniefalco commented 7 years ago

You shouldn't need Beast's CMakeLists.txt to use Beast in your own projects. Anyway, Beast should be available in Boost 1.66.0 and later :)

larsonmpdx commented 7 years ago

so I edited async_write_some to basically do the same thing as write_some, and it seems to work ok. what's the deal with the allocator and final completion handler you mentioned?

vinniefalco commented 7 years ago

Did you push your code? I don't see your composed operation

larsonmpdx commented 7 years ago

I pushed after you looked, see here also with some cleanup. I am not handling the allocation error, just returning

https://github.com/larsonmpdx/Beast/commit/d55fa52ce152ffef6b9fda2c5cc35ff609e7dd53

it does work for me. I still need to learn more about boost coroutines, I am getting an error with my own code. thanks for the great async and coroutine examples btw

vinniefalco commented 7 years ago

The code that you posted is clearly incorrect and also produces undefined behavior. You have no composed operation! write_some_op is never constructed or invoked. The implementation of flat_write_stream::async_write_some needs to construct an object of type write_some_op and invoke it. The instance of write_some_op needs to hold all of the state information (especially the unique_ptr) which must persist until the handler is invoked. Follow the example code from the documentation: http://www.boost.org/doc/libs/develop/libs/beast/doc/html/beast/using_io/writing_composed_operations.html

larsonmpdx commented 7 years ago

obviously I was not understanding this, haha. that's good documentation, I'll fix things and report back

larsonmpdx commented 7 years ago

I did some reading, here are some more references I found:

asio discussion: https://sourceforge.net/p/asio/mailman/asio-users/thread/CADH4Q2gCUqt6fnCV51R97HsSeTSZwKP1ban3YWG2YbPqgn_Ebw%40mail.gmail.com/#msg35996833

you reference this project (homekit layer on beast): https://github.com/djarek/gabia/blob/master/include/gabia/secure_socket.hpp

detect_ssl.hpp has a similar structure: https://github.com/boostorg/beast/blob/master/example/common/detect_ssl.hpp

I still don't understand the overall flow here. the actual work being done should be trivial but there's lots of boilerplate and I'm having trouble figuring out the templates, asio, and beast

when I instrument the code (and have it set up to not touch any buffers/ change anything) I see this call sequence:

constructor continuation allocate deallocate handler_invoke operator() with 3 bytes transferred

in looking at the examples they have a state machine set up in operator(). but at this point I'm already seeing 3 bytes transferred and it's too late to combine the buffer - right? Where should the buffer combining code be placed?

vinniefalco commented 7 years ago

You combine the buffer when you construct the composed operation and then in operator() just write it.

larsonmpdx commented 7 years ago

asio is kinda hard! I put this together but there is some cargo cult stuff I don't get, can you take a look?

https://github.com/larsonmpdx/Beast/pull/2/files#diff-847b4ba38b786001f950446e30b1ef79

I'm not using a state/ state pointer helper inside the composed op, just shared_ptr for the buffer and pass-by-value for the rest. does this make sense?

I can't figure out the syntax for the next_layer async_teardown() in case of allocation error:

https://github.com/larsonmpdx/Beast/pull/2/files#diff-847b4ba38b786001f950446e30b1ef79R425

my compiler tells me "‘class boost::asio::ssl::stream<boost::asio::basic_stream_socket&>’ has no member named ‘async_teardown’" and I don't know what to put for the role_type field

vinniefalco commented 7 years ago

asio is kinda hard!

Yes, writing a composed asynchronous operation is not for beginners. I'm glad that you're attempting it though.