meetecho / janus-gateway

Janus WebRTC Server
https://janus.conf.meetecho.com
GNU General Public License v3.0

Segmentation fault when wss and admin wss are enabled #913

Closed tgabi333 closed 7 years ago

tgabi333 commented 7 years ago

When I try to connect to swss, the connection hangs and the janus process quits with a segmentation fault.

Config:

Actions:

After admin_wss is turned off, it works. Using libwebsockets 2.2.1

Logs: there is no relevant log event.

Stack trace: https://pastebin.com/aHMfj8mY

lminiero commented 7 years ago

Looks like a libwebsockets issue. Definitely not happening to me, but I'm on an older version. You should try to compile a debug version of the library, because the stacktrace is useless as it is.

tgabi333 commented 7 years ago

Tested libwebsockets: 1.5, 1.6, 1.7.9

Failed: 2.0.0, 2.1.0, 2.2.0, 2.2.1

sdrsdr commented 7 years ago

Same here with libwebsockets-2.1.1 :( Also, sometimes it dumps core and sometimes it just hangs. I'll try to get a proper stack trace.

sdrsdr commented 7 years ago

backtrace:

Thread 29 "sws thread" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffc63c6700 (LWP 30973)]
0x00007fffc6bdc6c2 in lws_ssl_server_name_cb (ssl=0x7fffc0000ab0, ad=0x7fffc63c5810, arg=0x0) at /var/tmp/portage/net-libs/libwebsockets-2.1.1/work/libwebsockets-2.1.1/lib/ssl-server.c:171
171     /var/tmp/portage/net-libs/libwebsockets-2.1.1/work/libwebsockets-2.1.1/lib/ssl-server.c: No such file or directory.
(gdb) backtrace 
#0  0x00007fffc6bdc6c2 in lws_ssl_server_name_cb (ssl=0x7fffc0000ab0, ad=0x7fffc63c5810, arg=0x0) at /var/tmp/portage/net-libs/libwebsockets-2.1.1/work/libwebsockets-2.1.1/lib/ssl-server.c:171
#1  0x00007ffff6e48a10 in ssl_parse_clienthello_tlsext () from /usr/lib64/libssl.so.1.0.0
#2  0x00007ffff6e2e9d7 in ssl3_get_client_hello () from /usr/lib64/libssl.so.1.0.0
#3  0x00007ffff6e33377 in ssl3_accept () from /usr/lib64/libssl.so.1.0.0
#4  0x00007ffff6e42e1f in ssl23_accept () from /usr/lib64/libssl.so.1.0.0
#5  0x00007fffc6bdc233 in lws_server_socket_service_ssl (wsi=0x7fffc00008c0, accept_fd=7) at /var/tmp/portage/net-libs/libwebsockets-2.1.1/work/libwebsockets-2.1.1/lib/ssl.c:608
#6  0x00007fffc6be5083 in lws_adopt_socket_vhost (vh=0x6fbb20, accept_fd=7) at /var/tmp/portage/net-libs/libwebsockets-2.1.1/work/libwebsockets-2.1.1/lib/server.c:1595
#7  0x00007fffc6be5b2f in lws_server_socket_service (context=0x702890, wsi=0x6f4440, pollfd=0x70c9e8) at /var/tmp/portage/net-libs/libwebsockets-2.1.1/work/libwebsockets-2.1.1/lib/server.c:1946
#8  0x00007fffc6bd3531 in lws_service_fd_tsi (context=0x702890, pollfd=0x70c9e8, tsi=0) at /var/tmp/portage/net-libs/libwebsockets-2.1.1/work/libwebsockets-2.1.1/lib/service.c:828
#9  0x00007fffc6be1193 in lws_plat_service_tsi (context=0x702890, timeout_ms=50, tsi=0) at /var/tmp/portage/net-libs/libwebsockets-2.1.1/work/libwebsockets-2.1.1/lib/lws-plat-unix.c:184
#10 0x00007fffc6be122f in lws_plat_service (context=0x702890, timeout_ms=50) at /var/tmp/portage/net-libs/libwebsockets-2.1.1/work/libwebsockets-2.1.1/lib/lws-plat-unix.c:204
#11 0x00007fffc6bd3a67 in lws_service (context=0x702890, timeout_ms=50) at /var/tmp/portage/net-libs/libwebsockets-2.1.1/work/libwebsockets-2.1.1/lib/service.c:1139
#12 0x00007fffc6df3faf in janus_websockets_thread (data=0x702890) at transports/janus_websockets.c:998
#13 0x00007ffff73045d5 in ?? () from /usr/lib64/libglib-2.0.so.0
#14 0x00007ffff5e3c374 in start_thread () from /lib64/libpthread.so.0
#15 0x00007ffff5b8760f in clone () from /lib64/libc.so.6
(gdb)

lminiero commented 7 years ago

Again, libwebsockets issue. It happens deep within the library code, and in crypto related code, so not sure what you want me to do. Either use an older library version, or use plain websockets in Janus and proxy them to WSS in nginx/httpd/haproxy/whatever.

sdrsdr commented 7 years ago

I'll be happy with a notice in the config file saying "enabling the wss admin interface with libwebsockets > 2 is reported unstable, leading to crashes due to libwebsockets internal issues. Test with versions above 2.2.1 and report any success" :)

tgabi333 commented 7 years ago

Or just say in the documentation that the latest supported version of libwebsockets is 1.7.9

sdrsdr commented 7 years ago

Bah, documentation :) Who reads documentation nowadays 😸 To hit this issue you need to modify the default config, so there is a better chance of noticing it there.

tgabi333 commented 7 years ago

@sdrsdr I built Janus from the documentation, I think everyone follows that :)

lminiero commented 7 years ago

Fair point. Just FYI, I'm on 2.0 and it works fine for me, so it's not "1.7 or nothing". I'll keep the issue open so that I can test myself, with an updated lws version, when I have some time to look into this.

sdrsdr commented 7 years ago

@tgabi333: I read the documentation on building up until the dependency list :) then it's try-err until it compiles with the feature I need, yet I've read all the lines around the features I've tried to enable in my configs later

lws-team commented 7 years ago

It's probably wise to keep an open mind since lws works just fine with its own test apps for this.

Also if the bt earlier is supposed to be for 2.2.1, the line # does not correspond to something capable of causing a segfault...

https://github.com/warmcat/libwebsockets/blob/v2.2.1/lib/ssl-server.c#L171

So what is that about?

lminiero commented 7 years ago

@lws-team never meant to throw it on you, of course... I've interacted with you guys on your repo more than once, and you've always been very responsive and helpful. I thought of a lws issue as I've seen a couple of other projects getting what looked like the same issue (I think one was mosquitto but not sure). Anyway, as I said I plan to update my installation of lws when I have some time to investigate (I'm too busy to do that right now), as somebody else told me about the test apps working too, so it might indeed be a problem in how we use the library.

The only SSL related thing we might do, is passing the certificates to the stack. Can it be something changed there that might cause issues? At the moment we pass, for both cert and key, the path as a string that is then deallocated after the initialization of our module is done. This is definitely working in my setup, and was working before, which makes me think at the time the path string was copied internally, and then the copy used. Maybe in more recent versions the string is used as is, and so we should pass a strdup-ed string instead to ssl_cert_filepath and ssl_private_key_filepath?
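A minimal sketch of the strdup-style idea floated above, assuming the library might keep the pointers it is given rather than copying them (only a hypothesis at this point in the thread). The `tls_paths` struct and the helper names are hypothetical stand-ins, not Janus or lws code; the copies are what would be assigned to ssl_cert_filepath and ssl_private_key_filepath.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for owned copies of the two TLS paths. */
struct tls_paths {
    char *cert; /* would be passed as ssl_cert_filepath */
    char *key;  /* would be passed as ssl_private_key_filepath */
};

/* Portable equivalent of strdup(): the copy does not depend on the source buffer. */
static char *dup_string(const char *s) {
    assert(s != NULL);
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Duplicate both paths; returns 0 on success, -1 on allocation failure. */
static int tls_paths_init(struct tls_paths *p, const char *cert, const char *key) {
    p->cert = dup_string(cert);
    p->key = dup_string(key);
    if (p->cert == NULL || p->key == NULL) {
        free(p->cert);
        free(p->key);
        p->cert = p->key = NULL;
        return -1;
    }
    return 0;
}

/* Release the copies once the server that uses them has been shut down. */
static void tls_paths_free(struct tls_paths *p) {
    free(p->cert);
    free(p->key);
    p->cert = p->key = NULL;
}
```

With copies like these, destroying the parsed configuration no longer affects the paths, at the cost of having to free them when the transport shuts down.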

lws-team commented 7 years ago

Understood... it needs a bt with the exact lws version stated (and probably a debug build with optimizations disabled, to understand where it blows).

It can be an issue with lws, but then the explanation for that has to cover why lwsws and the test server, and other users, don't have the problem. The fact that you're running a later lws OK while the guy with the problem isn't is also suspicious; maybe the issue is on his side.

Actually at the point in the bt near where it dies we are out of the ssl lib and in an lws callback whose job is to figure out which vhost the client wants. Generally there is no problem there (eg https://warmcat.com and https://libwebsockets.org are vhosts on the same server served by lwsws using that code). If the guy with the problem can get a better bt it should guide us better.

lminiero commented 7 years ago

One thing you guys ( @tgabi333 @sdrsdr ) can do as a quick check, is commenting out the janus_config_destroy at this line here:

https://github.com/meetecho/janus-gateway/blob/master/transports/janus_websockets.c#L708

This will make sure the strings we passed to the lws stack as crypto paths are not deallocated (it will cause a leak but not relevant right now). If that doesn't crash for you, it might mean the problem might be what I described before. If not, we'll have to look elsewhere.

tgabi333 commented 7 years ago

thanks @lminiero i will look at it in the following days

sdrsdr commented 7 years ago

my stack trace points to https://github.com/warmcat/libwebsockets/blob/v2.1.1/lib/ssl-server.c#L171 and the main suspect is context being NULL

sdrsdr commented 7 years ago

In successful connections with wss disabled I get

[2017/06/19 15:16:09:0316] ERR: SNI: Unknown ServerName: sdrdev.shelly.cloud

This might be related, as the code in question seems to deal with virtual hosts. I have

127.0.0.1       noah sdrdev.shelly.cloud localhost

in my /etc/hosts so I can test properly with our *.shelly.cloud certificate

I hope this helps.

sdrsdr commented 7 years ago

Not freeing configs seems not to help :(

(gdb) list janus_websockets.c:708
703                                     }
704                                     g_free(ip);
705                             }
706                     }
707             }
708             //janus_config_destroy(config);
709             config = NULL;
710             if(!wss && !swss && !admin_wss && !admin_swss) {
711                     JANUS_LOG(LOG_WARN, "No WebSockets server started, giving up...\n");
712                     return -1;      /* No point in keeping the plugin loaded */
(gdb) backtrace 
#0  0x00007fffc6bdc6c2 in lws_ssl_server_name_cb (ssl=0x7fffc0000ab0, ad=0x7fffc63c5840, arg=0x0) at /var/tmp/portage/net-libs/libwebsockets-2.1.1/work/libwebsockets-2.1.1/lib/ssl-server.c:171

lminiero commented 7 years ago

To correct myself: I'm on 2.1.0, not 2.0, the version my Fedora 25 ships, and secure websockets work just fine for me. Tested with this small JavaScript console snippet:

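// Assumption: the line creating the socket is missing from the snippet as captured here; the URL is a placeholder.
var ws = new WebSocket("wss://localhost:8989", "janus-protocol");
ws.onmessage = function(msg) { console.log(msg.data); };
ws.onopen = function() { console.log("Connected"); ws.send(JSON.stringify({janus: "info", transaction: "123"})); };
ws.onclose = function() { console.log("Closed"); };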

If the strings are not the cause, then I really don't know what might be... I'll try installing the lws master to see if enabling wss and using the above snippet will break it for me.

lminiero commented 7 years ago

Just installed 2.2.1 and it works just fine for me. I'm not doing any virtualhost stuff so that may be it.

lws-team commented 7 years ago

@sdrsdr it looks like it typically just works in Janus. And I know it just works in the lws test apps / lwsws provided with lws.

sdrsdr commented 7 years ago

I suspect some sort of stack protection or compiler optimization at play here. I am running with -O0 -g but I still get this error. Also my gcc is 5.4, which is not very fancy in 2017. I was planning to run the tests from lws and see what I get. I'm not sure if I use vhosts as I'm not familiar with the meetecho code yet. I was planning a full step-by-step debug of this code as it seems like a bug that needs to be fixed. Yet this will be done later as it seems non-critical to my current task.

lws-team commented 7 years ago

It does feel like it's something specific on your side, platform, toolchain or just your code around it. People are still building lws on gcc3 and gcc4 so it shouldn't be gcc5 in general.

Anyway if you later dig up something specific in the crash that relates to lws I will be happy to read about it.

lminiero commented 7 years ago

Any update or new test result on this? We don't specify/use any vhost thing in our plugin, so I wouldn't know if there's anything we do (or don't do) that might result in a problem there. Are we supposed to clear some fields when initializing the contexts, for instance?

lws-team commented 7 years ago

You are asking about lws? Look to the lws test apps in lws to see what you should do... if your info struct is coming from the stack, yes you must memset it to 0 first so the fields you don't care about initializing are at a reasonable default.

lminiero commented 7 years ago

It was more a question for @sdrsdr as he said he wanted to make some more tests, and so I was interested to know if he found out more. Thanks for the clarification, though! About the info struct, we're already memsetting the lws_context_creation_info instance to 0 before filling in the fields and calling lws_create_context. Just to give you an idea, this snippet here is how we initialize a secure websockets server for versions >= 2.0:

    struct lws_context_creation_info info;
    memset(&info, 0, sizeof info);
    info.port = wsport;
    info.iface = ip ? ip : interface;
    info.protocols = swss_protocols;
    info.extensions = NULL;
    info.ssl_cert_filepath = server_pem;
    info.ssl_private_key_filepath = server_key;
    info.gid = -1;
    info.uid = -1;
    info.options = LWS_SERVER_OPTION_DO_SSL_GLOBAL_INIT;
    swss = lws_create_context(&info);

Anything weird here popping to the eye? Most of the work on the plugin was done a long time ago and then I updated it along the way, with #ifdefs and the like, in order to keep it working as new versions came out, so there may very well be things I'm doing wrong...

lminiero commented 7 years ago

One thing I see in your test server is this options property here: https://github.com/warmcat/libwebsockets/blob/master/test-server/test-server.c#L403

info.options = opts | LWS_SERVER_OPTION_VALIDATE_UTF8 | LWS_SERVER_OPTION_EXPLICIT_VHOSTS | LWS_SERVER_OPTION_DO_SSL_GLOBAL_INIT;

which does mention vhosts explicitly. My options only has LWS_SERVER_OPTION_DO_SSL_GLOBAL_INIT, but not the other two, so that's something that might be worth testing for those who get the crashes.

lws-team commented 7 years ago

No... if you don't create the vhosts yourself, lws creates one called 'default' when you create the context. So it's OK to do that. If you want multi vhosts, or stuff like different ssl certs per vhost, you have to create them yourself.

You should maybe back up and definitively describe the segfault location and stuff around that. IIRC it was a bit confused before.

lws-team commented 7 years ago

... and reading back through the comments it seems to work on recent lws. So you also need to qualify what version has whatever problem.

lminiero commented 7 years ago

The problem is that it never crashed for me, so I have no way to replicate and look where it happens. Secure websockets work just fine in my setup, and libwebsockets works great for me: the issue apparently only happens for a few users.

This is a stacktrace that @sdrsdr provided a few weeks ago when it crashed for him instead:

Thread 29 "sws thread" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffc63c6700 (LWP 30973)]
0x00007fffc6bdc6c2 in lws_ssl_server_name_cb (ssl=0x7fffc0000ab0, ad=0x7fffc63c5810, arg=0x0) at /var/tmp/portage/net-libs/libwebsockets-2.1.1/work/libwebsockets-2.1.1/lib/ssl-server.c:171
171     /var/tmp/portage/net-libs/libwebsockets-2.1.1/work/libwebsockets-2.1.1/lib/ssl-server.c: No such file or directory.
(gdb) backtrace 
#0  0x00007fffc6bdc6c2 in lws_ssl_server_name_cb (ssl=0x7fffc0000ab0, ad=0x7fffc63c5810, arg=0x0) at /var/tmp/portage/net-libs/libwebsockets-2.1.1/work/libwebsockets-2.1.1/lib/ssl-server.c:171
#1  0x00007ffff6e48a10 in ssl_parse_clienthello_tlsext () from /usr/lib64/libssl.so.1.0.0
#2  0x00007ffff6e2e9d7 in ssl3_get_client_hello () from /usr/lib64/libssl.so.1.0.0
#3  0x00007ffff6e33377 in ssl3_accept () from /usr/lib64/libssl.so.1.0.0
#4  0x00007ffff6e42e1f in ssl23_accept () from /usr/lib64/libssl.so.1.0.0
#5  0x00007fffc6bdc233 in lws_server_socket_service_ssl (wsi=0x7fffc00008c0, accept_fd=7) at /var/tmp/portage/net-libs/libwebsockets-2.1.1/work/libwebsockets-2.1.1/lib/ssl.c:608
#6  0x00007fffc6be5083 in lws_adopt_socket_vhost (vh=0x6fbb20, accept_fd=7) at /var/tmp/portage/net-libs/libwebsockets-2.1.1/work/libwebsockets-2.1.1/lib/server.c:1595
#7  0x00007fffc6be5b2f in lws_server_socket_service (context=0x702890, wsi=0x6f4440, pollfd=0x70c9e8) at /var/tmp/portage/net-libs/libwebsockets-2.1.1/work/libwebsockets-2.1.1/lib/server.c:1946
#8  0x00007fffc6bd3531 in lws_service_fd_tsi (context=0x702890, pollfd=0x70c9e8, tsi=0) at /var/tmp/portage/net-libs/libwebsockets-2.1.1/work/libwebsockets-2.1.1/lib/service.c:828
#9  0x00007fffc6be1193 in lws_plat_service_tsi (context=0x702890, timeout_ms=50, tsi=0) at /var/tmp/portage/net-libs/libwebsockets-2.1.1/work/libwebsockets-2.1.1/lib/lws-plat-unix.c:184
#10 0x00007fffc6be122f in lws_plat_service (context=0x702890, timeout_ms=50) at /var/tmp/portage/net-libs/libwebsockets-2.1.1/work/libwebsockets-2.1.1/lib/lws-plat-unix.c:204
#11 0x00007fffc6bd3a67 in lws_service (context=0x702890, timeout_ms=50) at /var/tmp/portage/net-libs/libwebsockets-2.1.1/work/libwebsockets-2.1.1/lib/service.c:1139
#12 0x00007fffc6df3faf in janus_websockets_thread (data=0x702890) at transports/janus_websockets.c:998
#13 0x00007ffff73045d5 in ?? () from /usr/lib64/libglib-2.0.so.0
#14 0x00007ffff5e3c374 in start_thread () from /lib64/libpthread.so.0
#15 0x00007ffff5b8760f in clone () from /lib64/libc.so.6
(gdb) 

and from what I can read in the comments that followed, it might have been related to his own setup and how virtual hosts are set up in that case. I can't seem to find a comment from him where it says it eventually worked for him instead, though?

lminiero commented 7 years ago

So setting LWS_SERVER_OPTION_EXPLICIT_VHOSTS and calling lws_create_vhost as you do in the test server, or not setting LWS_SERVER_OPTION_EXPLICIT_VHOSTS as we do, is equivalent? Asking as that looks like the only relevant difference in how the test server and my plugin work when creating the secure backend, and from the previous comments it looks like the test server works fine for those that have the crashes in Janus instead.

Just to be 100% sure, I'll create a quick PR that people can test, that tries to more closely mimic the setup the test server follows. Hopefully this will give people something else to test, and hopefully this will fix this issue for good :slightly_smiling_face:

Thanks for your patience and for indulging me on this issue!

lws-team commented 7 years ago

Yeah.... if the EXPLICIT_VHOSTS flag is not set at context creation time, after creating the context it re-uses the struct to configure a default vhost as well. The same struct is used to create contexts and vhosts; different members apply to either the vhost or the context (which apply to what is documented in the doxygen comments inline). So they should be the same. No harm trying it though.

The problem was I couldn't nail down ssl-server.c:171 on 2.1.1 to a meaningful line of source that could segfault. So to solve it, it needs to be reproduced, with the exact place where it blows identified and the related vars dumped. It was suspected that context was NULL; that comes via the OpenSSL API, so the version of OpenSSL and anything else around that is also interesting.

I see I already asked these questions -->

@sdrsdr it looks like it typically just works in Janus.  And I know it just works in the lws test apps / lwsws provided with lws.

 - can you run under gdb and confirm context actually is NULL?  Set the stack frame with `f <n>` where <n> is the frame number from the backtrace, and then `p context`.

 - What version of openssl are you using?

 - the context is set in lws_context_init_server_ssl()... do you have logging from lws?  It will clarify the ssl init state

 - are you using multiple vhosts?  The vhost must be given a name matching the external hostname if so... when you use the default vhost, lws context init creates it with the name "default".  If you create your own vhosts, it's your problem to set the name field in the info struct.

lminiero commented 7 years ago

Yep, which is why I tried to revive the discussion and ping @sdrsdr to see if he had any news. Pinging @tgabi333 as well as he opened the issue in the first place.

sdrsdr commented 7 years ago

Um... I was hijacked to another project but I'm now back at working with janus so don't close the issue yet .. :)

lminiero commented 7 years ago

No worries, I've closed enough issues for today :smile:

tgabi333 commented 7 years ago

i'm here too, but sorry guys this c++ is not my world, so i'm afraid i cannot help you more

lminiero commented 7 years ago

The questions Andy asked are not code related. It's a matter of providing a gdb stacktrace that can help figure out the value of context as indicated, giving info on the OpenSSL version, and possibly running Janus with websockets debugging enabled (not sure which debugging level gives the best results). I think you can ignore the vhosts question, as we don't use them in the code.

emustafa commented 7 years ago

Hi all.

I faced the same issue. It was crashing at the same place and the context was NULL. In my case, I am able to replicate the scenario and I know why it happens. Apparently, if I create 2 contexts that serve as ssl servers, then the one that is created first will crash. Another case is, if I host an ssl server and connect to it within the same application using a different context, then I get the same crash. However, this time I am not sure if it is crashing for the client or the server (the project I am working on is highly multithreaded and it is hard to track down, but I can probably do it if you would like to know).

I am guessing when the 2nd context is created it somehow overwrites the first contexts data?

Is this expected behaviour? If so what is the right way of having 2 ssl contexts?

Thanks.

lws-team commented 7 years ago

@emustafa that makes more sense. Lws contexts are not entirely just logical containers you can have n of, especially if you use SSL. Without SSL, lws contexts contain processwide fd maps... these don't conflict but it's not memory efficient. If the contexts are served by a single thread, an idle context waits in its event loop increasing latency for the other ones, and the (minimal) signal handling will conflict; if they are served by different threads neither of those make trouble.

But the biggies are around OpenSSL... first depending on your OpenSSL version its library init cannot survive being initialized and destroyed multiple times in one process (also causing disasters if you link more than one library that wants openssl into the same process). 1.1+ OpenSSL introduced new apis that lws detects and uses which should avoid this. The second is OpenSSL seems to require process scope accessors in the callback

https://github.com/warmcat/libwebsockets/blob/master/lib/ssl-server.c#L41

I don't know what will happen if there are two library inits.

Everything can be done in one context. You can have multiple vhosts sharing or with individual listening sockets, the vhosts have their own ssl contexts, and you can have as many client connections as you like. These can all coexist in one context with one event loop on one thread. That's what I recommend and can support.

On the other hand, depending on the root cause (I guess the library scope statics) because this problem comes quite late in the day, I wonder if there's a workaround. If you want to continue down that road, let me know, along with any other debugging you can glean.
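To make the "library scope statics" guess concrete, here is a toy model in plain C. It is not lws or OpenSSL code, and every name in it is made up for illustration: it only shows how a process-wide index that each new context overwrites could leave the first context's lookup returning NULL, which would match the NULL context reported above. Whether this is what actually happens inside lws is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLOTS 8

/* Hypothetical stand-in for a TLS context object with numbered private-data slots. */
struct fake_ssl_ctx {
    void *ex_data[MAX_SLOTS];
};

/* Process-wide state, shared by every context created in this process. */
static int next_free_index = 0;     /* models the TLS library's slot allocator */
static int private_data_index = -1; /* models a library-scope static holding "our" slot */

/* Models one context's TLS setup: take a fresh slot, remember it in the
 * shared static, and store the owning context pointer in that slot. */
static void fake_context_init(struct fake_ssl_ctx *ctx, void *owner) {
    assert(next_free_index < MAX_SLOTS);
    for (int i = 0; i < MAX_SLOTS; i++)
        ctx->ex_data[i] = NULL;
    private_data_index = next_free_index++; /* overwrites the previous context's index */
    ctx->ex_data[private_data_index] = owner;
}

/* Models the server-name callback: find the owner through the shared static index. */
static void *fake_lookup_owner(const struct fake_ssl_ctx *ctx) {
    return ctx->ex_data[private_data_index];
}
```

In this model, initializing a second context makes `fake_lookup_owner()` on the first one return NULL, while the second keeps working, which is the same pattern as "the one that is created first will crash".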

lminiero commented 7 years ago

Thanks for these additional details! I can confirm that, while everything works fine with just one secure context, if I create two of them (one for the Janus API, one for the Admin API) it segfaults for me too.

@lws-team answering your question, and addressing your comment on the OpenSSL initialization, could the problem in our implementation be that we set the info.options to LWS_SERVER_OPTION_DO_SSL_GLOBAL_INIT for both contexts? I can remove it for the second context, in case we know we've done it already, but since they happen in different threads (each context has its own, so that we can do lws_service separately) we'd have to implement some sort of locking to make sure each context is created in sequence.

Edit: scratch the thread part... the context creation is done in the same thread, in sequence, the threads only come into play when doing lws_service per each context. I'll try only setting the SSL global init once to see if it fixes it for me.

lminiero commented 7 years ago

That didn't work, as omitting LWS_SERVER_OPTION_DO_SSL_GLOBAL_INIT from the context initialization resulted in a plain, non-secure, context (which was to be expected, we discussed the need of that option in the past).

Reading your post again, I realize I may have misunderstood what you were suggesting. I'll have a look at the single context + vhosts to see how those work. I'm afraid that will require some refactoring in the plugin, but if it manages to solve the issue I'll have to pull that tooth.

lminiero commented 7 years ago

Please see the referenced PR, which fixes the issue for me. If it works for you all, I'll merge and close the issue.
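The PR is not reproduced in this thread. As a rough sketch of the single context + vhosts layout suggested earlier, assuming libwebsockets 2.x or later and its `lws_create_context` / `lws_create_vhost` API, something along these lines could replace two separate secure contexts. The `demo_*` names, the vhost names and the protocol callback are hypothetical placeholders, and this is not the code from the PR.

```c
#include <libwebsockets.h>
#include <string.h>

/* Placeholder protocol callback; a real server would handle the callback reasons here. */
static int demo_callback(struct lws *wsi, enum lws_callback_reasons reason,
                         void *user, void *in, size_t len) {
    (void)wsi; (void)reason; (void)user; (void)in; (void)len;
    return 0;
}

static struct lws_protocols demo_protocols[] = {
    { "janus-protocol", demo_callback, 0, 0 },
    { NULL, NULL, 0, 0 } /* terminator */
};

/* One context for the whole process; no default vhost is created because
 * LWS_SERVER_OPTION_EXPLICIT_VHOSTS is set. */
static struct lws_context *demo_create_context(void) {
    struct lws_context_creation_info info;
    memset(&info, 0, sizeof info);
    info.gid = -1;
    info.uid = -1;
    info.options = LWS_SERVER_OPTION_EXPLICIT_VHOSTS |
                   LWS_SERVER_OPTION_DO_SSL_GLOBAL_INIT;
    return lws_create_context(&info);
}

/* One TLS-enabled vhost per listening port, all living in the same context. */
static struct lws_vhost *demo_create_secure_vhost(struct lws_context *context,
                                                  const char *name, int port,
                                                  const char *cert, const char *key) {
    struct lws_context_creation_info info;
    memset(&info, 0, sizeof info);
    info.vhost_name = name;
    info.port = port;
    info.protocols = demo_protocols;
    info.ssl_cert_filepath = cert;
    info.ssl_private_key_filepath = key;
    info.gid = -1;
    info.uid = -1;
    info.options = LWS_SERVER_OPTION_DO_SSL_GLOBAL_INIT;
    return lws_create_vhost(context, &info);
}
```

The idea would be to call `demo_create_context()` once, then `demo_create_secure_vhost()` once per secure endpoint (for example one vhost for the Janus API and one for the Admin API, each on its own port), and to run `lws_service()` on that single context from one thread.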

emustafa commented 7 years ago

@lws-team Thank you for the detailed answer. I understand the problem now. I am actually using boringssl due to some other reasons. And I am not building libwebsockets with cmake but with something else, again due to some other reasons. So far this setup worked fine.

So my setup is like this. I have 3 clients and 2 servers within one application. 2 clients are connecting to remote servers so they have nothing to do with the other 2 servers that are running in the same process. My 1st server is not ssl. 2nd server is ssl though. And my 3rd client is connecting to the 2nd server over ssl within the same process. This entire thing might seem like a bit of a bad architecture design but let's not get into that...

So based on what you suggested, I can just create 1 context and then create 5 vhosts: 3 for clients and 2 for servers and then just have 1 thread to serve the context. Is this the way it should be?

@lminiero , Thanks for your suggestion as well. However, I am not using janus-gateway. I am directly using libwebsockets. The issue was closely related to this one here and that is why I commented here. So I won't be able to test the PR. But I looked at it and it looks like you are using vhosts which then should fix the issue based on what @lws-team suggested.

lws-team commented 7 years ago

@lminiero one context + vhosts is indeed the best way to handle it.

@emustafa You only need a vhost for each "virtual server", so one on port 80 without SSL and one on port 443 set up for SSL from what you described. When you create the SSL-capable vhost also call this on it

/**
 * lws_init_vhost_client_ssl() - also enable client SSL on an existing vhost
 *
 * \param info: client ssl related info
 * \param vhost: which vhost to initialize client ssl operations on
 *
 * You only need to call this if you plan on using SSL client connections on
 * the vhost.  For non-SSL client connections, it's not necessary to call this.
 *
 * The following members of info are used during the call
 *
 *   - options must have LWS_SERVER_OPTION_DO_SSL_GLOBAL_INIT set,
 *       otherwise the call does nothing
 *   - provided_client_ssl_ctx must be NULL to get a generated client
 *       ssl context, otherwise you can pass a prepared one in by setting it
 *   - ssl_cipher_list may be NULL or set to the client valid cipher list
 *   - ssl_ca_filepath may be NULL or client cert filepath
 *   - ssl_cert_filepath may be NULL or client cert filepath
 *   - ssl_private_key_filepath may be NULL or client cert private key
 *
 * You must create your vhost explicitly if you want to use this, so you have
 * a pointer to the vhost.  Create the context first with the option flag
 * LWS_SERVER_OPTION_EXPLICIT_VHOSTS and then call lws_create_vhost() with
 * the same info struct.
 */
LWS_VISIBLE LWS_EXTERN int
lws_init_vhost_client_ssl(const struct lws_context_creation_info *info,
              struct lws_vhost *vhost);

Then when you create your outgoing client connections you can bind them to the SSL-capable vhost (via client_info.vhost when creating the client connection), that will be able to handle both https/wss and http/ws client connections. For client connections the binding to a vhost is quite loose, it's mainly a place to hang the appropriately-configured client SSL_CTX. If you are doing something complex like multiple client connections using client certs with different CAs, then you would need extra vhosts just to hold the differently-configured client SSL_CTXes. Otherwise you can just use an existing vhost also for client connections. There's no limit to how many ongoing client connections can be active in one context or vhost.

emustafa commented 7 years ago

@lws-team Thanks Andy. I will give it a try and let you know here how it went.

lminiero commented 7 years ago

Pinging @tgabi333 and @sdrsdr then for confirmation that the fix in the PR works for them.

tgabi333 commented 7 years ago

It works for me using libwebsockets 2.3.0.

emustafa commented 7 years ago

@lws-team I tried vhosts and everything works great! Thanks!

But I am creating a vhost every time I host a server or create a connection. That makes the coding simpler for me.

lws-team commented 7 years ago

Great... it's also OK to do that, vhosts are very lightweight except for the SSL contexts. Generally people will just create them along with the context though.