Open physiii opened 6 years ago
I can't see your whole code, but this alone looks like it may cause your code to keep creating additional connections until OOM once the first connect has succeeded:
if (token_connect && token_conn_count >= 10) {
In that case the client connections find no reason to close. New ones keep getting created when token_conn_count hits 10.
Okay, little different method:
I should probably wait for a response before trying to reconnect. So I need to set token_connect true when I receive a connection error:
re-connecting with token protocol 4: lws_client_connect_2: 0x3ffd7e80: address 192.168.0.10 4: Connect failed errno=128
I have been using callbacks like LWS_CALLBACK_CLOSED and SYSTEM_EVENT_STA_GOT_IP but how do I get LWS_ERRNO from client-handshake.c in a callback so I can try the connection again?
I think you are missing the point... once you opened a connection, why would it ever close? But your code keeps opening new connections at intervals. So there is an OOM eventually... resetting the server forces all your open connections to close, avoiding the OOM...
You don't need to know errno. lws will retry the connect if the error is nonfatal, else inform you with LWS_CALLBACK_CLIENT_CONNECTION_ERROR that it met something fatal.
Okay I see why there is an OOM.
I don't think it retries the connection because I don't see LWS_CALLBACK_CLIENT_WRITEABLE after I reset my server, just LWS_CALLBACK_HTTP_DROP_PROTOCOL then LWS_CALLBACK_CLOSED, so I can't write to that socket anymore. It also doesn't retry the connection if I attempt lws_client_connect_via_info before SYSTEM_EVENT_STA_GOT_IP, so I wait for that until I try to connect.
Why do I no longer get LWS_CALLBACK_CLIENT_WRITEABLE if it is retrying the connection?
It retries the accept you showed erroring out, if nothing fatal happened. You must do whole reconnects in the way you have been, but with a bit more care tracking the state of any existing connect.
The accept and particularly SSL_accept() are multistep things requiring network roundtrips. Because LWS is nonblocking, things like accept() that normally just stall your thread until they complete return immediately and need to be retried later. LWS takes care of that for you.
WRITEABLE only comes when you got a successful connection, and asked for it.
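The retry-later behaviour described above can be modelled in a few lines of plain C. This is only a sketch of the general non-blocking pattern, not lws internals: handshake_step, drive_handshake, struct handshake, and the STEP_* codes are hypothetical names invented for illustration, and the "round trips" are simulated with a counter rather than real network I/O.

```c
#include <stdbool.h>

/* Hypothetical result codes for one non-blocking step; not lws identifiers. */
enum step_result { STEP_AGAIN, STEP_DONE, STEP_FATAL };

/* Hypothetical model of a handshake that needs several network round trips. */
struct handshake {
    int round_trips_left; /* round trips still required before completion */
    bool fatal;           /* set when an unrecoverable error has occurred */
};

/* Perform at most one step and return immediately instead of stalling. */
static enum step_result handshake_step(struct handshake *h)
{
    if (h->fatal)
        return STEP_FATAL;
    if (h->round_trips_left > 0) {
        h->round_trips_left--;
        if (h->round_trips_left > 0)
            return STEP_AGAIN; /* not finished: caller must retry later */
    }
    return STEP_DONE;
}

/* Event-loop style driver: retry on STEP_AGAIN, stop on DONE or FATAL.
 * Returns the number of calls made on success, or -1 on a fatal error
 * or when max_calls is exhausted. */
static int drive_handshake(struct handshake *h, int max_calls)
{
    for (int calls = 1; calls <= max_calls; calls++) {
        enum step_result r = handshake_step(h);
        if (r == STEP_DONE)
            return calls;
        if (r == STEP_FATAL)
            return -1;
        /* STEP_AGAIN: a real loop would wait for socket readiness here. */
    }
    return -1;
}
```

The point of the model is that the caller never sees the intermediate STEP_AGAIN results unless it asks. Only completion or a fatal error is reported upward, which matches the description of only being told about fatal failures.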
Thank you for clarification and patience.
I added a connect flag back to the lws_service loop so when I receive LWS_CALLBACK_CLOSED I set the connect flag true, triggering lws_client_connect_via_info on that socket - I then wait for LWS_CALLBACK_CLIENT_ESTABLISHED to set the connect flag false again.
Problem is lws_client_connect_via_info runs again before I get LWS_CALLBACK_CLIENT_ESTABLISHED and creates redundant connections - how can I know if a connection failed after lws_client_connect_via_info? I would wait until I get LWS_CALLBACK_CLIENT_CONNECTION_ERROR to try again but I never see that callback - just an error message like Connect failed errno=128.
Also when you say ask for a connection I assume that means running lws_client_connect_via_info
There is a test client example in lws. Although this is more or less the only demo code for ESP32, it is all based on normal lws, where there are more examples.
https://github.com/warmcat/libwebsockets/blob/master/test-apps/test-client.c
What it does is treat the client wsi pointer returned by lws_client_connect_via_info() as the flag. If it's NULL, it will try to connect, after considering a ratelimit. If the client closes or gets LWS_CALLBACK_CLIENT_CONNECTION_ERROR, it sets the copy of the client wsi to NULL, signalling it should retry.
Note that LWS_CALLBACK_CLIENT_CONNECTION_ERROR comes on the first protocol of the vhost, ie, vhost->protocols[0]. It's because the active protocol is not negotiated until there has been a successful connection.
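The wsi-pointer-as-flag approach described above can be sketched without the library, to make the state transitions explicit. Everything here is a hypothetical model: struct client_state, should_connect, note_attempt, on_event, and the EV_* names are invented for illustration and only stand in for the lws callbacks named in this thread. In real code the handle would come from lws_client_connect_via_info() and the events would arrive in the protocol callback (the connection error one on protocols[0], as noted above).

```c
#include <stddef.h>

/* Hypothetical event names standing in for the lws callback reasons
 * mentioned above; these are not the library's own enum values. */
enum client_event {
    EV_ESTABLISHED,      /* models LWS_CALLBACK_CLIENT_ESTABLISHED */
    EV_CONNECTION_ERROR, /* models LWS_CALLBACK_CLIENT_CONNECTION_ERROR */
    EV_CLOSED            /* models LWS_CALLBACK_CLOSED */
};

struct client_state {
    void *wsi;           /* copy of the connect handle; NULL = nothing in flight */
    long last_attempt_s; /* time of the last connect attempt, in seconds */
    int attempted;       /* nonzero once at least one attempt has been made */
};

/* Returns 1 when a new connect may be started: no attempt or connection
 * exists, and the rate limit (min_interval_s) has elapsed. */
static int should_connect(const struct client_state *s, long now_s,
                          long min_interval_s)
{
    if (s->wsi != NULL)
        return 0; /* connected or still connecting: do not create another */
    if (s->attempted && now_s - s->last_attempt_s < min_interval_s)
        return 0; /* too soon after the previous attempt */
    return 1;
}

/* Record the result of a connect call; wsi may be NULL if it failed outright. */
static void note_attempt(struct client_state *s, void *wsi, long now_s)
{
    s->wsi = wsi;
    s->last_attempt_s = now_s;
    s->attempted = 1;
}

/* Clear the flag on close or connection error so the loop retries. */
static void on_event(struct client_state *s, enum client_event ev)
{
    switch (ev) {
    case EV_CONNECTION_ERROR:
    case EV_CLOSED:
        s->wsi = NULL;
        break;
    case EV_ESTABLISHED:
        break; /* keep the handle: the connection is live */
    }
}
```

In this model the service loop would call should_connect() on each pass and only start a new connect when it returns 1, so an attempt that is still in flight never causes a second connection to be created.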
Okay getting close.
I'm using your ratelimit function and client wsi pointer as a flag, but it's not being set NULL if an attempt was made but the server is not running. Since no connection is made, LWS_CALLBACK_CLOSED is not called to set the wsi pointer NULL, and since there's no indication lws_client_connect_via_info failed besides Connect failed errno=128, it never retries.
I also tried setting the client wsi pointer NULL and making another attempt if LWS_CALLBACK_CLIENT_ESTABLISHED isn't called after some time, but that gives an OOM.
I want to reconnect to a socket if it is closed.
So I set a connect flag in LWS_CALLBACK_CLOSED, and that starts lws_client_connect_via_info in main.c.
This works when I manually restart my server.
However, if I leave my server running for a few hours, the socket inevitably closes and reconnecting fails, giving "OOM".
Can you tell me how I can auto-reconnect when a socket is closed? It's strange to me that it works when I manually restart the server but not when it happens after leaving it running.