Closed tedenda closed 3 years ago
@JohnySeven I have seen that you have done other network related things for SensESP. Can this be something that you can look into?
@tedenda I've seen case when device got correct token, but WiFi got disconnected and in that case device will reset token and it will start over. I have something already in place at here: https://github.com/JohnySeven/SensESP/tree/networking_changes But it needs testing before I can make it to PR
@JohnySeven Sounds good! Please let me know if I can help you test it!
The only example I ever saw of this was when the token was REALLY long, because there were a LOT of paths set up in SensESP. I wonder if this is the limitation: line 340 of ws_client.cpp: DynamicJsonDocument doc(1024);
. So I changed that 1024
to 100
and added code to display the name of any DeserializationError, and sure enough, I got this:
␛[0m(W) (poll_access_request) WARNING: Could not deserialize http payload.
␛[0m(W) (poll_access_request) Error: NoMemory
So I'm doing a PR to add that DeserializationError display, and the next time this problem occurs, maybe we can resolve it.
@RBerliner posted this issue in #sensors on Slack:
I am attempting to compile a project for a featheresp32 device. I can compile and upload without problem.
After the device is configured with the captive web portal, it requests a security token from the SignalK server. When that is provided, I get an error as shown below and it goes back to requesting another security token.
D) (test_token)(C1) End of payload output
(D) (test_token)(C1) Attempting to connect to SignalK Websocket...
[E][WiFiClient.cpp:482] flush(): fail on fd 58, errno: 11, "No more processes"
(W) (on_disconnected)(C1) Bad access token detected. Setting token to null.
(D) (connect)(C1) Initiating connection
(D) (connect)(C1) Websocket is connecting to SignalK server on address 192.168.0.1:3000
(D) (connect)(C1) No prior authorization token present.
(D) (send_access_request)(C1) Preparing a new access request
What are the few lines BEFORE the section of the Serial Output that you copied here? Specifically, I'm looking for the line that contains the phrase ""Testing resulted in http status".
From Slack/sensors: Featheresp32_error.txt
The issue in here is that code is checking HTTP result and if the result isn't 200 OK it will clear the token - this is wrong of course, because server could be down, there could be issue with memory, etc. No reason to clear the token. In my branch this is fixed. I'll put SK watch project on hold for now and I'll switch to SensESP to finish what I've started. Please be patient with me :-) Thanks guys!
I posted the additional debug info on Slack but maybe I should have posted it here instead. You guys really jumped on this - thanks!
@JohnySeven - if the result is not 200, and it clears the token, then it just makes a new one and sends another request, right? If that happened because of a momentary network problem, then the new access request will be approved, and everything goes ahead as it should, right? I agree that that's not the right way to handle it, because now you have to approve another access request. However, will that cause the specific problem that @RBerliner is reporting? I've had a lot of network issues from time to time, and I don't recall ever seeing his problem before.
@ba58smith new goodies that I have in my branch will help @RBerliner.
I've added new mechanism that will not clear the token if:
Also the version in my branch doesn't restart ESP on WiFi loss. It prints our WiFi connection loss reason and tries to reconnect back without restarting - this is big change and needs to be tested propertly. So far it survived disabling the Wifi AP, placing firewall in the way and other nasty things I've come up with.
@ba58smith or @tedenda are both welcome to join the ride (see below) :-)
@RBerliner could you please run one more test, but with my branch instead of current SensESP?
Just comment out SensESP in lib_deps and add "https://github.com/JohnySeven/SensESP#networking_changes" in your platformio.ini file and post the log here please so I can make sure that mechanism is working correctly.
Thanks!
@JohnySeven: Thank you for the help. I modified platformio.ini as you requested, deleted SensESP from .pio/libdeps/featheresp32, did a Clean then a Build, then an Upload and Monitor. I do not get the same error but it seems to hang on connect. I attach the Monitor output.
I can use your code for some of my test sensors and do some tests. @JohnySeven what is your name on Slack?
@tedenda my name on Slack is Jan Dytrych :-)
@JohnySeven have you seen this discussion that is going on right now about this issue? https://signalk-dev.slack.com/archives/C9B4Y02EN/p1604340599151900
@JohnySeven I tried to use your code but I get errors, is that fork up to date with master? Build error gist
@tedenda branch is 4 commits ahead, 15 commits behind SignalK:master.
I don't know what is failing, I have deleted .pio folder, so all dependencies is downloaded fresh.
The discussion of this connect/reconnect problem is being carried on here and on Slack. If you are interested in this problem, it would be worthwhile to take a look at
https://signalk-dev.slack.com/archives/C9B4Y02EN/p1604340599151900
Due to suggestions from @Butch, I think some things have been learned about this problem.
also bitten by this... been approving repeated access requests as a workaround. unable to see the slack link. how can i help?
I found this yesterday also. I happened to be looking at my log when it happened. It seems to be related to intermittent network issues. I think SensESP gives up on its token too soon. Here it is the log for reference:
things going good here...
␛[0m(D) (get_delta) SKDelta::get_delta: {"updates":[{"source":{"label":"sk-engine-lights"},"values":[{"path":"electrical.switches.lights.engineroom.state",␛[0m"value":false}]}]}
␛[0m(D) (get_delta) SKDelta::get_delta: {"updates":[{"source":{"label":"sk-engine-lights"},"values":[{"path":"electrical.switches.lights.engineroom.state",␛[0m"value":false}]}]}
suddenly loses network connection, then regains:
␛[0m(D) (connect) Initiating connection
␛[0m(D) (connect) Websocket is connecting to Signal K server on address 192.168.0.147:80
␛[0m(D) (test_token) Testing token with url http://192.168.0.147:80/signalk/v1/api/
␛[0m(D) (test_token) Testing resulted in http status 200
␛[0m(D) (test_token) Returned payload (length 5368) is:
␛[0m(D) (test_token) {"vessels":{"urn:mrn:imo:mmsi:367560820":{"mmsi":"367560820","name":"eSea Street","navigation":{"position":{"meta":{"description":"The␛[0m position of the vessel in 2 or 3 dimensions (WGS84 datum)","properties":{"longitude":
<more SignalK data>
{"label":"ws.1bbe1d64-a6ca-4464-8c22-6d3f05b5197c","timestamp":"2020-12-08T22:05:42.212Z"}}}
␛[0m(D) (test_token) End of payload output
␛[0m(D) (test_token) Attempting to connect to Signal K Websocket...
Here is where it loses the token:
␛[0m(W) (on_disconnected) Bad access token detected. Setting token to null.
␛[0m(D) (connect) Initiating connection
␛[0m(D) (connect) Websocket is connecting to Signal K server on address 192.168.0.147:80
␛[0m(D) (connect) No prior authorization token present.
␛[0m(D) (send_access_request) Preparing a new access request
␛[0m(W) (send_access_request) Can't handle response -1 to access request.
␛[0m(D) (send_access_request)
␛[0m(D) (connect) Initiating connection
␛[0m(D) (connect) Websocket is connecting to Signal K server on address 192.168.0.147:80
␛[0m(D) (connect) No prior authorization token present.
␛[0m(D) (send_access_request) Preparing a new access request
␛[0m(D) (send_access_request) Polling /signalk/v1/requests/5cfa8dde-8b10-48ff-81de-d1cda0744624 in 5 seconds
␛[0m(D) (poll_access_request) Polling SK Server for authentication token
␛[0m(D) (poll_access_request) PENDING
␛[0m(D) (poll_access_request) Polling SK Server for authentication token
␛[0m(W) (poll_access_request) Can't handle response -1 to pending access request.
␛[0m(W)
␛[0m(D) (connect) Initiating connection
␛[0m(D) (connect) Websocket is connecting to Signal K server on address 192.168.0.147:80
␛[0m(D) (poll_access_request) Polling SK Server for authentication token
␛[0m(W) (poll_access_request) Can't handle response -1 to pending access request.
␛[0m(W)
␛[0m(D) (connect) Initiating connection
␛[0m(D) (connect) Websocket is connecting to Signal K server on address 192.168.0.147:80
␛[0m(D) (poll_access_request) Polling SK Server for authentication token
␛[0m(D) (poll_access_request) PENDING
␛[0m(D) (poll_access_request) Polling SK Server for authentication token
␛[0m(W) (poll_access_request) Can't handle response -1 to pending access request.
␛[0m(W)
␛[0m(D) (connect) Initiating connection
␛[0m(D) (connect) Websocket is connecting to Signal K server on address 192.168.0.147:80
␛[0m(D) (poll_access_request) Polling SK Server for authentication token
␛[0m(D) (poll_access_request) PENDING
␛[0m(D) (poll_access_request) Polling SK Server for authentication token
␛[0m(W) (poll_access_request) Can't handle response -11 to pending access request.
␛[0m(W)
␛[0m(D) (connect) Initiating connection
␛[0m(D) (connect) Websocket is connecting to Signal K server on address 192.168.0.147:80
␛[0m(D) (poll_access_request) Polling SK Server for authentication token
␛[0m(D) (poll_access_request) PENDING
␛[0m(D) (poll_access_request) Polling SK Server for authentication token
␛[0m(D) (poll_access_request) PENDING
␛[0m(D) (poll_access_request) Polling SK Server for authentication token
and now its waiting for authorization.
I see @JohnySeven has actually fixed this back in Oct. Is there a reason to not do a PR for this problem only?
also bitten by this... been approving repeated access requests as a workaround. unable to see the slack link. how can i help?
@kreucher , that Slack conversation was some time ago, and has "fallen off". (Our free version of Slack holds onto things for only so long before disappearing.)
There have been some reported issues via Slack where SensESP have got successful authorization from Signal K Server but fails when reconnecting. SensESP output "Bad access token detected". To figure this out more debug messages is probably needed.