toblum / ESPTeamsPresence

Microsoft Teams presence light for ESP32
https://toblum.github.io/ESPTeamsPresence/
Mozilla Public License 2.0
127 stars 37 forks source link

deserializeJson() failed: InvalidInput #14

Closed basiccable4 closed 3 years ago

basiccable4 commented 3 years ago

The fetching of the my status will fail periodically and for several minutes at a time. I have the latest release (0.17).

[15:08:35]Polling presence failed, retry #0. [15:09:05]Polling presence info ... [15:09:05][HTTPS] Auth token valid for 3567 s. [15:09:07][HTTPS] Method: GET, Response code: 200 [15:09:07]deserializeJson() failed: InvalidInput [15:09:07]--> Availability: , Activity: [15:09:07] [15:09:07]Polling presence failed, retry #0. [15:09:37]Polling presence info ... [15:09:37][HTTPS] Auth token valid for 3536 s. [15:09:38][HTTPS] Method: GET, Response code: 200 [15:09:38]deserializeJson() failed: InvalidInput [15:09:38]--> Availability: , Activity: [15:09:38] [15:09:38]Polling presence failed, retry #0. [15:10:08]Polling presence info ... [15:10:08][HTTPS] Auth token valid for 3504 s. [15:10:09][HTTPS] Method: GET, Response code: 200 [15:10:09]setAnimation: 0, 0-32, Mode: 2, Color: 16711680, Speed: 3000 [15:10:09]--> Availability: Busy, Activity: InACall

toblum commented 3 years ago

Hi @basiccable4,

this happens when the Graph API returns something that is not parseable as JSON. This could be due to network issues, problems on MS Graph side or because the message returned by Graph is diferent from what is expected.

I didn't experience this for quite some time. There were issues in the past because MSFT changed the contents of the reponse and they were exceeding the maximum length. But this was changed a few versions ago.

It's hard to find out what happens when you can't reproduce it by yourself.

When did that behaviour start and does it happen only in some cases / states?

Greetings Tobias

basiccable4 commented 3 years ago

Hi Tobias, It has sporadically occurred since I set it up - I’ve had it happen across two different ESP32 devices on my network. First device was an ESP32-DevkitC with a Neopixel ring. Current device is the Adafruit Huzzah ESP32 with a Feathertop attached.

it’s very rare that it will pull my status correctly on the first polling after a change. Typically it will take a few attempts to pull the new status.

Sometimes it will flash red for up to 20-30 minutes until it finally pulls the status.

I know that the graph API is prone to being hot garbage, so I understand if there’s nothing we can really do here.

Is there additional logging or some sort of debug mode that I can enable to see the entire response from Microsoft?

toblum commented 3 years ago

Hi @basiccable4,

I will at least keep an eye on it and see if I can print out some more informations in that case that will give us a clue what happens.

Greetings

MegaWatt71 commented 3 years ago

Hi @toblum , @basiccable4

I faced the same issue and since 2 days it seems to be permanent. Did a bit of rough debugging and saw that MS Graph indeed sent some additional stuff outside the brackets.

To work around this I just deleted everything in the answer, what is before the first opening bracket and after the last closing bracket(in request_handler.h)

This patch works for me (sorry not nice, but my coding time is ages ago): if (httpCode > 0) { // HTTP header has been send and Server response header has been handled Serial.printf("[HTTPS] Method: %s, Response code: %d\n", type.c_str(), httpCode);

         // Just for debugging purposes:
         // if (url.indexOf("presence") > 0) {
          //    Serial.println(client->readString());
         // }

         // File found at server (HTTP 200, 301), or HTTP 400 with response payload
         if (httpCode == HTTP_CODE_OK || httpCode == HTTP_CODE_MOVED_PERMANENTLY || httpCode == HTTP_CODE_BAD_REQUEST) {
             // Parse JSON data

++ String Myanswer = client->readString(); ++ int firstOpeningBracket = Myanswer.indexOf('{'); ++ int lastClosingBracket = Myanswer.lastIndexOf('}'); ++ String Mycopy = Myanswer.substring (firstOpeningBracket ,lastClosingBracket +1); // Parse JSON data -- DeserializationError error = deserializeJson(doc, *client); ++ DeserializationError error = deserializeJson(doc, Mycopy);

            client->stop();
             delete client;
             client = NULL;

             if (error) {
MegaWatt71 commented 3 years ago

PS: Might be, that this is something what happens, if you have more than one busy lights for the same account?! I've not seen MS Graph answering with additional crap, when there is only one authorization. Unfortunately I could not retest, because I'm running out of MS accounts ;-) Just revoking the app authority in MS account settings and re-authenticate only one account did not change anything.

Just guessing...

toblum commented 3 years ago

Hi @MegaWatt71,

thank you for your feedback. I experienced this today the first time, no problems before. Today it stopped working also here. So it must be something that was rolled out over the last days over the different servers,

For me the result from the client looks like this: d2 {"@odata.context":"https://graph.microsoft.com/v1.0/$metadata#users('abc')/presence/$entity","id":"abc","availability":"Away","activity":"Away"} 0 This happens also for other Graph calls. I have currently no idea where this originates from. I suspect that it has to do with the deprecation of some older SSL/TLS protocols and the move to newer alternatives. I've seen this causing this trouble in different places over the last few days. Maybe the very basic SSL/TLS implementation in the ESP isn't able to handle this new encryption method well. But I may be wrong with that.

Maybe trying to remove the junk is a solution for the meantime. I'll look into this. Help always appreciated.

Greetings Tobias

MegaWatt71 commented 3 years ago

Hallo Tobias,

yepp, I see exactly the same in the answer from graph. I did a logging over night and the junk changes over time. I see: d2 {"@odata.context":"https://graph.microsoft.com/v1.0/$metadata...:"Away"} 0 after some hours: d5 {"@odata.context":"https://graph.microsoft.com/v1.0/$metadata...:"Away"} 0 or dc {"@odata.context":"https://graph.microsoft.com/v1.0/$metadata...:"Away"} 0 or d8 {"@odata.context":"https://graph.microsoft.com/v1.0/$metadata...:"Away"} 0

But the good thing is, that all the junk was successfully thrown out and the system worked stable. Unfortunately I do not know enough about the SSL implementation in ESP to be of great help here... I will continue to log over the weekend and will let you know if there were any issues with my "interims junk removal" patch.

PS: I still love this project! It is of such great help for me and my family! Since I have the busy light we had no escalations about the noise level when I'm in Meetings!

clmattm commented 3 years ago

@MegaWatt71

I made the changes to request_handler.h that you specified and so far, so good. Thanks for the workaround!

toblum commented 3 years ago

Hi all,

this took me quite long to figure out and I needed to go completely down the HTTP stack and use fiddler to find at least out what triggers this problem and why it works in the browser:
It has to do with HTTP compression. When the ESP32 sends a request via HttpClient, it doesn't send a "Accept-Encoding" header. That means that the server shouldn't do any encoding/compression of the result. In that case the Graph API returns that extra garbage around the JSON. When the browser sends the request, it adds the "Accept-encoding: gzip, ..." header which makes the Graph API return the result gzipped and this data is OK.

It's easy to reproduce with curl, so it has nothing directly to do with the ESP: image

And compressed:
image

This looks like a malfunction on Graph side to me. Correct me, if I'm wrong, but the result shouldn't contain these characters even if no encoding is requested. I'm unsure where I could ask for help with this. It seems that there were similar problems in the past.

In the meanwhile I'll see if I can use gzip compression with the ESP32 (unsure if that works well) or implement a workaround like yours (but that doubles the amount of memory needed to precess the result).

Greetings Tobias

toblum commented 3 years ago

Hi guys,

I must correct myself, I was almost completely on the wrong track.

The supposed garbage is actually Chunked transfer encoding and a feature of HTTP 1.1. The first number here indicates the length of the chunk, which is why it changes. The feature allows servers to deliver content in multiple chunks without knowing in advance how large it will end up. This is one of those network layers that I rarely deal with, which is why I didn't recognize it right away. Someone else will say: That's obvious! :-)

So the Graph API works absolutely correctly, they just recently started using this feature.

In fact, all clients that want to support HTTP 1.1 must also support chunked transfer encoding. Unfortunately the HTTP library of the ESP32 does not. This also leads directly to the solution: Using HTTP 1.0

In my tests this worked fine and I will publish an updated release tonight. Maybe you can test it as well.

Many greetings Tobias

toblum commented 3 years ago

Just released a version with the fix (switch to HTTP/1.0) and updated dependencies. Worked for me quite well today. If someone of you likes to test, please let me know if it's the same for you.

Greetings

MegaWatt71 commented 3 years ago

Perfect! Thanks Tobias! I just updated my lights. All started ok. I'll log over night to see if there are issues.

BR Andreas