Fixed parsing of data that is Transfer-Encoding chunked

jmagic commented 1 year ago

Hi Dave,

Here are the changes for chunked encoding.

Thanks, Jim

DavidVine commented 1 year ago

Thanks Jim.

I realised after we made those changes the other day that the httpParseResponse function is not the right place to be putting these changes.

What we ran into with multi-chunked data in the body wasn't actually a bug. The httpParseResponse function is working correctly. It's job is to take HTTP responses in string form and build out a httpResponse object which represents the HTTP response as it is.

In the case of passing a string that contains a HTTP response with multi-chunk data in the body the httpResponse object parameter ends up with a .body member containing the multi-chunk data.

The correct thing to do (I feel) would then be to first parse the incoming HTTP string into a httpResponse object using httpParseResponse and then check if the httpResponse object contains a 'Transfer-Encoding' header with the value 'chunked'. If so, the data in the body can be de-chunked.

I'm going to add a couple of extra methods to the http.axi include file to make this process easier such as a function to quickly check if the data is chunked and unchunk the data using the code we created the other day.

Thanks for your help with this!

jmagic commented 1 year ago

Hi Dave,

Sounds good. I disagree that the httpParseResponse function is working correctly. It currently removes the end of the chunked message from the body at line 529. This means even if you wanted to handle de-chunking yourself, the end of the chunk would be removed. I would recommend the current chunk handling code be removed.

Thanks, Jim M

DavidVine commented 1 year ago

Do you mean this line:

tempResponse.body = remove_string(workingBuffer,"$0D,$0A,'0',$0D,$0A,$0D,$0A",1);

This isn't just removing the "$0d,$0a,'0',$0d,$0a,$0d,$0a" part from workingBuffer.

It's removing everything in workingBuffer starting from index 1 up to and including "$0d,$0a,'0',$0d,$0a,$0d,$0a" and assigning everything that was removed to tempResponse.body. So this should be assigning all the chunked data and the trailing "$0d,$0a,'0',$0d,$0a,$0d,$0a" to tempResponse.body.

I will test it to make sure but it looks right to me.

DavidVine commented 1 year ago

Hi Jim,

I disagree that the httpParseResponse function is working correctly. It currently removes the end of the chunked message from the body at line 529.

I have tested the httpParseResponse function and confirmed that it does not remove the end of the chunked message from the body.

For example, this is the output when calling httpResponseToDebug on the HttpResponse structure after parsing a HTTP response containing chunked-encoded data with httpParseResponse:

----- BEGIN HTTP RESPONSE -----
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: text/plain
Connection: Closed

23$0D$0AThis is the first chunk$0D$0A24$0D$0AThis is the second chunk$0D$0A0$0D$0A$0D$0A
----- END HTTP RESPONSE -----

You can see that the body of the parsed HTTP response contains the 2 x chunks and their preceding lengths 23$0D$0AThis is the first chunk$0D$0A and 24$0D$0AThis is the second chunk$0D$0A followed by the terminating chunk 0$0D$0A and then the final carriage return line feed $0D$0A.

DavidVine / amx-util-library

Fixed parsing of data that is Transfer-Encoding chunked #7