edicl / drakma

HTTP client written in Common Lisp
http://edicl.github.io/drakma/

unidentified bug causing short read of chunked data #108

Open ghost opened 3 years ago

ghost commented 3 years ago

The problem: (drakma:http-request "http://digidb.io/digimon-list/") signals an input-chunking-unexpected-end-of-file error.

Details: After poking around the code, I found that it is indeed a short read by (read-sequence input-buffer inner-stream :start 0 :end chunk-size) in chunga::fill-buffer. (https://github.com/edicl/chunga/blob/cb333cdba178e99b03fa60e2caa8c5d3654201d8/input.lisp#L145)

The number of bytes read is consistent most of the time, but it sometimes varies.

The bytes that are read are valid, and another (read-sequence big-buffer chunga::inner-stream) in the debugger successfully reads the remaining content left in the stream.

How to reproduce: This error is only reproducible on Android, where only ECL is available. It can be reproduced both in the cl-repl app (available in the Play Store) and with ECL compiled in Termux (a terminal for Android). It is not reproducible with curl or other tools on the same device.

This error is only observed with a remote server, but I only tested with "http://digidb.io/digimon-list/", so I am uncertain about other servers. On the other hand, replaying the response on localhost does not reproduce the error.

This error is only reproducible when using drakma:http-request without :want-stream t. With :want-stream t, none of the following reproduces the error (and the request is successful): 1) using read-sequence directly on the returned stream, 2) using drakma:read-body on the returned http-stream, or 3) using drakma::%read-body on the returned stream.

stassats commented 3 years ago

The android requirement makes this harder to diagnose (don't have an android device).

ghost commented 3 years ago

Even though I have one, the diagnosis process so far has been really painful.

I have currently run out of ideas. I cannot think of a single reason why drakma:read-body with :want-stream works, but direct use of drakma:http-request doesn't.

Do you have any suggestions for how I can investigate this problem further?

Thanks

edit: wording

stassats commented 3 years ago

Why does it do only one read-sequence and then give up?

ghost commented 3 years ago

chunga::fill-buffer reads the whole chunk, which is very big (bigger than +buffer-size+), in a single call and compares the number of bytes read with chunk-size. That single read-sequence, for some reason, does not read the whole chunk, which causes the error.

On Thu, Dec 10, 2020, 2:47 AM Stas Boukarev notifications@github.com wrote:

Why does it do only one read-sequence and then give up?


stassats commented 3 years ago

read() is not guaranteed to read everything up to EOF. I'll make a change to chunga for you to try.
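For readers following along, here is a minimal sketch of the kind of retry loop this implies. It is not the actual chunga change. The names `read-fully` and `make-short-reader` are hypothetical, and the `:reader` parameter exists only so that a short read can be simulated with a standard string stream.

```lisp
;; Hypothetical helper: keep calling READER until the requested range
;; of BUFFER is filled, or until a call makes no progress (real EOF).
(defun read-fully (buffer stream &key (start 0) (end (length buffer))
                                      (reader #'read-sequence))
  "Return the index of the first element of BUFFER that was not updated."
  (loop with pos = start
        while (< pos end)
        do (let ((new-pos (funcall reader buffer stream :start pos :end end)))
             (when (= new-pos pos)      ; nothing was read: give up
               (return pos))
             (setf pos new-pos))
        finally (return pos)))

;; Hypothetical stand-in for a stream whose READ-SEQUENCE returns early:
;; it reads at most LIMIT elements per call.
(defun make-short-reader (limit)
  (lambda (buffer stream &key (start 0) (end (length buffer)))
    (read-sequence buffer stream
                   :start start
                   :end (min end (+ start limit)))))

;; Usage: every underlying call reads at most 4 characters,
;; but the loop still fills the whole buffer.
(let ((buffer (make-string 11))
      (in (make-string-input-stream "hello world")))
  (read-fully buffer in :reader (make-short-reader 4))
  buffer)
```

A caller that compares the returned index with the chunk size would then only report an unexpected end of file when the stream really has nothing more to give.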

stassats commented 3 years ago

Although I'm not sure how READ-SEQUENCE is implemented on ECL.

stassats commented 3 years ago

Can you try https://github.com/edicl/chunga/commit/e83c804fc22db666870d05dd38b9b5bd37a2558f

And I hope I got the logic right, as I don't have anything to test it on.

ghost commented 3 years ago

Yes, it works.

I am still curious why using :want-stream t + read-body does not trigger the bug, though.

Thanks for the fix.

On Thu, Dec 10, 2020, 3:40 AM Stas Boukarev notifications@github.com wrote:

Can you try edicl/chunga@e83c804 https://github.com/edicl/chunga/commit/e83c804fc22db666870d05dd38b9b5bd37a2558f

And I hope I got the logic right, as I don't have anything to test it on.


stassats commented 3 years ago

I'm not sure it's the right fix, maybe ECL implements read-sequence incorrectly. CLHS says that read-sequence returns fewer elements only upon reaching EOF.
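As a small illustration of the behaviour described by the CLHS, the sketch below reads from a standard string stream that holds fewer characters than the buffer. The function name `read-sequence-at-eof` is hypothetical and only serves the example; it says nothing about how ECL behaves on a socket stream.

```lisp
;; Hypothetical demo: the source holds 3 characters, the buffer has room for 8.
;; READ-SEQUENCE stops at end of file and returns the index of the first
;; element of the buffer that was not updated.
(defun read-sequence-at-eof ()
  (let ((buffer (make-string 8 :initial-element #\.)))
    (with-input-from-string (in "abc")
      (let ((position (read-sequence buffer in)))
        (values position buffer)))))

(read-sequence-at-eof)
;; => 3, "abc....."
```

Under that reading of the standard, a return value smaller than the requested end means the stream is exhausted, which is why a single read-sequence call was treated as sufficient in the old code.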

ghost commented 3 years ago

Yeah, I suspect that too.

However, with almost the same call stack, directly calling read-body on the http-stream returned by http-request with :want-stream t seems to be able to use the old code (read-sequence) successfully.

On Thu, Dec 10, 2020, 11:02 AM Stas Boukarev notifications@github.com wrote:

I'm not sure it's the right fix, maybe ECL implements read-sequence incorrectly. CLHS says that read-sequence returns fewer elements only upon reaching EOF.
