nicolasff / webdis

A Redis HTTP interface with JSON output
https://webd.is
BSD 2-Clause "Simplified" License
2.82k stars 307 forks source link

Read all available data in http_client_read #145

Closed mdmoss closed 1 year ago

mdmoss commented 6 years ago

When reading packets larger than the buffer size (currently 4096 bytes) http_client_read currently leaves bytes unread. This causes connections to hang, as Webdis waits for the rest of the message in the next packet that's never coming.

The fix: until there's nothing left to read, keep reading. The error condition now has to check explicitly for EWOULDBLOCK, which indicates there's nothing left to read.

nicolasff commented 2 years ago

Hi @mdmoss,

This issue was eventually addressed by PR #195 and verified to be addressed during the discussion on #197. Unfortunately the patch as submitted in this PR – while correct in itself – leads to every single request hanging when webdis was run on macOS. This was because this code assumes the client socket is non-blocking, which is necessary for read(2) to return EWOULDBLOCK. While this was the case on Linux, the ioctl(2) call made in server.c to mark the client socket as non-blocking immediately after receiving it from accept(2) was written incorrectly. It read:

    char on = 1;
    // [...]
    ioctl(client_fd, (int)FIONBIO, (char *)&on);

instead of the expected:

    int on = 1;
    // [...]
    ioctl(client_fd, FIONBIO, &on);

The cast to int when it was supposed to be an unsigned long was enough to make it fail on macOS (while still succeeding on Linux), which meant the client socket was never made non-blocking so in combination with your patch this caused the second read(2) to hang if no data was available (which is the case in most requests since pretty much all GETs will be under 4 KB).

When this PR was filed I do remember trying out the change and not getting very far with all tests failing immediately on my Mac. I should have investigated further but I know for a fact that this was a busy work period and lost track of it. I didn't realize that this was a serious issue until #197 was filed with a clear reproduction case using Docker, and you'll see from this conversation the work that went into investigating it and evaluating this very patch.

Thanks @mdmoss for taking the time to investigate this issue, write this patch, and submit it. I'm sorry I didn't look into it more closely at the time.

As of yesterday, webdis 0.1.16 is out with a fix for this issue. A test was also added in #195 that exercises this exact case to avoid future regressions, uploading a 10,000-byte file using a PUT request. The ioctl(2) call was fixed in 950c5153d7f351e095127937a931c57b7e1bdfab.

mdmoss commented 1 year ago

Hi @nicolasff,

I'd somewhat given up on this PR, and missed the notification from your response - this is the first time I've checked back in a long while.

Thanks for trying it out, and for your clear explanation of where things went wrong. Lesson learned for next time (for me): include a failing test case to make the need obvious. Don't stress the timeline, I'm just glad to know it was useful in the end.

I'll close this PR for clarity.

Thanks for your continued stewardship of Webdis!