nodeca / probe-image-size

Get image size without full download. Supported image types: JPG, GIF, PNG, WebP, BMP, TIFF, SVG, PSD, ICO.
MIT License

Unpredictable socket hang ups with status code ECONNRESET #46

Closed styxlab closed 3 years ago

styxlab commented 3 years ago

Description

Repeatedly probing the size of a remote image always leads to a socket hang up after a couple of minutes. The number of probes it takes until the socket hang up occurs varies, but it always occurs in my test if I wait long enough. Sometimes it takes just a handful of iterations, other times several hundred. Both the client and the server seem to be waiting until the client gives up with the following error message:

Error: socket hang up
    at connResetException (node:internal/errors:629:14)
    at TLSSocket.socketCloseListener (node:_http_client:451:25)
    at TLSSocket.emit (node:events:341:22)
    at node:net:655:12
    at TCP.done (node:_tls_wrap:564:7) {
  code: 'ECONNRESET'
}

The server is serving a static file with nginx and the logs do not show anything peculiar:

2020/11/22 00:06:55 [warn] 213383#0: *1712 an upstream response is buffered to a temporary file /var/lib/nginx/tmp/proxy/5/85/0000000855 while reading upstream, client: 37.201.193.119, server: static.gotsby.org, request: "GET /v1/assets/images/creating-a-custom-theme.png HTTP/1.1", upstream: "http://127.0.0.1:7000/v1/assets/images/creating-a-custom-theme.png", host: "static.gotsby.org"
2020/11/22 00:06:55 [info] 213383#0: *1712 client prematurely closed connection (104: Connection reset by peer) while sending to client, client: 37.201.193.119, server: static.gotsby.org, request: "GET /v1/assets/images/creating-a-custom-theme.png HTTP/1.1", upstream: "http://127.0.0.1:7000/v1/assets/images/creating-a-custom-theme.png", host: "static.gotsby.org"
2020/11/22 00:06:55 [warn] 213383#0: *1714 an upstream response is buffered to a temporary file /var/lib/nginx/tmp/proxy/6/85/0000000856 while reading upstream, client: 37.201.193.119, server: static.gotsby.org, request: "GET /v1/assets/images/creating-a-custom-theme.png HTTP/1.1", upstream: "http://127.0.0.1:7000/v1/assets/images/creating-a-custom-theme.png", host: "static.gotsby.org"
2020/11/22 00:06:55 [info] 213383#0: *1714 client prematurely closed connection (32: Broken pipe) while sending to client, client: 37.201.193.119, server: static.gotsby.org, request: "GET /v1/assets/images/creating-a-custom-theme.png HTTP/1.1", upstream: "http://127.0.0.1:7000/v1/assets/images/creating-a-custom-theme.png", host: "static.gotsby.org"

The client error comes 60 seconds after the last server log message. Note that the error always follows a "(32: Broken pipe)" entry, although I also see "(104: Connection reset by peer)" entries. However, the broken pipe message also appears in cases that do not lead to a socket hang up.

How to reproduce

git clone https://github.com/styxlab/image-probe-test.git
cd image-probe-test
yarn
node test

This will run an infinite loop probing an image on a remote server that I control. The test can be changed to probe an image on localhost. However, the issue did not occur when testing on localhost.

Other info

I checked that the firewall is not getting in the way by disabling it completely.

Client Environment

Linux home 5.9.8-100.fc32.x86_64 node v15.2.1

Thanks for this great repo! -- styxlab

puzrin commented 3 years ago

> The client error comes 60 seconds after the last server log message

See https://github.com/nodeca/probe-image-size/blob/master/http.js#L15-L16

> This will run an infinite loop probing an image on a remote server that I control. The test can be changed to probe an image on localhost. However, the issue did not occur when testing on localhost.

What's the problem then? You can NOT rely on network stability. Sometimes shit happens. This package does the optimal thing: it times out instead of hanging forever.

  1. You can try previous major releases, based on request or got, but I don't believe it will help.
  2. You can write your own download wrapper with any other agent (but I think needle is ok).

IMO nothing to fix yet. Such things are usually solved by repeating the request.
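A minimal sketch of what repeating the request could look like. `withRetry`, its options, and the list of error codes treated as transient are hypothetical and not part of probe-image-size; the helper only assumes that the wrapped call returns a promise and that failures carry a Node.js-style `code` property such as `ECONNRESET`.

```javascript
'use strict';

// Hypothetical helper, not part of probe-image-size.
// Error codes treated as transient here are an assumption; adjust as needed.
const TRANSIENT_CODES = new Set(['ECONNRESET', 'ETIMEDOUT', 'ECONNREFUSED', 'EPIPE']);

function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Calls `operation` until it succeeds, a non-transient error occurs,
// or `retries` additional attempts have been used up.
async function withRetry(operation, { retries = 3, baseDelayMs = 200 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      // Rethrow errors that do not look transient, and the last failure.
      if (!TRANSIENT_CODES.has(err.code) || attempt >= retries) throw err;
      // Exponential backoff: baseDelayMs, then 2x, 4x, ...
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
}
```

With the package installed, this might be used as `await withRetry(() => probe(url))`, where `probe` is `require('probe-image-size')` and `url` is the image to probe. Retrying only hides occasional drops; it does not explain a high failure rate.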

styxlab commented 3 years ago

@puzrin : Thanks so much for your prompt replies and your insights and suggestions. I agree that there is nothing to fix yet, though I am still puzzled by the fact that the connection breaks are so frequent. A connection break every few hundred image probes is too high (one failure per million would be acceptable), and I do wonder if it could have something to do with the way the stream is terminated. Yes, retries can be done, but this is only an option if the general stability of the method is acceptable.

These types of issues are really hard to debug, and I do not know how far I will get with this. What I will try is testing the streams on the same image with some other packages as you already suggested. In any case, thanks for looking into this!

puzrin commented 3 years ago

Let's close then (but we still can continue and reopen if new info appears)

> A connection break every few hundred image probes is too high (one failure per million would be acceptable), and I do wonder if it could have something to do with the way the stream is terminated.

That's unpleasant, but still possible. For example, when I transfer a 100 GB archive between Hetzner servers over FTP, it always breaks. I have to use SSH. Note that this is an internal 1 Gbps network of good quality.

> These types of issues are really hard to debug, and I do not know how far I will get with this. What I will try is testing the streams on the same image with some other packages as you already suggested. In any case, thanks for looking into this!

This is done by narrowing down the location of the bug. The problem may be with the server setup, the intermediate network path, your local network, this package, and so on.

  1. Try a different server AND different client hosts (with different networks/providers).
  2. Try different wrappers (your own or a previous version of this package), but I don't believe this will help.
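To compare the combinations above, it may help to measure the failure rate of each one in the same way. The following is a rough sketch under stated assumptions: `measureFailureRate` is a hypothetical helper, not part of this package, and `probeFn` stands for whatever client call is being tested (for example a function that probes one fixed URL).

```javascript
'use strict';

// Hypothetical harness: runs `probeFn` sequentially `iterations` times
// and tallies failures by error code.
async function measureFailureRate(probeFn, iterations) {
  const errorsByCode = {};
  let failures = 0;
  for (let i = 0; i < iterations; i++) {
    try {
      await probeFn();
    } catch (err) {
      failures++;
      // Errors without a `code` property are grouped under 'UNKNOWN'.
      const code = err.code || 'UNKNOWN';
      errorsByCode[code] = (errorsByCode[code] || 0) + 1;
    }
  }
  return { iterations, failures, rate: failures / iterations, errorsByCode };
}
```

Running the same harness from different client hosts and against different servers gives comparable numbers, which makes it easier to see whether the failures follow the server, the network path, or the client.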

My point is, I can investigate the problem more deeply only if your demo fails often with a local server. But as you said, local testing is ok.

styxlab commented 3 years ago

I identified my ISP as responsible for the connection drops, so this is not a problem with this package. Thanks again for looking into this issue.