application-research / estuary

A custom IPFS/Filecoin node that makes it easy to pin IPFS content and make Filecoin deals.
https://docs.estuary.tech
Other
239 stars 67 forks source link

error reading from http response stream: unexpected EOF #391

Open stuberman opened 2 years ago

stuberman commented 2 years ago

As an SP receiving deals from Estuary I have been seeing many of these errors in the last few days.

Daemon: 1.17.0+mainnet+git.2830429ad+api1.5.0 Local: lotus-miner version 1.17.0+mainnet+git.2830429ad boostd version 1.3.0-rc1+git.b85d1b9

Field Data
Deal af0d5ce2-197f-4a79-8509-ba95ad9fe688 CreatedAt2022-08-05 06:42:34.742 (1h ago)
Client Address f3vnq2cmwig3qjisnx5hobxvsd4drn4f54xfxnv4tciw6vnjdsf5xipgafreprh5riwmgtcirpcdmi3urbg36a
Client Peer ID 12D3KooWCVXs8P7iq6ao4XhfAmKWrEeuKFWCJgqe9jGDMTqHYBjw
Signed Proposal CID bafyreiavhrysew6dglbaqfftvhhaxxusoreqis432e4dhke6o7ivzmccw4
Deal Data Root CID bafybeicomvq3irujc5uq5pgt4ryssbyxwzsqbjmf7dmhkoinb2i453wig4
Verified Yes
Piece CID baga6ea4seaqnqimy2ett52jhmy7iwaaj35b4u2neh5s2nhdfh7kwza3tytcjgji
Piece Size 1 GiB (1,073,741,824 bytes)
Client Collateral 0 atto
Provider Collateral 223,966.4 nano
Storage Price / epoch / GiB 0 atto
Current Epoch 2,046,458
Start Epoch 2,066,485 (in 7 days)
End Epoch 3,561,205 (in a year)
Duration 1,494,720 (518 days)
Storage Fee 0 atto (Price per epoch x Piece Size x Duration)
Transfer Mode Online
Transfer Type libp2p
Transfer Size 1004.3 MiB (1,053,033,587 bytes)
Transfer Params URL: libp2p:///ip4/147.28.156.33/udp/6746/quic/p2p/12D3KooWHQyvwRKxALCRZ6LB47grReJV8stVqhfHvJt1uM1Gmysn
Transferred 59 B (59 bytes)
Inbound File Path /market/.boost/incoming/af0d5ce2-197f-4a79-8509-ba95ad9fe688.download
Checkpoint Accepted (1h ago)
Status Paused at 'Transferring 0%': data-transfer failed: could not finish transfer even after 15 attempts, lastErr: error reading from http response stream: unexpected EOF

Deal Logs

2022-08-05 06:42:35.055 26ms
http-transport: started async http transfer 2022-08-05 06:42:35.055 0ms http-transport: sending http request range-rq: bytes=0- received: 0 remaining: 1,053,033,587 2022-08-05 06:42:35.372 317ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-08-05 06:42:35.401 29ms
http-transport: some data was transferred before connection error, so resetting backoff to zero transferred: 59 2022-08-05 06:42:35.432 31ms
http-transport: backing off before retrying http request attempts: 1 backoff time: 5s 2022-08-05 06:42:40.432 5s
http-transport: back-off complete, retrying http request backoff time: 5s 2022-08-05 06:42:40.464 32ms
http-transport: sending http request range-rq: bytes=59- received: 59 remaining: 1,053,033,528 2022-08-05 06:42:40.758 294ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-08-05 06:42:40.784 26ms
http-transport: backing off before retrying http request attempts: 2 backoff time: 6.245044185s 2022-08-05 06:42:47.029 6s
http-transport: back-off complete, retrying http request backoff time: 6.245044185s 2022-08-05 06:42:47.054 25ms
http-transport: sending http request range-rq: bytes=59- received: 59 remaining: 1,053,033,528 2022-08-05 06:42:47.335 281ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-08-05 06:42:47.367 32ms
http-transport: backing off before retrying http request attempts: 3 backoff time: 6.75590431s 2022-08-05 06:42:54.123 7s
http-transport: back-off complete, retrying http request backoff time: 6.75590431s 2022-08-05 06:42:54.149 26ms
http-transport: sending http request range-rq: bytes=59- received: 59 remaining: 1,053,033,528 2022-08-05 06:42:54.446 297ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-08-05 06:42:54.471 25ms
http-transport: backing off before retrying http request attempts: 4 backoff time: 10.083686175s 2022-08-05 06:43:04.555 10s http-transport: back-off complete, retrying http request backoff time: 10.083686175s 2022-08-05 06:43:04.584 29ms
http-transport: sending http request range-rq: bytes=59- received: 59 remaining: 1,053,033,528 2022-08-05 06:43:04.874 290ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-08-05 06:43:04.896 22ms
http-transport: backing off before retrying http request attempts: 5 backoff time: 18.326790903s 2022-08-05 06:43:23.223 18s http-transport: back-off complete, retrying http request backoff time: 18.326790903s 2022-08-05 06:43:23.242 19ms
http-transport: sending http request range-rq: bytes=59- received: 59 remaining: 1,053,033,528 2022-08-05 06:43:23.516 274ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-08-05 06:43:23.541 25ms
http-transport: backing off before retrying http request attempts: 6 backoff time: 12.242783997s 2022-08-05 06:43:35.784 12s http-transport: back-off complete, retrying http request backoff time: 12.242783997s 2022-08-05 06:43:35.811 27ms
http-transport: sending http request range-rq: bytes=59- received: 59 remaining: 1,053,033,528 2022-08-05 06:43:36.091 280ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-08-05 06:43:36.132 41ms
http-transport: backing off before retrying http request attempts: 7 backoff time: 6.629038897s 2022-08-05 06:43:42.761 7s
http-transport: back-off complete, retrying http request backoff time: 6.629038897s 2022-08-05 06:43:42.788 27ms
http-transport: sending http request range-rq: bytes=59- received: 59 remaining: 1,053,033,528 2022-08-05 06:43:43.069 281ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-08-05 06:43:43.095 26ms
http-transport: backing off before retrying http request attempts: 8 backoff time: 40.484467045s 2022-08-05 06:44:23.580 40s http-transport: back-off complete, retrying http request backoff time: 40.484467045s 2022-08-05 06:44:23.598 18ms
http-transport: sending http request range-rq: bytes=59- received: 59 remaining: 1,053,033,528 2022-08-05 06:44:23.870 272ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-08-05 06:44:23.895 25ms
http-transport: backing off before retrying http request attempts: 9 backoff time: 1m18.024394497s 2022-08-05 06:45:41.919 1m
http-transport: back-off complete, retrying http request backoff time: 1m18.024394497s 2022-08-05 06:45:41.942 23ms
http-transport: sending http request range-rq: bytes=59- received: 59 remaining: 1,053,033,528 2022-08-05 06:45:42.218 276ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-08-05 06:45:42.257 39ms
http-transport: backing off before retrying http request attempts: 10 backoff time: 2m9.253215406s 2022-08-05 06:47:51.510 2m
http-transport: back-off complete, retrying http request backoff time: 2m9.253215406s 2022-08-05 06:47:51.535 25ms
http-transport: sending http request range-rq: bytes=59- received: 59 remaining: 1,053,033,528 2022-08-05 06:47:51.817 282ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-08-05 06:47:51.841 24ms
http-transport: backing off before retrying http request attempts: 11 backoff time: 1m26.265837538s 2022-08-05 06:49:18.107 1m
http-transport: back-off complete, retrying http request backoff time: 1m26.265837538s 2022-08-05 06:49:18.131 24ms
http-transport: sending http request range-rq: bytes=59- received: 59 remaining: 1,053,033,528 2022-08-05 06:49:18.407 276ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-08-05 06:49:18.438 31ms
http-transport: backing off before retrying http request attempts: 12 backoff time: 1m19.917433556s 2022-08-05 06:50:38.356 1m
http-transport: back-off complete, retrying http request backoff time: 1m19.917433556s 2022-08-05 06:50:38.379 23ms
http-transport: sending http request range-rq: bytes=59- received: 59 remaining: 1,053,033,528 2022-08-05 06:50:38.656 277ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-08-05 06:50:38.687 31ms
http-transport: backing off before retrying http request attempts: 13 backoff time: 3m54.631373795s 2022-08-05 06:54:33.319 4m
http-transport: back-off complete, retrying http request backoff time: 3m54.631373795s 2022-08-05 06:54:33.346 27ms
http-transport: sending http request range-rq: bytes=59- received: 59 remaining: 1,053,033,528 2022-08-05 06:54:33.626 280ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-08-05 06:54:33.655 29ms
http-transport: backing off before retrying http request attempts: 14 backoff time: 7m3.939806832s 2022-08-05 07:01:37.595 7m
http-transport: back-off complete, retrying http request backoff time: 7m3.939806832s 2022-08-05 07:01:37.618 23ms
http-transport: sending http request range-rq: bytes=59- received: 59 remaining: 1,053,033,528 2022-08-05 07:01:38.401 783ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-08-05 07:01:38.427 26ms
http-transport: terminating http transfer: exhausted max attempts err: error reading from http response stream: unexpected EOF maxAttempts: 15 2022-08-05 07:01:38.453 26ms
deal paused because of recoverable error checkpoint: Accepted err: data-transfer failed: could not finish transfer even after 15 attempts, lastErr: error reading from http response stream: unexpected EOF retry: manual 2022-08-05 07:01:38.505 52ms

en0ma commented 2 years ago

@stuberman I have looked at this, This is a boost issue. I will create an issue in Boost with a possible fix.

So basically, Boost HTTP file transfer from Estuary expects every GET request to download car bytes to have a min body size of 1Mb. But that might not always be the case as we can have blocks with sizes smaller than that, in such cases, it will error with unexpected EOF when it tries to read a 1MB from a response body of 59b.

This has to be fixed in Boost; which is to add a check for unexpected EOF error and continue to read the next byte streams until EOF

SBudo commented 2 years ago

Hi @en0ma Any progress with the boost team on this? Getting loads of these errors as well :-(

en0ma commented 1 year ago

@stuberman I have looked at this, This is a boost issue. I will create an issue in Boost with a possible fix.

So basically, Boost HTTP file transfer from Estuary expects every GET request to download car bytes to have a min body size of 1Mb. But that might not always be the case as we can have blocks with sizes smaller than that, in such cases, it will error with unexpected EOF when it tries to read a 1MB from a response body of 59b.

This has to be fixed in Boost; which is to add a check for unexpected EOF error and continue to read the next byte streams until EOF

After trying to replicate this, I can say my above assumptions are not correct. The actual issue seems to come from some of the shuttles crashing during data transfer. We are working to stabilize these shuttles and mitigate this issue finally.

So sorry for the delay

jcace commented 1 year ago

Hey @en0ma , I'm starting to investigate this. Still seeing a lot of those unexpected EOF messages on estuary deals to my SP node. Any new developments on the source of this issue?

gmelodie commented 1 year ago

@whyrusleeping this might have something to do with our boost throttling issue?

TL;DR: boost does not rate limit so we end up with several concurrent deals that are slow (since there are so many at a time). None of them completes and they time out (EOF?)

jcace commented 1 year ago

@gmelodie That's part of it - we have seen some issues with this exact scenario, but in that case we usually get a context deadline exceeded error or something similar. This EOF error is caused by a deal being made referencing a CID that's not actually retrievable due to blockstore corruption issues. That should be fixed with the latest version of estuary @alvin-reyes is releasing plus the switch to FlatFS should make it more stable.

I'm working on a fix to prevent dealmaking with these unretrievable CIDs!

Aquaritek commented 1 year ago

I would just like to add my 2 cents here that I've just started getting deals from Estuary and pretty much all (99%) of the deals are erroring in this way over the last day now. image

stuberman commented 1 year ago

FYI - I am still seeing this even after upgrading to

lotus-miner version Daemon: 1.19.0-rc2+mainnet+git.2520b1644+api1.5.0 Local: lotus-miner version 1.19.0-rc2+mainnet+git.2520b1644 boostd -version boostd version 1.5.1-rc2+git.2c5604e

SBudo commented 1 year ago

Same here... 99% of all deals are error'ed with this, so I had to suspend Estuary deals for now :-(

en0ma commented 1 year ago

Sorry for joining in late.

The major causes of this issue are;

There has been major work to stabilize the infra to mitigate the crashes and this effort is looking really good, the platform is slowly getting the needed uptime thanks to @alvin-reyes and @Zorlin

As for the corrupted data, I discussed with @alvin-reyes and @jlogelin that I will implement a sanity check just before a deal is made, no need to send bad data to SPs.

jcace commented 1 year ago

We should see this resolved with today's 0.2.0 release ref: https://github.com/application-research/estuary/pull/593

Please let us know if you're still seeing this come up!

Thanks to @en0ma for this fix 😄

stuberman commented 1 year ago

Please let us know if you're still seeing this come up!

How soon should we see this take effect? Do SPs need to do anything on our end to see this work now? (I am now running boostd 1.5.1-rc2+git.b5768c8)

Aquaritek commented 1 year ago

Wondering the same as stuberman

Here's the last 8hrs of deals from Estuary for me: image

jcace commented 1 year ago

Please let us know if you're still seeing this come up!

How soon should we see this take effect? Do SPs need to do anything on our end to see this work now? (I am now running boostd 1.5.1-rc2+git.b5768c8)

Nothing to do on the SP side. It should be live now, so you should not see any more of those unexpected EOF show up in your Boost deals queue from here on out - but do let us know if you continue to see them.

stuberman commented 1 year ago

The most recent deal failed 1 hr 30 minutes ago: (time is local EST - UTC -5)

2022-11-28 15:20:22.701 | 8ms | deal failed err: data-transfer failed: could not finish transfer even after 15 attempts, lastErr: error reading from http response stream: unexpected EOF

I will continue to monitor

Aquaritek commented 1 year ago

Alright,

I just got a couple of deals sent my way after about 8hrs of silence that are both throwing an EOF error:

image

Details for one of them:

CreatedAt 2022-11-28 19:22:14.515 (20m ago) Client Address f3vnq2cmwig3qjisnx5hobxvsd4drn4f54xfxnv4tciw6vnjdsf5xipgafreprh5riwmgtcirpcdmi3urbg36a Client Peer ID 12D3KooWCVXs8P7iq6ao4XhfAmKWrEeuKFWCJgqe9jGDMTqHYBjw Signed Proposal CID bafyreidwxendlxd4rkb3jyweipmniiodbqzswubxv7j4az3lfxw33s4lwq Deal Data Root CID bafy2bzaceduzqdufliy5yftuz2vyyjmof37ai25a4c4tb5d2y42so6hy7imxg Verified Yes Piece CID baga6ea4seaqpw55fa5wsr644lfpkkh2dvwirc6iwpeqxtii7fmr7jqwhfqxfcha Piece Size 16 GiB (17,179,869,184 bytes) Client Collateral 0 atto Provider Collateral 0.005 FIL Storage Price / epoch / GiB 0 atto Current Epoch 2,379,444 Start Epoch 2,399,564 (in 7 days) End Epoch 3,894,284 (in a year) Duration 1,494,720 (519 days) Storage Fee 0 atto (Price per epoch x Piece Size x Duration) Transfer Mode Online Transfer Type libp2p Transfer Size 8.8 GiB (9,437,019,293 bytes) Transfer Params URL: libp2p:///ip4/145.40.93.107/tcp/6747/ws/p2p/12D3KooWBUriTeu6YoJsuSg5gCvEecim9xKdZaW8fE54LUzmWJn7 Transferred 61 B (61 bytes) Inbound File Path /data/lotus/lotus-market/incoming/24d42133-0296-43ce-bba4-317a3c04f505.download Publish Message CID Chain Deal ID
Checkpoint Accepted (20m ago) Status Transfer stalled at 0%

2022-11-28 19:38:54.459 115ms http-transport: sending http requestrange-rq: bytes=61-received: 61remaining: 9,437,019,232 2022-11-28 19:39:25.080 |31s http-transport: http request errorhttp code: 0outputErr: error reading from http response stream: unexpected EOF 2022-11-28 19:39:25.159 79ms http-transport: backing off before retrying http requestattempts: 13backoff time: 10m0s

stuberman commented 1 year ago

I also got another failed deal about an hour ago:

Deal 3ae043b9-cce0-42ed-a4f0-3d1758bf21df
CreatedAt 2022-11-28 21:13:06.846 (1h ago)

f3vnq2cmwig3qjisnx5hobxvsd4drn4f54xfxnv4tciw6vnjdsf5xipgafreprh5riwmgtcirpcdmi3urbg36a 12D3KooWCVXs8P7iq6ao4XhfAmKWrEeuKFWCJgqe9jGDMTqHYBjw bafyreieyllxrpqwww5sfym5owyqjbibzkwsvpubp3pt337bbnvacddkkqq bafy2bzaceapw2o3ptixrz4j7ilfte2uivybbp3xhdkrx77pvkabqbp5nw3zek Yes baga6ea4seaqkiwlcouzt6ishbbs6f5kj4y4vuoqp64hrkr7cfc5vdddjywxpwja 16 GiB (17,179,869,184 bytes) 0 atto 0.005 FIL 0 atto 2,379,490 2,399,546 (in 7 days) 3,894,266 (in a year) 1,494,720 (519 days) 0 atto (Price per epoch x Piece Size x Duration) Online libp2p 8.8 GiB (9,438,585,756 bytes) URL: libp2p:///ip4/147.28.156.33/udp/6746/quic/p2p/12D3KooWQt6XqZ5XMkUiSqnfekep2X5o1GQxxKkKWj2mg1QYBXco 0 B (0 bytes) /market/.boost/incoming/3ae043b9-cce0-42ed-a4f0-3d1758bf21df.download     Complete (1h ago) Error: data-transfer failed: could not finish transfer even after 15 attempts, lastErr: error reading from http response stream: unexpected EOF

Deal 3ae043b9-cce0-42ed-a4f0-3d1758bf21df CreatedAt 2022-11-28 21:13:06.846 (1h ago) Client Address f3vnq2cmwig3qjisnx5hobxvsd4drn4f54xfxnv4tciw6vnjdsf5xipgafreprh5riwmgtcirpcdmi3urbg36a Client Peer ID 12D3KooWCVXs8P7iq6ao4XhfAmKWrEeuKFWCJgqe9jGDMTqHYBjw Signed Proposal CID bafyreieyllxrpqwww5sfym5owyqjbibzkwsvpubp3pt337bbnvacddkkqq Deal Data Root CID bafy2bzaceapw2o3ptixrz4j7ilfte2uivybbp3xhdkrx77pvkabqbp5nw3zek Verified Yes Piece CID baga6ea4seaqkiwlcouzt6ishbbs6f5kj4y4vuoqp64hrkr7cfc5vdddjywxpwja Piece Size 16 GiB (17,179,869,184 bytes) Client Collateral 0 atto Provider Collateral 0.005 FIL Storage Price / epoch / GiB 0 atto Current Epoch 2,379,490 Start Epoch 2,399,546 (in 7 days) End Epoch 3,894,266 (in a year) Duration 1,494,720 (519 days) Storage Fee 0 atto (Price per epoch x Piece Size x Duration) Transfer Mode Online Transfer Type libp2p Transfer Size 8.8 GiB (9,438,585,756 bytes) Transfer Params URL: libp2p:///ip4/147.28.156.33/udp/6746/quic/p2p/12D3KooWQt6XqZ5XMkUiSqnfekep2X5o1GQxxKkKWj2mg1QYBXco Transferred 0 B (0 bytes) Inbound File Path /market/.boost/incoming/3ae043b9-cce0-42ed-a4f0-3d1758bf21df.download Publish Message CID Chain Deal ID
Checkpoint Complete (1h ago) Status Error: data-transfer failed: could not finish transfer even after 15 attempts, lastErr: error reading from http response stream: unexpected EOF Deal Logs 2022-11-28 21:13:05.703
executing deal proposal received from network peer: 12D3KooWCVXs8P7iq6ao4XhfAmKWrEeuKFWCJgqe9jGDMTqHYBjw 2022-11-28 21:13:05.773 70ms
processing deal acceptance request 2022-11-28 21:13:06.569 796ms
tagged funds for deal tagged for deal collateral: 4,581,287,544,491,617 tagged for deal publish: 150,000,000,000,000,000 total available for collateral: 25,025,243,582,403,963,701 total available for publish: 14,774,372,619,989,516,467 total tagged for collateral: 13,285,486,243,847,196 total tagged for publish: 300,000,000,000,000,000 2022-11-28 21:13:06.814 245ms
created deal download staging file path: /market/.boost/incoming/3ae043b9-cce0-42ed-a4f0-3d1758bf21df.download 2022-11-28 21:13:06.871 57ms
inserted deal into deals DB 2022-11-28 21:13:06.904 33ms
deal accepted and scheduled for execution 2022-11-28 21:13:06.905 1ms deal execution initiated deal state: 2022-11-28 21:13:06.984 79ms
size of downloaded file filepath: /market/.boost/incoming/3ae043b9-cce0-42ed-a4f0-3d1758bf21df.download size: 0 2022-11-28 21:13:07.010 26ms
deal execution in progress 2022-11-28 21:13:07.039 29ms
deal queued for transfer transfer client id: 5,343,743 2022-11-28 21:13:14.801 8s
start deal data transfer transfer client id: 5,343,743 2022-11-28 21:13:14.827 26ms
http-transport: execute transfer deal size: 9,438,585,756 output file: /market/.boost/incoming/3ae043b9-cce0-42ed-a4f0-3d1758bf21df.download time before context deadline: 23h59m59.99998934s 2022-11-28 21:13:14.843 16ms
http-transport: existing file size deal size: 9,438,585,756 file size: 0 2022-11-28 21:13:14.869 26ms
http-transport: libp2p-http url multiaddr: /ip4/147.28.156.33/udp/6746/quic peer id: 12D3KooWQt6XqZ5XMkUiSqnfekep2X5o1GQxxKkKWj2mg1QYBXco url: libp2p://12D3KooWQt6XqZ5XMkUiSqnfekep2X5o1GQxxKkKWj2mg1QYBXco 2022-11-28 21:13:14.907 38ms
http-transport: started async http transfer 2022-11-28 21:13:14.907 0ms http-transport: sending http request range-rq: bytes=0- received: 0 remaining: 9,438,585,756 2022-11-28 21:13:15.685 778ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-11-28 21:13:15.710 25ms
http-transport: some data was transferred before connection error, so resetting backoff to zero transferred: 61 2022-11-28 21:13:15.745 35ms
http-transport: backing off before retrying http request attempts: 1 backoff time: 5s 2022-11-28 21:13:20.745 5s
http-transport: back-off complete, retrying http request backoff time: 5s 2022-11-28 21:13:20.771 26ms
http-transport: sending http request range-rq: bytes=61- received: 61 remaining: 9,438,585,695 2022-11-28 21:13:21.042 271ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-11-28 21:13:21.064 22ms
http-transport: backing off before retrying http request attempts: 2 backoff time: 6.850914477s 2022-11-28 21:13:27.915 7s
http-transport: back-off complete, retrying http request backoff time: 6.850914477s 2022-11-28 21:13:27.942 27ms
http-transport: sending http request range-rq: bytes=61- received: 61 remaining: 9,438,585,695 2022-11-28 21:13:28.210 268ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-11-28 21:13:28.238 28ms
http-transport: backing off before retrying http request attempts: 3 backoff time: 7.800400467s 2022-11-28 21:13:36.039 8s
http-transport: back-off complete, retrying http request backoff time: 7.800400467s 2022-11-28 21:13:36.065 26ms
http-transport: sending http request range-rq: bytes=61- received: 61 remaining: 9,438,585,695 2022-11-28 21:13:36.360 295ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-11-28 21:13:36.385 25ms
http-transport: backing off before retrying http request attempts: 4 backoff time: 7.558646648s 2022-11-28 21:13:43.944 8s
http-transport: back-off complete, retrying http request backoff time: 7.558646648s 2022-11-28 21:13:43.966 22ms
http-transport: sending http request range-rq: bytes=61- received: 61 remaining: 9,438,585,695 2022-11-28 21:13:44.235 269ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-11-28 21:13:44.258 23ms
http-transport: backing off before retrying http request attempts: 5 backoff time: 25.302045433s 2022-11-28 21:14:09.560 25s http-transport: back-off complete, retrying http request backoff time: 25.302045433s 2022-11-28 21:14:09.584 24ms
http-transport: sending http request range-rq: bytes=61- received: 61 remaining: 9,438,585,695 2022-11-28 21:14:09.854 270ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-11-28 21:14:09.881 27ms
http-transport: backing off before retrying http request attempts: 6 backoff time: 33.22081217s 2022-11-28 21:14:43.101 33s http-transport: back-off complete, retrying http request backoff time: 33.22081217s 2022-11-28 21:14:43.128 27ms
http-transport: sending http request range-rq: bytes=61- received: 61 remaining: 9,438,585,695 2022-11-28 21:14:43.387 259ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-11-28 21:14:43.411 24ms
http-transport: backing off before retrying http request attempts: 7 backoff time: 29.981562545s 2022-11-28 21:15:13.393 30s http-transport: back-off complete, retrying http request backoff time: 29.981562545s 2022-11-28 21:15:13.427 34ms
http-transport: sending http request range-rq: bytes=61- received: 61 remaining: 9,438,585,695 2022-11-28 21:15:13.688 261ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-11-28 21:15:13.716 28ms
http-transport: backing off before retrying http request attempts: 8 backoff time: 36.083393691s 2022-11-28 21:15:49.800 36s http-transport: back-off complete, retrying http request backoff time: 36.083393691s 2022-11-28 21:15:49.818 18ms
http-transport: sending http request range-rq: bytes=61- received: 61 remaining: 9,438,585,695 2022-11-28 21:15:50.548 730ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-11-28 21:15:50.568 20ms
http-transport: backing off before retrying http request attempts: 9 backoff time: 1m12.076068294s 2022-11-28 21:17:02.645 1m
http-transport: back-off complete, retrying http request backoff time: 1m12.076068294s 2022-11-28 21:17:02.652 7ms http-transport: sending http request range-rq: bytes=61- received: 61 remaining: 9,438,585,695 2022-11-28 21:17:03.363 711ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-11-28 21:17:03.369 6ms http-transport: backing off before retrying http request attempts: 10 backoff time: 2m47.474055154s 2022-11-28 21:19:50.844 3m
http-transport: back-off complete, retrying http request backoff time: 2m47.474055154s 2022-11-28 21:19:50.851 7ms http-transport: sending http request range-rq: bytes=61- received: 61 remaining: 9,438,585,695 2022-11-28 21:19:51.545 694ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-11-28 21:19:51.551 6ms http-transport: backing off before retrying http request attempts: 11 backoff time: 3m52.785093975s 2022-11-28 21:23:44.336 4m
http-transport: back-off complete, retrying http request backoff time: 3m52.785093975s 2022-11-28 21:23:44.343 7ms http-transport: sending http request range-rq: bytes=61- received: 61 remaining: 9,438,585,695 2022-11-28 21:23:45.054 711ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-11-28 21:23:45.060 6ms http-transport: backing off before retrying http request attempts: 12 backoff time: 1m5.747227366s 2022-11-28 21:24:50.808 1m
http-transport: back-off complete, retrying http request backoff time: 1m5.747227366s 2022-11-28 21:24:50.814 6ms http-transport: sending http request range-rq: bytes=61- received: 61 remaining: 9,438,585,695 2022-11-28 21:24:51.071 257ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-11-28 21:24:51.078 7ms http-transport: backing off before retrying http request attempts: 13 backoff time: 10m0s 2022-11-28 21:34:51.078 10m http-transport: back-off complete, retrying http request backoff time: 10m0s 2022-11-28 21:34:51.084 6ms http-transport: sending http request range-rq: bytes=61- received: 61 remaining: 9,438,585,695 2022-11-28 21:34:51.809 725ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-11-28 21:34:51.814 5ms http-transport: backing off before retrying http request attempts: 14 backoff time: 8m35.786727824s 2022-11-28 21:43:27.602 9m
http-transport: back-off complete, retrying http request backoff time: 8m35.786727824s 2022-11-28 21:43:27.608 6ms http-transport: sending http request range-rq: bytes=61- received: 61 remaining: 9,438,585,695 2022-11-28 21:43:28.323 715ms
http-transport: http request error http code: 0 outputErr: error reading from http response stream: unexpected EOF 2022-11-28 21:43:28.330 7ms http-transport: terminating http transfer: exhausted max attempts err: error reading from http response stream: unexpected EOF maxAttempts: 15 2022-11-28 21:43:28.332 2ms deal failed err: data-transfer failed: could not finish transfer even after 15 attempts, lastErr: error reading from http response stream: unexpected EOF 2022-11-28 21:43:28.338 6ms cleaning up deal 2022-11-28 21:43:28.340 2ms deal finished 2022-11-28 21:43:28.347 7ms untagged funds for deal as deal finished err: untagged collateral: 4,581,287,544,491,617 untagged publish: 150,000,000,000,000,000 2022-11-28 21:43:28.355 8ms untagged storage space for deal 2022-11-28 21:43:28.357 2ms finished cleaning up deal 2022-11-28 21:43:28.359 2ms deal go-routine finished execution

Oleg0s commented 1 year ago

Same here Daemon: 1.18.0+mainnet+git.bd10bdf99.dirty+api1.5.0 Local: lotus version 1.18.0+mainnet+git.bd10bdf99.dirty boostd version 1.5.0+git.bbf3029.dirty

image


2022-11-29 04:27:47.915   executing deal proposal received from networkpeer: 12D3KooWCVXs8P7iq6ao4XhfAmKWrEeuKFWCJgqe9jGDMTqHYBjw
2022-11-29 04:27:47.940 25ms processing deal acceptance request
2022-11-29 04:27:49.217 1s tagged funds for dealtagged for deal collateral: 4,581,776,376,566,560tagged for deal publish: 1,000,000,000,000,000,000total available for collateral: 15,561,828,971,517,210,376total available for publish: 170,276,197,497,569,761,711total tagged for collateral: 4,588,119,910,151,833total tagged for publish: 55,000,000,000,000,000,000
2022-11-29 04:27:49.225 8ms created deal download staging filepath: /mnt/nodefs1/boost/incoming/c4a964fb-bc7c-4cd3-a410-230c667acf9d.download
2022-11-29 04:27:49.230 5ms inserted deal into deals DB
2022-11-29 04:27:49.233 3ms deal accepted and scheduled for execution
2022-11-29 04:27:49.233 0ms deal execution initiateddeal state:
2022-11-29 04:27:49.243 10ms size of downloaded filefilepath: /mnt/nodefs1/boost/incoming/c4a964fb-bc7c-4cd3-a410-230c667acf9d.downloadsize: 0
2022-11-29 04:27:49.246 3ms deal execution in progress
2022-11-29 04:27:49.250 4ms deal queued for transfertransfer client id: 5,344,428
2022-11-29 04:28:10.042 21s start deal data transfertransfer client id: 5,344,428
2022-11-29 04:28:10.056 14ms http-transport: execute transferdeal size: 9,438,423,282output file: /mnt/nodefs1/boost/incoming/c4a964fb-bc7c-4cd3-a410-230c667acf9d.downloadtime before context deadline: 47h59m59.99999321s
2022-11-29 04:28:10.058 2ms http-transport: existing file sizedeal size: 9,438,423,282file size: 0
2022-11-29 04:28:10.061 3ms http-transport: libp2p-http urlmultiaddr: /ip4/145.40.93.107/tcp/6747/wspeer id: 12D3KooWBUriTeu6YoJsuSg5gCvEecim9xKdZaW8fE54LUzmWJn7url: libp2p://12D3KooWBUriTeu6YoJsuSg5gCvEecim9xKdZaW8fE54LUzmWJn7
2022-11-29 04:28:10.063 2ms http-transport: started async http transfer
2022-11-29 04:28:10.063 0ms http-transport: sending http requestrange-rq: bytes=0-received: 0remaining: 9,438,423,282
2022-11-29 04:28:10.350 287ms http-transport: http request errorhttp code: 0outputErr: error reading from http response stream: unexpected EOF
jcace commented 1 year ago

Thanks for the reports guys.

I was mistaken - they won't all stop immediately as the algorithm is optimistic (it will try each deal at least one more time, before marking it as failed and then preventing future retries). So, there may be a small number trickle in as the corrupted ones get marked as such, but the spam should be greatly reduced and after a few days should zero out.

NightDog commented 1 year ago

Its been about a month since the last comment, and I am unfortunately still seing about 2/3 of deals from Estuary failing with "unexpected EOF":

Skjermbilde 2023-01-02 kl  12 30 14

Snippet of log from deal ID: 0eb9076b-790a-40e1-a0ab-7aabe1c5d697

2023-01-02 11:17:08.979 12ms http-transport: started async http transfer
2023-01-02 11:17:08.979 0ms http-transport: sending http requestrange-rq: bytes=0-received: 0remaining: 15,277,237,814
2023-01-02 11:17:09.277 298ms http-transport: http request errorhttp code: 0outputErr: error reading from http response stream: unexpected EOF
2023-01-02 11:17:09.289 12ms http-transport: some data was transferred before connection error, so resetting backoff to zerotransferred: 57
2023-01-02 11:17:09.308 19ms http-transport: backing off before retrying http requestattempts: 1backoff time: 5s
2023-01-02 11:17:14.309 5s http-transport: back-off complete, retrying http requestbackoff time: 5s
2023-01-02 11:17:14.360 51ms http-transport: sending http requestrange-rq: bytes=57-received: 57remaining: 15,277,237,757
2023-01-02 11:17:14.644 284ms http-transport: http request errorhttp code: 0outputErr: error reading from http response stream: unexpected EOF
2023-01-02 11:17:14.657 13ms http-transport: backing off before retrying http requestattempts: 2backoff time: 6.979765052s

Anyting you need to troubleshoot from my side?

jcace commented 1 year ago

Hey @NightDog !

We recently completed a big re-architecture of some core Estuary systems, which resulted in deals flowing again. There are still some of these unexpected EOF deals coming through, which are remnants of a corrupted blockstore - they should eventually taper out. cc: @en0ma

On my side I'm also still seeing quite a few of them, but the good news is there are also a lot of successful deals coming through, too.

What's your minimum piece size set to? A lot of these failed deals are larger (14+ GiB), while the newer ones are < 7GiB. So if your min piece size is set too high, you may see a higher proportion of failed deals.

Screenshot 2023-01-02 at 9 47 18 AM
NightDog commented 1 year ago

Hi @jcace,

Yea, it has really gotten a lot better for me as well, funny enough just as I posted the comment on this issue, about 6h ago: Skjermbilde 2023-01-02 kl  18 51 36

Interesting theory regarding min size. I checked the size on the the miner and it is set to 256 B. Seems that 30.1 GiB is going to complete as well knock on wood, so I guess I was just a bit impatient :)

Thank you for following up!

Edit: The 401's are both against libp2p:///ip4/3.134.223.177/tcp/6745/p2p/blablabla While the 3h ago EOF was against libp2p:///ip4/145.40.93.107/tcp/6747/ws/p2p/blablabla if that gives you anything.