Closed by willscott 1 year ago
Thanks Will!
Deployed 20d3d61 from this PR in bifrost-gateway 2023-02-17-3e0550b
to bifrost-bank1-ny and so far so good.
Let's wait till tomorrow to confirm the panic is truly gone.
@willscott @aarshkshah1992 this PR removes the panic, but we now see a different problem:
bifrost-bank1-ny:/data# curl http://127.0.0.1:8080/ipfs/bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  116k  100  116k    0     0  1228k      0 --:--:-- --:--:-- --:--:-- 1231k
It runs for a few minutes, but then the entire binary dies after roughly 4k requests:
bifrost-bank1-ny:/data# curl http://127.0.0.1:8041/debug/metrics/prometheus -s | grep caboose_fetch_err
# HELP ipfs_caboose_fetch_errors Errors fetching from Caboose Peers
# TYPE ipfs_caboose_fetch_errors counter
ipfs_caboose_fetch_errors{code="0"} 3563
ipfs_caboose_fetch_errors{code="200"} 369
bifrost-bank1-ny:/data# curl http://127.0.0.1:8080/ipfs/bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (56) Recv failure: Connection reset by peer
If there is no easy fix, I'll have to revert the caboose updates before EOD and run with the old version.
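For anyone reproducing this, the counter output above can be condensed into a per-code breakdown, which makes it easier to compare snapshots taken before and after the gateway stops responding. The following is a minimal sketch, not part of bifrost-gateway or caboose: the function name `caboose_error_summary` is hypothetical, and it only assumes the metric lines look exactly like the `ipfs_caboose_fetch_errors{code="..."} N` lines shown above.

```shell
# Hypothetical helper (not part of bifrost-gateway or caboose):
# reads Prometheus text-format metrics on stdin and prints, for each
# code label of ipfs_caboose_fetch_errors, its count and share of the total.
caboose_error_summary() {
  awk '
    /^ipfs_caboose_fetch_errors\{/ {
      # Pull the value out of the code="..." label.
      if (match($0, /code="[^"]*"/)) {
        code = substr($0, RSTART + 6, RLENGTH - 7)
        counts[code] += $NF   # the last field is the counter value
        total += $NF
      }
    }
    END {
      if (total > 0) {
        for (c in counts) {
          printf "code=%s count=%d share=%.1f%%\n", c, counts[c], 100 * counts[c] / total
        }
      }
    }
  '
}
```

It could be fed from the same endpoint used above, for example `curl -s http://127.0.0.1:8041/debug/metrics/prometheus | caboose_error_summary`. With the numbers from this comment, that would report 3563 entries under code 0 (about 90.6%) and 369 under code 200 (about 9.4%). The output order of the lines is not guaranteed, so pipe through `sort` if a stable order matters.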
@lidel Sorry, what exactly is happening here? What do you mean by "the binary dies"? Does it just freeze and stop serving Bifrost requests? Can you post the Bifrost logs here?
Let's merge this (panic is gone) and investigate this hang as part of https://github.com/ipfs/bifrost-gateway/issues/41. Having more than 3 layers of PRs against PRs is getting insane ;)
Closes #29