ethersphere / bee

Bee is a Swarm client implemented in Go. It’s the basic building block for the Swarm network: a private, decentralized, and self-sustaining network for permissionless publishing and access to your (application) data.
https://www.ethswarm.org
BSD 3-Clause "New" or "Revised" License

Deferred uploads lock in a loop #4633

Open ldeffenb opened 3 months ago

ldeffenb commented 3 months ago

Context

2.0.0

Summary

I'm uploading a lot of files using the /bytes API with the deferred and pin options. In the --verbosity 5 logs, I noticed that the pusher seems to loop over and over on individual chunks. This seems to be related to #4632, which does not appear in my log capture below because it doesn't match the grep for the chunk reference.

Expected behavior

Once a chunk is uploaded and pushed, I would expect it to be happy and quit trying. But I'm guessing that somehow this chunk ended up in the upload queue twice (or more).

Actual behavior

I realize this is a long and verbose set of logs, but it demonstrates the issue. These logs are my enhanced chunk refCnt tracing logs, available at: https://github.com/ldeffenb/bee/tree/84d314d9e84c45171bec3d80765582d75b30bf91

[4ea18e1b0fe582988669279325354dfd86369dcbbdf61fb93744f1f4a655185f.log](https://github.com/ethersphere/bee/files/14854805/4ea18e1b0fe582988669279325354dfd86369dcbbdf61fb93744f1f4a655185f.log)

Steps to reproduce

I'm guessing you just need to do lots of deferred pinned uploads with /bytes, but certainly my OSM data uploader is causing it.
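For anyone trying to reproduce, a single deferred pinned upload to /bytes could be built roughly like this. This is a minimal sketch, not the OSM uploader itself: the helper name `newDeferredPinnedUpload`, the API address `http://localhost:1633`, and the placeholder batch ID are assumptions, and the `Swarm-Deferred-Upload`, `Swarm-Pin`, and `Swarm-Postage-Batch-Id` header names should be checked against the Bee API reference for the version under test.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// newDeferredPinnedUpload is a hypothetical helper that builds (but does not
// send) a POST /bytes request asking for a deferred, pinned upload.
// The header names are assumed from the Bee API documentation.
func newDeferredPinnedUpload(apiURL, batchID string, data []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, apiURL+"/bytes", bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/octet-stream")
	req.Header.Set("Swarm-Postage-Batch-Id", batchID)
	req.Header.Set("Swarm-Deferred-Upload", "true") // let the pusher sync in the background
	req.Header.Set("Swarm-Pin", "true")             // keep the content pinned locally
	return req, nil
}

func main() {
	// The address and batch ID below are placeholders, not values from this issue.
	req, err := newDeferredPinnedUpload("http://localhost:1633", "<postage-batch-id>", []byte("hello"))
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	// To actually upload, pass req to http.DefaultClient.Do and close the response body.
}
```

Running many of these in a loop over different payloads approximates "lots of deferred pinned uploads", though whether that alone triggers the looping is unconfirmed.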

Possible solution

Need to determine whether there's a way for a single chunk to end up in the upload/pusher queue more than once. That would definitely cause what I'm seeing: the first successful upload deletes the uploadItem, and subsequent pushes of the same chunk then fail to do the Report(), causing them to continue looping forever.
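The suspected failure mode can be illustrated with a toy model. This is only a sketch of the hypothesis, not Bee's actual pusher or storage code: `uploadStore`, `drain`, and `errNotFound` are hypothetical names, and the retry-on-failed-Report behavior is an assumption taken from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for whatever error the real store returns when the
// upload item no longer exists (hypothetical).
var errNotFound = errors.New("upload item not found")

// uploadStore is a hypothetical stand-in for the upload bookkeeping.
type uploadStore struct {
	items map[string]bool // chunk address -> upload item still exists
}

// Report marks a chunk as synced by deleting its upload item.
// It fails if the item was already deleted by an earlier Report.
func (s *uploadStore) Report(addr string) error {
	if !s.items[addr] {
		return errNotFound
	}
	delete(s.items, addr)
	return nil
}

// drain simulates the pusher: every queue entry is pushed (always
// successfully here) and then reported. An entry whose Report fails is
// re-queued. maxPushes bounds the simulation so it terminates.
func drain(s *uploadStore, queue []string, maxPushes int) (pushes, reportErrs int) {
	for len(queue) > 0 && pushes < maxPushes {
		addr := queue[0]
		queue = queue[1:]
		pushes++ // push succeeded and a receipt came back
		if err := s.Report(addr); err != nil {
			reportErrs++
			queue = append(queue, addr) // assumed: a failed Report leads to a retry
		}
	}
	return pushes, reportErrs
}

func main() {
	s := &uploadStore{items: map[string]bool{"4ea18e1b": true}}
	// The same chunk is queued twice; the second entry can never be reported.
	pushes, errs := drain(s, []string{"4ea18e1b", "4ea18e1b"}, 10)
	fmt.Println(pushes, errs) // prints "10 9"
}
```

With a single queue entry the loop stops after one push and no errors. With the duplicate, every push after the first ends in a failed Report, so only the `maxPushes` bound stops the simulation.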

ldeffenb commented 3 months ago

This also shows up in the metrics: the push rate and push error rate skyrocket, and the receipt depth quits incrementing, since Report() is failing even though a receipt was received. [image]