ethersphere / bee

Bee is a Swarm client implemented in Go. It's the basic building block for the Swarm network: a private, decentralized, and self-sustaining network for permissionless publishing and access to your (application) data.
https://www.ethswarm.org
BSD 3-Clause "New" or "Revised" License

Stamp utilization inconsistency #4292

Open tamas6 opened 1 year ago

tamas6 commented 1 year ago

Context

Version 1.17.3; swarm-cli and Swarm Desktop on macOS

Summary

Describing the symptoms: I'm able to use one of my six-month-old stamps multiple times, but the stamp I bought yesterday (because I had heard this problem was recently solved) became 100% utilized after a single small upload. As a result, swarm-cli is not able to upload with it. It says: Usage: 100%, Remaining Capacity: 0 B

Expected behavior

I expect the same stamp behaviour even with a different batch depth.

Actual behavior

First is the old stamp, second is the new:

```json
{
  "stamps": [
    {
      "batchID": "96da9828ea47867a3a459c50e2a3a871f8b2df31d1f100494e31c88dc82beb9b",
      "utilization": 2,
      "usable": true,
      "label": "recovered",
      "depth": 18,
      "amount": "60000000000",
      "bucketDepth": 16,
      "blockNumber": 27757394,
      "immutableFlag": false,
      "exists": true,
      "batchTTL": 2098120,
      "expired": false
    },
    {
      "batchID": "d1e7221b243645593908a39e4f8a884c59c715307a8b9422c6c778d93a02ff2b",
      "utilization": 2,
      "usable": true,
      "label": "no label",
      "depth": 17,
      "amount": "80000000000",
      "bucketDepth": 16,
      "blockNumber": 29828488,
      "immutableFlag": false,
      "exists": true,
      "batchTTL": 16620256,
      "expired": false
    }
  ]
}
```
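For reference, this listing can be reproduced programmatically (a sketch, assuming Bee's debug API on the default localhost:1635, whose GET /stamps endpoint returns the same `{"stamps": [...]}` document; the bucket-fill interpretation of `utilization` in the comment is an assumption):

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

type stamp struct {
	BatchID     string `json:"batchID"`
	Utilization int    `json:"utilization"`
	Usable      bool   `json:"usable"`
	Depth       int    `json:"depth"`
	BucketDepth int    `json:"bucketDepth"`
}

func main() {
	resp, err := http.Get("http://localhost:1635/stamps")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var out struct {
		Stamps []stamp `json:"stamps"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		log.Fatal(err)
	}
	for _, s := range out.Stamps {
		// Each bucket holds 2^(depth-bucketDepth) chunk slots, so a
		// utilization equal to that number would mean a full bucket.
		slots := 1 << (s.Depth - s.BucketDepth)
		fmt.Printf("%s… depth=%d bucket fill=%d/%d usable=%v\n",
			s.BatchID[:8], s.Depth, s.Utilization, slots, s.Usable)
	}
}
```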

A depth-17 stamp can only hold 2 chunks (2^(17−16)) in any given bucket, so it may hit 100% utilization with just a few chunks.
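For intuition, here is a back-of-the-envelope sketch. It assumes the `utilization` field above reports the fullest bucket's chunk count, which would give 2/2^(18−16) = 50% for the old depth-18 stamp and 2/2^(17−16) = 100% for the new depth-17 one. Chunk addresses are uniformly distributed, so filling one 2-slot bucket is a birthday problem:

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	const (
		bucketDepth = 16
		buckets     = 1 << bucketDepth // 2^16 buckets per batch
		chunkSize   = 4096             // bytes per Swarm chunk
	)
	depth := 17
	slotsPerBucket := 1 << (depth - bucketDepth) // 2 chunk slots at depth 17

	chunks := float64(1_800_000 / chunkSize) // ~439 chunks in a 1.8 MB tar
	// Birthday bound: probability that at least one bucket receives
	// two of the uploaded chunks.
	pCollision := 1 - math.Exp(-chunks*chunks/(2*buckets))

	fmt.Printf("chunk slots per bucket at depth %d: %d\n", depth, slotsPerBucket)
	fmt.Printf("P(some bucket receives 2 chunks): %.0f%%\n", pCollision*100)
}
```

With roughly 440 chunks spread over 2^16 buckets, the chance that some bucket receives two of them comes out around 77%, which is consistent with a single small upload pushing a depth-17 stamp straight to 100%.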

I inspected the buckets and these are the results:

Steps to reproduce

I don't really know. Buy a stamp with the same details as above, then after 5-6 months buy one of the same size, differing only in its depth of 17, and test 🥸

Other

The first sign of the once-0%, suddenly-100% stamp came from swarm-cli, after uploading a 1.8 MB .tar file. I then checked the logs and proceeded to upload with the old stamp I had; that upload had no effect on its utilization. The new stamp, however, is not performing and cannot be used to upload with swarm-cli.

istae commented 1 year ago

The utilization problem is a known one.

We've considered some of the options below:

  1. encryption to improve utilization - it was discovered this does NOT bring any improvements #4102
  2. require a min batch depth that should give at least 50% utilization
  3. improve documentation so that users choose to create large depth batches instead of shallow ones
ldeffenb commented 1 year ago

> 1. encryption to improve utilization - it was discovered this does bring any improvements uploads should be encrypted by default #4102

Is that "does" or "does not"?

tmm360 commented 1 year ago

Why not use a "dumb byte"?

Chunks could include a byte whose only purpose is to avoid bin saturation within a postage batch. It is always excluded from the payload, but when the hash is calculated, its value is summed with the payload itself. Summing the byte with the payload produces a completely different hash, and lets us test another bin, hopefully one with space in it. The formula would be hash(payload + dumbByte).

During upload, each chunk is checked for postage batch saturation. When a full bin is found, the dumb byte can be incremented by 1 and the hash recalculated. This generates a different hash, which can be tested against a new bin. If this bin is also full, repeat. This amounts to deterministic hash mining, with at most 256 attempts per chunk. The fuller the postage batch, the more time may be spent trying to fill it. If no valid byte is found after 256 attempts, the upload can fail.

Summing the value with the payload preserves backward compatibility: old chunks would have the dumb byte at 0, so hash(payload) == hash(payload + 0), while new versions are able to manage chunks with a dumb byte != 0.
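A minimal sketch of this proposal follows. It is not Bee's actual chunk hashing (that is the BMT hash; SHA-256 stands in here), `bucketFull` is a hypothetical callback, and "summing" is read here as adding the byte to the payload's last byte, which preserves the hash(payload) == hash(payload + 0) property:

```go
// Package dumbbyte sketches the proposed dumb-byte mining loop.
package dumbbyte

import (
	"crypto/sha256"
	"encoding/binary"
	"errors"
)

const bucketDepth = 16 // a postage batch has 2^16 buckets

// hashWithDumbByte sums the dumb byte with the payload before hashing.
// A dumb byte of 0 leaves the payload, and hence the hash, unchanged.
func hashWithDumbByte(payload []byte, dumb byte) [32]byte {
	p := make([]byte, len(payload))
	copy(p, payload)
	if len(p) > 0 {
		p[len(p)-1] += dumb // wraps mod 256
	}
	return sha256.Sum256(p)
}

// bucketOf maps an address to its postage bucket by its prefix bits.
func bucketOf(addr [32]byte) uint32 {
	return binary.BigEndian.Uint32(addr[:4]) >> (32 - bucketDepth)
}

// MineDumbByte increments the dumb byte until the resulting address
// lands in a non-full bucket, failing once all 256 values are exhausted.
func MineDumbByte(payload []byte, bucketFull func(uint32) bool) (byte, [32]byte, error) {
	for b := 0; b < 256; b++ {
		addr := hashWithDumbByte(payload, byte(b))
		if !bucketFull(bucketOf(addr)) {
			return byte(b), addr, nil
		}
	}
	return 0, [32]byte{}, errors.New("all 256 dumb-byte values map to full buckets")
}
```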

Obviously this would be a breaking change, and a migration would be required. Still, for this specific use case, I think avoiding the stochastic chunk distribution would be a great improvement.

zelig commented 1 year ago

The problem this idea targets can be solved in three other possible ways