ethersphere / bee

Bee is a Swarm client implemented in Go. It’s the basic building block for the Swarm network: a private, decentralized, and self-sustaining network for permissionless publishing and access to your (application) data.
https://www.ethswarm.org
BSD 3-Clause "New" or "Revised" License

Overzealous (and constant) single chunk redundancy #4705

Closed ldeffenb closed 2 months ago

ldeffenb commented 3 months ago

Context

2.1.0

Summary

I'm trying to evaluate the impact of enabling redundancy for the next upload of the OSM dataset, which is very heavy with single-chunk mantaray nodes. But according to my trace logs, specifying ANY redundancy level other than zero results in 17 chunks being stamped and pushed into the swarm instead of the original one. This is WAY too expensive for a dataset that has 26,309,262 mantaray node chunks indexing 22,481,003 individual PNG files. That 26 million would be multiplied by 17!

Expected behavior

I expected the different redundancy levels to generate different numbers of chunks for this 128-byte /bytes upload, but every non-zero level generates the same 17 chunks, 16 of which are apparently generated by the redundancy code.

Actual behavior

Using this file, which is a binary copy of a mantaray node (you need to unzip it): d50c26a504619838ff1fd1cc86ed69b9b6b380f484b3a0444015df500322f4c6.zip
And this command to do the upload:

curl -X POST -H "swarm-deferred-upload: false" -H "swarm-postage-batch-id: {putYourBatchHere}" -H "swarm-redundancy-level: 0" -q --data-binary @d50c26a504619838ff1fd1cc86ed69b9b6b380f484b3a0444015df500322f4c6.bytes http://localhost:1633/bytes

As expected, I get the following response:
{"reference":"d50c26a504619838ff1fd1cc86ed69b9b6b380f484b3a0444015df500322f4c6"}
My debug logs (see Steps to reproduce) show me the single chunk.

"msg"="putTrace: Put" "chunk"="d50c26a504619838ff1fd1cc86ed69b9b6b380f484b3a0444015df500322f4c6"
"msg"="putTrace: Done" "ref"="d50c26a504619838ff1fd1cc86ed69b9b6b380f484b3a0444015df500322f4c6"

But if I set the swarm-redundancy-level to any value from 1 to 4:

curl -X POST -H "swarm-deferred-upload: false" -H "swarm-postage-batch-id: {putYourBatchHere}" -H "swarm-redundancy-level: 1" -q --data-binary @d50c26a504619838ff1fd1cc86ed69b9b6b380f484b3a0444015df500322f4c6.bytes http://localhost:1633/bytes
curl -X POST -H "swarm-deferred-upload: false" -H "swarm-postage-batch-id: {putYourBatchHere}" -H "swarm-redundancy-level: 2" -q --data-binary @d50c26a504619838ff1fd1cc86ed69b9b6b380f484b3a0444015df500322f4c6.bytes http://localhost:1633/bytes
curl -X POST -H "swarm-deferred-upload: false" -H "swarm-postage-batch-id: {putYourBatchHere}" -H "swarm-redundancy-level: 3" -q --data-binary @d50c26a504619838ff1fd1cc86ed69b9b6b380f484b3a0444015df500322f4c6.bytes http://localhost:1633/bytes
curl -X POST -H "swarm-deferred-upload: false" -H "swarm-postage-batch-id: {putYourBatchHere}" -H "swarm-redundancy-level: 4" -q --data-binary @d50c26a504619838ff1fd1cc86ed69b9b6b380f484b3a0444015df500322f4c6.bytes http://localhost:1633/bytes

I get the following logs. Note that the order of the chunks being logged varies, but it is always the same additional 16 chunks.

"msg"="putTrace: Put" "chunk"="d50c26a504619838ff1fd1cc86ed69b9b6b380f484b3a0444015df500322f4c6"
"msg"="putTrace: Put" "chunk"="d50c26a504619838ff1fd1cc86ed69b9b6b380f484b3a0444015df500322f4c6"
"msg"="putTrace: Put" "chunk"="92c8d84741368ed08b3d9df4a1924c93fd7c5c71c92bfad25ab462f31f285ed2"
"msg"="putTrace: Put" "chunk"="33f4de243b13d27b7e1a9cbf302b21e282a7813ff5cd7a8711fcf7f8a4078bcb"
"msg"="putTrace: Put" "chunk"="fbb12554ff101f5f71e7de7da03a69396843ede907da93b49d65b66736162fcc"
"msg"="putTrace: Put" "chunk"="d2b1cff38b6efcc3f336e2fc898fc40ac8d5b0290f614cd8046eaaa2039e8c9e"
"msg"="putTrace: Put" "chunk"="ab0976831ccb2a92666cc5670a3fc48353ebb3c3e534fd493bee9bfb20ef0c9c"
"msg"="putTrace: Put" "chunk"="6aa9bb4def9e3847d19edfa502a4a2eb4f4ab203a446d3de744ca29b0abd337e"
"msg"="putTrace: Put" "chunk"="1cee43ae46aee79e31246d6bbbb4b19a8f0d8cf8e398c1e63da900b19acf10bf"
"msg"="putTrace: Put" "chunk"="e053b743974bf28d9eeedd501f9ddac62b5ec504a3d653e2b711bafe91412386"
"msg"="putTrace: Put" "chunk"="54926a66b4a10f79ccc1df26bad7b3cc5db15ab9c11879fd536ec1be0dd3542c"
"msg"="putTrace: Put" "chunk"="83e7774ad50864e5eecf6e2981eebd6fcf38b4b50b6548c253d3cdd9d4e8ffbe"
"msg"="putTrace: Put" "chunk"="0104024d50e391823a2bd7ba19720cb6b1f531abacf570e03ac6ff91aefb648c"
"msg"="putTrace: Put" "chunk"="23a1105fb7de6953c835b7015c8494c8d1ea035aaebe2b60b3c8235f5564cca6"
"msg"="putTrace: Put" "chunk"="bdfafec94241c8f699c1bcfa27a34fef3a6b58d44bc6c8d3508d090ad3669051"
"msg"="putTrace: Put" "chunk"="77a6ee451130cb00d6aff52f339f99d95af40903eb8f665ca04114a7eedfa10f"
"msg"="putTrace: Put" "chunk"="48112bdca4e2ade9b0ef54f3f83d3a778ad3173ef68a5da210e512d6fe5db775"
"msg"="putTrace: Put" "chunk"="c849e53e7bf37f8770dfee71fcca69348eb9372ce16eea8ac3cfff5819acf58f"
"msg"="putTrace: Done" "ref"="d50c26a504619838ff1fd1cc86ed69b9b6b380f484b3a0444015df500322f4c6"

I also find it strange that the root chunk reference goes through the Put() method twice. Seems a bit redundant if you ask me.

Steps to reproduce

Add the putter logs found in my hacked version, then execute the previous upload commands. You'll see the generated chunks that are being pushed into the swarm.
https://github.com/ldeffenb/bee/blob/f73bf0897dd3c401751241c5aab3a1819a5bb9d3/pkg/api/api.go#L792
https://github.com/ldeffenb/bee/blob/f73bf0897dd3c401751241c5aab3a1819a5bb9d3/pkg/api/api.go#L801

Possible solution

I really don't know, but I suspect this hard-coded constant has something to do with it. Note that this applies for any non-zero level, although I cannot explain why I get 16 chunks when this constant is 8. https://github.com/ethersphere/bee/blob/8c61408db7cd63227cfe161967e3b70f90b9fde3/pkg/file/redundancy/redundancy.go#L58-L60

Well, I don't know what that 8 is, but the actual issue is described in my comment stream below. The fix for the /bytes API is: https://github.com/ldeffenb/bee/blob/1c37897ff8b10890d18b624b0a9ed5d4757c972a/pkg/api/bytes.go#L48-L49 The /bzz API probably needs a similar fix.

ldeffenb commented 3 months ago

It'd be really nice if this array were actually used to generate the redundant root chunks by level: https://github.com/ethersphere/bee/blob/8c61408db7cd63227cfe161967e3b70f90b9fde3/pkg/file/redundancy/level.go#L166 But is this actually a redundant declaration of something that should match? https://github.com/ethersphere/bee/blob/8c61408db7cd63227cfe161967e3b70f90b9fde3/pkg/replicas/replicas.go#L117

ldeffenb commented 3 months ago

I think I just found the code that explains why the redundancy chunks aren't always in the same order. https://github.com/ethersphere/bee/blob/8c61408db7cd63227cfe161967e3b70f90b9fde3/pkg/replicas/putter.go#L42-L53

ldeffenb commented 3 months ago

And maybe the issue is that the redundancy level just isn't getting into the context?
https://github.com/ethersphere/bee/blob/8c61408db7cd63227cfe161967e3b70f90b9fde3/pkg/replicas/putter.go#L33-L39
And now that I see THIS, I'm pretty sure that's the underlying issue:
https://github.com/ethersphere/bee/blob/8c61408db7cd63227cfe161967e3b70f90b9fde3/pkg/file/redundancy/level.go#L176-L182
And I don't see anywhere but tests that actually invoke the set (other than to set it to redundancy.NONE):
https://github.com/ethersphere/bee/blob/8c61408db7cd63227cfe161967e3b70f90b9fde3/pkg/file/redundancy/level.go#L171-L173

ldeffenb commented 3 months ago

Ok, that was definitely the issue. I patched my hacked version with the following code and now the swarm-redundancy-level value is being honored. Similar code likely needs to be added to api/bzz.go as well, but I don't use that API.
https://github.com/ldeffenb/bee/blob/1c37897ff8b10890d18b624b0a9ed5d4757c972a/pkg/api/bytes.go#L48-L49
These are the logs for 0, 1, 2, 3. The logs for 4 are shown above.

"msg"="putTrace: bytesUploadHandler: SetLevelInContext" "header"=0 "GetLevel"=0
"msg"="putTrace: bytesUploadHandler" "rlevel"=0
"msg"="putTrace: Put" "chunk"="d50c26a504619838ff1fd1cc86ed69b9b6b380f484b3a0444015df500322f4c6"
"msg"="putTrace: Done" "ref"="d50c26a504619838ff1fd1cc86ed69b9b6b380f484b3a0444015df500322f4c6"

"msg"="putTrace: bytesUploadHandler: SetLevelInContext" "header"=1 "GetLevel"=1
"msg"="putTrace: bytesUploadHandler" "rlevel"=1
"msg"="putTrace: Put" "chunk"="d50c26a504619838ff1fd1cc86ed69b9b6b380f484b3a0444015df500322f4c6"
"msg"="putTrace: Put" "chunk"="d50c26a504619838ff1fd1cc86ed69b9b6b380f484b3a0444015df500322f4c6"
"msg"="putTrace: Put" "chunk"="54926a66b4a10f79ccc1df26bad7b3cc5db15ab9c11879fd536ec1be0dd3542c"
"msg"="putTrace: Put" "chunk"="d2b1cff38b6efcc3f336e2fc898fc40ac8d5b0290f614cd8046eaaa2039e8c9e"
"msg"="putTrace: Done" "ref"="d50c26a504619838ff1fd1cc86ed69b9b6b380f484b3a0444015df500322f4c6"

"msg"="putTrace: bytesUploadHandler: SetLevelInContext" "header"=2 "GetLevel"=2
"msg"="putTrace: bytesUploadHandler" "rlevel"=2
"msg"="putTrace: Put" "chunk"="d50c26a504619838ff1fd1cc86ed69b9b6b380f484b3a0444015df500322f4c6"
"msg"="putTrace: Put" "chunk"="d50c26a504619838ff1fd1cc86ed69b9b6b380f484b3a0444015df500322f4c6"
"msg"="putTrace: Put" "chunk"="33f4de243b13d27b7e1a9cbf302b21e282a7813ff5cd7a8711fcf7f8a4078bcb"
"msg"="putTrace: Put" "chunk"="83e7774ad50864e5eecf6e2981eebd6fcf38b4b50b6548c253d3cdd9d4e8ffbe"
"msg"="putTrace: Put" "chunk"="d2b1cff38b6efcc3f336e2fc898fc40ac8d5b0290f614cd8046eaaa2039e8c9e"
"msg"="putTrace: Put" "chunk"="54926a66b4a10f79ccc1df26bad7b3cc5db15ab9c11879fd536ec1be0dd3542c"
"msg"="putTrace: Done" "ref"="d50c26a504619838ff1fd1cc86ed69b9b6b380f484b3a0444015df500322f4c6"

"msg"="putTrace: bytesUploadHandler: SetLevelInContext" "header"=3 "GetLevel"=3
"msg"="putTrace: bytesUploadHandler" "rlevel"=3
"msg"="putTrace: Put" "chunk"="d50c26a504619838ff1fd1cc86ed69b9b6b380f484b3a0444015df500322f4c6"
"msg"="putTrace: Put" "chunk"="d50c26a504619838ff1fd1cc86ed69b9b6b380f484b3a0444015df500322f4c6"
"msg"="putTrace: Put" "chunk"="0104024d50e391823a2bd7ba19720cb6b1f531abacf570e03ac6ff91aefb648c"
"msg"="putTrace: Put" "chunk"="83e7774ad50864e5eecf6e2981eebd6fcf38b4b50b6548c253d3cdd9d4e8ffbe"
"msg"="putTrace: Put" "chunk"="33f4de243b13d27b7e1a9cbf302b21e282a7813ff5cd7a8711fcf7f8a4078bcb"
"msg"="putTrace: Put" "chunk"="6aa9bb4def9e3847d19edfa502a4a2eb4f4ab203a446d3de744ca29b0abd337e"
"msg"="putTrace: Put" "chunk"="ab0976831ccb2a92666cc5670a3fc48353ebb3c3e534fd493bee9bfb20ef0c9c"
"msg"="putTrace: Put" "chunk"="fbb12554ff101f5f71e7de7da03a69396843ede907da93b49d65b66736162fcc"
"msg"="putTrace: Put" "chunk"="d2b1cff38b6efcc3f336e2fc898fc40ac8d5b0290f614cd8046eaaa2039e8c9e"
"msg"="putTrace: Put" "chunk"="54926a66b4a10f79ccc1df26bad7b3cc5db15ab9c11879fd536ec1be0dd3542c"
"msg"="putTrace: Done" "ref"="d50c26a504619838ff1fd1cc86ed69b9b6b380f484b3a0444015df500322f4c6"

Note that there's still an issue with the original root chunk being Put twice for non-zero redundancy levels.
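Going by the traces in this thread, the number of extra chunks per single-chunk upload is 2, 4, 8 and 16 for levels 1 through 4. The sketch below turns that into a cost estimate for the 26,309,262 mantaray node chunks mentioned in the summary. `replicasFor` and `stampedChunks` are hypothetical helpers, the doubling rule is read off the logs rather than taken from the code, and the duplicate Put of the root chunk is not counted.

```go
package main

import "fmt"

// replicasFor returns the number of extra chunks observed in the traces for a
// single-chunk upload: none at level 0, then 2, 4, 8, 16 for levels 1 to 4.
func replicasFor(level uint) int64 {
	if level == 0 {
		return 0
	}
	return int64(1) << level
}

// stampedChunks estimates how many chunks get stamped when n single-chunk
// uploads are each expanded into the original plus its replicas.
func stampedChunks(n int64, level uint) int64 {
	return n * (1 + replicasFor(level))
}

func main() {
	const nodes int64 = 26_309_262 // mantaray node chunks in the OSM dataset

	for level := uint(0); level <= 4; level++ {
		fmt.Printf("level %d: %d replicas per chunk, %d chunks total\n",
			level, replicasFor(level), stampedChunks(nodes, level))
	}
}
```

Under these assumptions level 4 works out to 447,257,454 stamped chunks for the node chunks alone, while level 1 is 78,927,786.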