storj-archived / bridge

Deprecated. Access the Storj network via simple REST API.
https://storj.io
GNU Affero General Public License v3.0
77 stars 44 forks source link

Flood of "handling storage retrieve request from" #562

Closed Th3Van closed 6 years ago

Th3Van commented 6 years ago

Expected Behavior

Please describe the program's expected behavior. Include an example of your usage code in the back ticks below if applicable.

{"level":"info","message":"handling alloc request from 028ac2917f3b9928e2bc43c2a6956c5289973171 hash 75f3482c7c9ab0f9a321f85d211b1ff4702fe501 size 64000001","timestamp":"2018-02-04T13:21:06.703Z"} {"level":"info","message":"Sending alloc response hash 75f3482c7c9ab0f9a321f85d211b1ff4702fe501 size 64000001","timestamp":"2018-02-04T13:21:06.724Z"} {"level":"info","message":"Shard upload completed hash 75f3482c7c9ab0f9a321f85d211b1ff4702fe501 size 64000001","timestamp":"2018-02-04T13:21:47.515Z"} {"level":"debug","message":"exchange report submitted: {\"dataHash\":\"75f3482c7c9ab0f9a321f85d211b1ff4702fe501\",\"reporterId\":\"339fbcd602ce2dedab1b30bac77d16b2b48d2976\",\"farmerId\":\"339fbcd602ce2dedab1b30bac77d16b2b48d2976\",\"exchangeStart\":1517750470606,\"exchangeEnd\":1517750507516,\"exchangeResultCode\":1000,\"exchangeResultMessage\":\"SHARD_UPLOADED\"}","timestamp":"2018-02-04T13:21:47.606Z"} {"level":"info","message":"handling storage retrieve request from 2a9db1692efef0fea32f47dc6e9c50588d4f1e6e hash 75f3482c7c9ab0f9a321f85d211b1ff4702fe501","timestamp":"2018-02-04T13:21:47.811Z"} {"level":"info","message":"handling storage retrieve request from 2b6795c0aeb9e0b30149a7ce72809a0ad2342a17 hash 75f3482c7c9ab0f9a321f85d211b1ff4702fe501","timestamp":"2018-02-04T13:21:48.314Z"} {"level":"info","message":"Mirror download completed hash 75f3482c7c9ab0f9a321f85d211b1ff4702fe501 size 64000001","timestamp":"2018-02-04T13:23:07.003Z"} {"level":"debug","message":"exchange report submitted: {\"dataHash\":\"75f3482c7c9ab0f9a321f85d211b1ff4702fe501\",\"reporterId\":\"339fbcd602ce2dedab1b30bac77d16b2b48d2976\",\"farmerId\":\"339fbcd602ce2dedab1b30bac77d16b2b48d2976\",\"exchangeStart\":1517750508856,\"exchangeEnd\":1517750587003,\"exchangeResultCode\":1000,\"exchangeResultMessage\":\"SHARD_DOWNLOADED\"}","timestamp":"2018-02-04T13:23:07.120Z"} {"level":"info","message":"Mirror download completed hash 75f3482c7c9ab0f9a321f85d211b1ff4702fe501 size 64000001","timestamp":"2018-02-04T13:23:07.132Z"} {"level":"debug","message":"exchange report submitted: {\"dataHash\":\"75f3482c7c9ab0f9a321f85d211b1ff4702fe501\",\"reporterId\":\"339fbcd602ce2dedab1b30bac77d16b2b48d2976\",\"farmerId\":\"339fbcd602ce2dedab1b30bac77d16b2b48d2976\",\"exchangeStart\":1517750508463,\"exchangeEnd\":1517750587132,\"exchangeResultCode\":1000,\"exchangeResultMessage\":\"SHARD_DOWNLOADED\"}","timestamp":"2018-02-04T13:23:07.219Z"}


### Actual Behavior

Please describe the program's actual behavior. Please include any stack traces
or log output in the back ticks below.

Farmer sends a `ALLOC` response to the bridge, but never receives the shard.
For reasons unknown, the bridge tries to mirror the the non-existing ghost shard to other farmers, resulting in a endless loop of storage retrieve request from the bridge. I've seen this happening quite often, and sometimes it can go on for days.

- Farmer handles a `ALLOC` request.
- Farmer sends `ALLOC` response to bridge.
- Farmer never receives the shard...
- Farmer receives a ton of `storage retrieve request` for mirroring the "ghost" shard to two other farmers.

root@server031-storj-dk:/disk001/storjshare/logs# grep -h "9a96cc000d94346f79e1eda77b9980cb0e4fbb99" 339fbcd602ce2dedab1b30bac77d16b2b48d2976_2018--.log

{"level":"info","message":"handling alloc request from 19632fadaefbca94d6ba79a668618a63c1516483 hash 9a96cc000d94346f79e1eda77b9980cb0e4fbb99 size 5241856","timestamp":"2018-01-29T23:44:44.227Z"} {"level":"info","message":"Sending alloc response hash 9a96cc000d94346f79e1eda77b9980cb0e4fbb99 size 5241856","timestamp":"2018-01-29T23:44:44.254Z"} {"level":"info","message":"handling storage retrieve request from 355915ad98635a45e90e78a4106f38ebc0003a80 hash 9a96cc000d94346f79e1eda77b9980cb0e4fbb99","timestamp":"2018-01-30T06:53:55.749Z"} {"level":"info","message":"handling storage retrieve request from 382d1b5ffb2e74a5cd85ea37b1c2c642a914f341 hash 9a96cc000d94346f79e1eda77b9980cb0e4fbb99","timestamp":"2018-01-30T06:54:21.486Z"} {"level":"info","message":"handling storage retrieve request from 29f8d40a74a6f0054afe3ed9323a22ca15c9ea88 hash 9a96cc000d94346f79e1eda77b9980cb0e4fbb99","timestamp":"2018-01-30T06:54:50.198Z"} {"level":"info","message":"handling storage retrieve request from 35e18bc7d761b8793b97fb9048279d2cfdb4ab26 hash 9a96cc000d94346f79e1eda77b9980cb0e4fbb99","timestamp":"2018-01-30T07:03:18.012Z"} {"level":"info","message":"handling storage retrieve request from 3315a53855115eebff8414a03457ecb33a6b99d7 hash 9a96cc000d94346f79e1eda77b9980cb0e4fbb99","timestamp":"2018-01-30T07:04:08.098Z"} {"level":"info","message":"handling storage retrieve request from 2c61a46052bf1846f2b853f332e2853e6b3bf343 hash 9a96cc000d94346f79e1eda77b9980cb0e4fbb99","timestamp":"2018-01-30T07:04:56.884Z"} {"level":"info","message":"handling storage retrieve request from 355915ad98635a45e90e78a4106f38ebc0003a80 hash 9a96cc000d94346f79e1eda77b9980cb0e4fbb99","timestamp":"2018-01-30T07:07:00.257Z"} {"level":"info","message":"handling storage retrieve request from 3d34ec91dacf43c1cdcbf90d8c15cb36000c34bd hash 9a96cc000d94346f79e1eda77b9980cb0e4fbb99","timestamp":"2018-01-30T07:08:17.021Z"} {"level":"info","message":"handling storage retrieve request from 32cb2c04e213ba6d8e1ee6496a9328f8438409fc hash 9a96cc000d94346f79e1eda77b9980cb0e4fbb99","timestamp":"2018-01-30T07:08:39.099Z"}

+3500 requests and four days later :

{"level":"info","message":"handling storage retrieve request from 3439d79e19b26af8e9bab492d161f45c48beca11 hash 9a96cc000d94346f79e1eda77b9980cb0e4fbb99","timestamp":"2018-02-02T03:46:21.314Z"} {"level":"info","message":"handling storage retrieve request from 3315a53855115eebff8414a03457ecb33a6b99d7 hash 9a96cc000d94346f79e1eda77b9980cb0e4fbb99","timestamp":"2018-02-02T03:47:47.120Z"} {"level":"info","message":"handling storage retrieve request from 35e18bc7d761b8793b97fb9048279d2cfdb4ab26 hash 9a96cc000d94346f79e1eda77b9980cb0e4fbb99","timestamp":"2018-02-02T03:49:19.000Z"} {"level":"info","message":"handling storage retrieve request from 309883653dbead04cdf723ccde3897c539d34fce hash 9a96cc000d94346f79e1eda77b9980cb0e4fbb99","timestamp":"2018-02-02T03:50:34.657Z"}


_Logfile_ : [Flood_of_Handling_storage_retrieve_requests_Log_Th3Van.txt](https://github.com/Storj/bridge/files/1693225/Flood_of_Handling_storage_retrieve_requests_Log_Th3Van.txt)

Related ? -> https://github.com/Storj/bridge/issues/515

### Steps to Reproduce

Please include the steps the reproduce the issue, numbered below. Include as
much detail as possible.

1. Install `storjshare`
2. Set log to level 4 (debug)
4. Wait until you have received a few `ALLOC`s 
5. grep logfile for "handling storage retrieve request from"
littleskunk commented 6 years ago

Sorry that is my fault. My farmer is producing the loop. Let me fix that and hopefully that will kill the loop.

littleskunk commented 6 years ago

On a closer look the last core release could have the same effect.

@Th3Van do you have a new example for me? I fixed the loop on my side this hour.

littleskunk commented 6 years ago

A short update: It looks like the farmer are storing broken shards. The transfer was terminated early but the farmer didn't delete the shard. In combination with my last core pull request this could cause a create mirror loop.

I have a script to compare the shard hashes with the contract and found a significant high number of broken shards on my farmer. I will track them down and create a new issue as soon as I know the reason. In the meantime we might have to revoke the core pull request to brake the loop. The shard will still be lost forever but the bridge will stop creating new mirrors.

More details will follow.

RichardLitt commented 6 years ago

👋 Hey! Thanks for this contribution. Apologies for the delay in responding!

We've decided to rearchitect Storj, so that we can scale better. You can read more about this decision here. This means that we are entirely focused on v3 at the moment, in the storj/storj repository. Our white paper for v3 is coming very, very soon - follow along on the blog and in our Rocketchat.

As this repository is part of the v2 network, we're no longer maintaining this repository. I am going to close this for now. If you have any questions, I encourage you to jump on Rocketchat and ask them there. Thanks!