Open 5636cloud opened 2 years ago
Thank you for submitting your first issue to this repository! A maintainer will be here shortly to triage and review. In the meantime, please double-check that you have provided all the necessary information to make this process easy! Any information that can help save additional round trips is useful! We currently aim to give initial feedback within two business days. If this does not happen, feel free to leave a comment. Please keep an eye on how this issue will be labeled, as labels give an overview of priorities, assignments and additional actions requested by the maintainers:
Finally, remember to use https://discuss.ipfs.io if you just need general support.
I've run many rounds of an IPFS cluster test with one bootstrap node (node1) and two other nodes (node2 and node3).
I've found that SOMETIMES (not always), when I put data on the bootstrap node (node1) using `ipfs dag put`, node2 (or sometimes node3) can't get the CID's data using `ipfs dag get <cid>`.
From the debug log I can see that node2 CAN find which peer has the CID's data block through the DHT, but bitswap can't fetch the block and waits indefinitely.
After some debugging, I found that the relevant logic may be in package peermanager:
https://github.com/ipfs/go-bitswap/blob/master/client/internal/peermanager/peermanager.go
The code in question is from line 144 to line 153. SOMETIMES node2's pm.peerQueues[p] is empty (p is node1, the bootstrap node), and when that happens node2 can NEVER get the CID's data block from node1.
I don't understand why pm.peerQueues[p] is sometimes empty.
Then I tried adding some code in an `else` branch. The logic is: when pm.peerQueues[p] is empty, call "addPeer" to add the peer to the peer queues explicitly. Once the peer is added, node2 can get the CID's data block from node1 successfully.
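To make the behaviour concrete, here is a minimal, self-contained sketch of the pattern described above. It is NOT the code from peermanager.go: the types, the `addIfMissing` flag, and the body of `addPeer` are hypothetical stand-ins, and the only assumption carried over from the report is that wants are dropped when the peer has no entry in the queue map.

```go
package main

import (
	"fmt"
	"sync"
)

// PeerID and peerQueue are hypothetical stand-ins, not go-bitswap types.
type PeerID string

// peerQueue records the wants that would be sent to one peer.
type peerQueue struct {
	wants []string
}

// PeerManager is a simplified model: wants only reach peers that
// already have an entry in peerQueues.
type PeerManager struct {
	mu           sync.Mutex
	peerQueues   map[PeerID]*peerQueue
	addIfMissing bool // hypothetical switch: true enables the workaround
}

func NewPeerManager(addIfMissing bool) *PeerManager {
	return &PeerManager{
		peerQueues:   make(map[PeerID]*peerQueue),
		addIfMissing: addIfMissing,
	}
}

// addPeer returns the queue for p, creating it if needed.
// The caller must hold pm.mu.
func (pm *PeerManager) addPeer(p PeerID) *peerQueue {
	if pq, ok := pm.peerQueues[p]; ok {
		return pq
	}
	pq := &peerQueue{}
	pm.peerQueues[p] = pq
	return pq
}

// SendWants reports whether the wants were queued for p.
func (pm *PeerManager) SendWants(p PeerID, cids []string) bool {
	pm.mu.Lock()
	defer pm.mu.Unlock()

	if pq, ok := pm.peerQueues[p]; ok {
		pq.wants = append(pq.wants, cids...)
		return true
	} else if pm.addIfMissing {
		// Workaround: register the peer explicitly, then queue the wants.
		pq = pm.addPeer(p)
		pq.wants = append(pq.wants, cids...)
		return true
	}
	// No queue for p: the wants are silently dropped.
	return false
}

func main() {
	strict := NewPeerManager(false)
	fmt.Println("without workaround, queued:", strict.SendWants("node1", []string{"cid-a"}))

	patched := NewPeerManager(true)
	fmt.Println("with workaround, queued:", patched.SendWants("node1", []string{"cid-a"}))
}
```

In this model the first call drops the want and the second queues it. The sketch also shows why the workaround may only hide the underlying problem: it says nothing about why the entry for node1 was missing in the first place.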
I'm wondering whether the code I added is necessary, and why the original code does not handle the case when pm.peerQueues[p] is empty.
Thanks!