@Stebalien Any thoughts? I am trying to debug, but without much luck so far.
When that happens, could you check ipfs swarm peers --streams
to see if the machines are still connected and if they have open bitswap streams?
This could also be https://github.com/ipfs/go-ipfs/issues/5183.
This is what the output of "ipfs swarm peers --streams" looks like when it is stuck:

mkunal@node-1:~$ ipfs swarm peers --streams
/ip4/128.110.153.130/tcp/4001/ipfs/QmYgbFxWGykyTMNrotndQmoV67RpixAsus6zkRzk5bB66Q
  /ipfs/bitswap/1.1.0
  /ipfs/kad/1.0.0
  /ipfs/kad/1.0.0
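For reference, a quick way to spot-check the same thing from a script (the peer ID is the one from the output above; the grep pattern is only a rough filter, since the exact layout of the --streams output may vary):

```sh
# Check whether the other peer is still connected and whether any stream
# on that connection speaks the bitswap protocol. PEER_B is the serving
# peer's ID taken from the output above.
PEER_B=QmYgbFxWGykyTMNrotndQmoV67RpixAsus6zkRzk5bB66Q

ipfs swarm peers --streams | grep -A 5 "$PEER_B" | grep -q bitswap \
  && echo "bitswap stream open to $PEER_B" \
  || echo "no bitswap stream to $PEER_B"
```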
Damn. Assuming peer A is downloading and peer B is serving, can you run ipfs bitswap wantlist -p $PEER_A on peer B and ipfs bitswap wantlist on peer A? This is looking more like #5183.
On peer B (serving):

$ ipfs bitswap wantlist -p QmRHCnAmHGjZACNND1v5AJw6ndCPyrN5nT7k5umSsZ69D4
QmXFRdKuZiCcnRiTNyEcKtwL127WDjYrnu4C5pScykR5hg
QmXC8zjajA5q3RRJa9oXJxkVBN5ibihjnJvPZRayMERCjC
QmSw6qxguQ3ZSk8ufA416pVvVHwSotKppQg54VUHEoGrrv

On peer A (downloading):

$ ipfs bitswap wantlist
QmXFRdKuZiCcnRiTNyEcKtwL127WDjYrnu4C5pScykR5hg
QmXC8zjajA5q3RRJa9oXJxkVBN5ibihjnJvPZRayMERCjC
QmSw6qxguQ3ZSk8ufA416pVvVHwSotKppQg54VUHEoGrrv
On peer B (serving), I verified the blocks exist by printing the raw data using "ipfs object get QmSw6qxguQ3ZSk8ufA416pVvVHwSotKppQg54VUHEoGrrv"
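A small sketch of the same check, automated: on the serving peer, walk the downloader's wantlist and verify each block is present locally (PEER_A below is the downloading peer's ID from the wantlist command above):

```sh
# Run on peer B (serving). For every block peer A wants from this node,
# check whether it exists in the local blockstore.
PEER_A=QmRHCnAmHGjZACNND1v5AJw6ndCPyrN5nT7k5umSsZ69D4

for cid in $(ipfs bitswap wantlist -p "$PEER_A"); do
  if ipfs block stat "$cid" > /dev/null 2>&1; then
    echo "have    $cid"
  else
    echo "missing $cid"
  fi
done
```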
Interesting...
Does peer B have those blocks? Does it hang when running ipfs block stat QmXFRdKuZiCcnRiTNyEcKtwL127WDjYrnu4C5pScykR5hg?
No hanging on peer B.
$ ipfs block stat QmXFRdKuZiCcnRiTNyEcKtwL127WDjYrnu4C5pScykR5hg
Key: QmXFRdKuZiCcnRiTNyEcKtwL127WDjYrnu4C5pScykR5hg
Size: 262158

$ ipfs block stat QmSw6qxguQ3ZSk8ufA416pVvVHwSotKppQg54VUHEoGrrv
Key: QmSw6qxguQ3ZSk8ufA416pVvVHwSotKppQg54VUHEoGrrv
Size: 262158

$ ipfs block stat QmXC8zjajA5q3RRJa9oXJxkVBN5ibihjnJvPZRayMERCjC
Key: QmXC8zjajA5q3RRJa9oXJxkVBN5ibihjnJvPZRayMERCjC
Size: 262158
On peer A (downloading), it hangs for the above 3 commands.
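The same probe can be run on peer A with a timeout so a hung command doesn't block the shell indefinitely (the 15-second budget is arbitrary):

```sh
# Run on peer A (downloading). A hung request shows up as a message
# instead of blocking forever; the CIDs are the three from above.
for cid in \
  QmXFRdKuZiCcnRiTNyEcKtwL127WDjYrnu4C5pScykR5hg \
  QmXC8zjajA5q3RRJa9oXJxkVBN5ibihjnJvPZRayMERCjC \
  QmSw6qxguQ3ZSk8ufA416pVvVHwSotKppQg54VUHEoGrrv
do
  timeout 15 ipfs block stat "$cid" || echo "hung or failed: $cid"
done
```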
Awesome! Well, not for you, but this definitely looks like #5183. Could you follow the instructions here and upload the result to GitHub (or IPFS)? That'll help me figure out where IPFS is stuck.
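The linked debug-collection instructions aren't quoted in this thread; as a rough sketch of the kind of dump they ask for, assuming the daemon's API listens on the default 127.0.0.1:5001:

```sh
# Collect goroutine stacks (shows where the daemon is blocked) plus basic
# node info, and bundle them for upload. The exact artifacts requested by
# the linked instructions may differ.
mkdir -p issue_dump
curl -s 'http://127.0.0.1:5001/debug/pprof/goroutine?debug=2' > issue_dump/goroutines.txt
ipfs version > issue_dump/version.txt
ipfs diag sys > issue_dump/diag_sys.txt
tar czf issue_dump.tar.gz issue_dump
```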
Here are the results:
EDIT: added the binary issue_dump.tar.gz
Can I get the exact binary you used as well?
My apologies. Added the binary in the tar.
Which peer is this and would you mind doing the other peer as well?
Awesome! Thanks!
One more question: is this the first time you've started these nodes? (I'm trying to reproduce.)
I am renting these machines from Cloudlab (https://www.cloudlab.us/).
They have been running since I obtained them (about 8 days ago); I haven't rebooted them.
Does that answer the question?
Please let me know how I can help to identify or fix the issue.
Not sure if this helps, but I would like to point out that using "ipfs get" to fetch file.txt works as it should. It downloads the file in its entirety without any issues.
Oh. Hm. That's really odd. Can you run ipfs refs -r $file on peer A? It looks like it actually has the file in question, it just doesn't realize it...

Just to make sure: file.txt is the file you're trying to download through FUSE, right?
Yes, file.txt is the file trying to be downloaded.
In the output, it prints one line and then hangs.

$ ipfs refs -r /ipns/QmYgbFxWGykyTMNrotndQmoV67RpixAsus6zkRzk5bB66Q/file.txt
QmaQ7e4fEWRizAUgGe5Dbbf5oAb8RfNtJCUpYxpYmQNbUn
Ah, I think I see what you mean.
In the earlier experiment, which worked:
On peer B: "ipfs add file.txt" -> output_hash
On peer A: "ipfs get output_hash"

In the /ipns mount scenario, which is hanging:
On peer B: "cp file.txt /ipns/local/file.txt"
On peer A: "head -c 1000000 /ipns/peer_b/file.txt"
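For reference, a rough side-by-side of the two paths (assumes ipfs mount has already set up the FUSE mountpoints; PEER_B_ID is a placeholder for peer B's identity hash, and the -Q flag just makes ipfs add print only the final hash):

```sh
# Path 1: plain hash transfer -- works.
# On peer B:
HASH=$(ipfs add -Q file.txt)
# On peer A:
ipfs get "$HASH"

# Path 2: through the FUSE /ipns mount -- hangs partway through.
# On peer B (writes into its own writable IPNS directory):
cp file.txt /ipns/local/file.txt
# On peer A (PEER_B_ID is a placeholder for peer B's identity hash):
head -c 1000000 "/ipns/$PEER_B_ID/file.txt"
```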
Wait, but ipfs get /ipns/QmYgbFxWGykyTMNrotndQmoV67RpixAsus6zkRzk5bB66Q/file.txt succeeds? Is this on peer A?
I just tried that and it succeeds.
On peer A (the client trying to download):

$ ipfs get /ipns/QmYgbFxWGykyTMNrotndQmoV67RpixAsus6zkRzk5bB66Q/file.txt
Saving file(s) to file.txt
286.17 MiB / 286.17 MiB [===========================================================] 100.00% 12s
Does it actually download the complete file? (Trying to rule out a bug in ipfs get.)
And thank you for all your help on this!
Sure thing. Happy to help! Please feel free to let me know any way I can help.
I did "cat file.txt" and it printed the entire file. I also did a diff on the downloaded file and the original file and they are identical.
So, to summarize, on peer A:

- ipfs get /ipns/Qm... works.
- ipfs refs -r /ipns/Qm... hangs.
- cat /ipns/Qm... hangs.

Yes. Any ideas on what might be causing this issue? I have the entire weekend available to run more tests and possibly implement the solution.
No, but the get versus refs difference is probably going to narrow this down a lot.
If you re-run these commands (https://github.com/ipfs/go-ipfs/issues/5328#issuecomment-415520234), do you get the same results?
No, it varies. Sometimes more entries, sometimes less.
What's the output of ipfs diag cmds? (On both machines.)
I ran "cat /ipns/Qm.../file.txt". After it hangs, obtained the output of "ipfs diag cmds".
Peer A (downloader):

Command    Active  StartTime        RunTime
repo/gc    false   Aug 24 15:39:03  14.671006ms
refs       false   Aug 24 15:39:09  5.458619543s
repo/gc    false   Aug 24 15:57:06  135.123628ms
pin/ls     false   Aug 24 15:57:11  254.572µs
refs       false   Aug 24 15:57:23  5.416848016s
pin/ls     false   Aug 24 16:21:03  231.248µs
repo/gc    false   Aug 24 16:21:06  144.058141ms
diag/cmds  true    Aug 24 16:22:42  1.515304ms

Peer B (sender):

Command           Active  StartTime        RunTime
bitswap/wantlist  false   Aug 24 15:36:00  888.235µs
bitswap/wantlist  false   Aug 24 15:36:07  227.549µs
bitswap/wantlist  false   Aug 24 15:36:48  352.875µs
bitswap/wantlist  false   Aug 24 15:38:27  222.73µs
diag/cmds         true    Aug 24 16:23:21  1.188582ms
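One way to capture that snapshot in a single step, so diag cmds is sampled while the read is still stuck (timings are illustrative):

```sh
# Start the hanging read in the background, give it time to stall,
# then dump the daemon's in-flight commands.
cat /ipns/QmYgbFxWGykyTMNrotndQmoV67RpixAsus6zkRzk5bB66Q/file.txt > /dev/null &
CAT_PID=$!
sleep 30
ipfs diag cmds > diag_cmds_peer_a.txt
kill "$CAT_PID" 2>/dev/null
```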
And does ipfs get still work? Does ipfs refs still hang?

Internally, it looks like anything that might cause ipfs refs to hang should cause ipfs get to hang.
ipfs get /ipns/Qm.../file.txt works.
ipfs refs /ipns/Qm.../file.txt works.
cat /ipns/Qm.../file.txt hangs.

After it's hanging, I get the list of hashes using "ipfs bitswap wantlist". For any one of the hashes from the list (let's say Qmhash), "ipfs refs Qmhash" prints a couple of lines of hashes and then hangs.
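A sketch of probing each stuck hash that way, with a timeout so the loop keeps moving (the 15-second budget is arbitrary):

```sh
# Run on peer A after the cat has hung: for every CID still on the local
# wantlist, try ipfs refs under a timeout and note which ones stall.
for cid in $(ipfs bitswap wantlist); do
  echo "== $cid"
  timeout 15 ipfs refs "$cid" || echo "   refs hung or failed for $cid"
done
```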
One more data point:

cat /ipns/Qm.../file.txt hangs.

But if I first run ipfs refs /ipns/Qm.../file.txt (which works) and then run "cat /ipns/Qm.../file.txt", the cat succeeds.
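In other words, the workaround sequence looks roughly like this (the IPNS path stands in for the real one above):

```sh
IPNS_PATH=/ipns/QmYgbFxWGykyTMNrotndQmoV67RpixAsus6zkRzk5bB66Q/file.txt

# Priming with ipfs refs completes and fetches the referenced blocks...
ipfs refs "$IPNS_PATH" > /dev/null
# ...after which the FUSE read succeeds instead of hanging.
cat "$IPNS_PATH" > /dev/null
```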
Version information:
go-ipfs version: 0.4.18-dev-4bca53e
Repo version: 7
System version: amd64/linux
Golang version: go1.10
Type:
Bug
Description:
I created a text file (size 300 MB) with random ASCII characters using the following command:

base64 /dev/urandom | head -c 300000000 > file.txt

Then I added this text file to the /ipns mount on the machine.
On another machine, I tried reading the beginning of the file using head:

head -c 1000000 /ipns/QmezBPmm4RRBRTBYPhVHsSSAPS2Q8hRTDd3PrfTsG8yvnf/file.txt
This command prints some of the data and then gets stuck permanently. With debug logging enabled, the following line is repeated:

10:42:23.157 DEBUG bitswap: 9 keys in bitswap wantlist workers.go:185
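Putting the reported steps together, a repro sketch might look like the following (assumes a default ipfs mount FUSE setup on both machines; ipfs log level is only used here to surface the bitswap debug line quoted above):

```sh
# On machine 1 (publisher):
base64 /dev/urandom | head -c 300000000 > file.txt
cp file.txt /ipns/local/file.txt

# On machine 2 (reader) -- this is the step that hangs partway through:
ipfs log level bitswap debug   # enable the bitswap debug messages
head -c 1000000 /ipns/QmezBPmm4RRBRTBYPhVHsSSAPS2Q8hRTDd3PrfTsG8yvnf/file.txt
```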