fission-codes / wnfs-migration

Apache License 2.0

Migration never finishes, stalls on ipfs.cat call #7

Open jeffgca opened 2 years ago

jeffgca commented 2 years ago

I ran the migration script from a git checkout and the process was initially very slow but seems to have completely stalled at the last step:

➜  wnfs-migration git:(issue-2) npm start

> wnfs-migration@2.0.1 start
> node --no-warnings --loader=ts-node/esm src/index.ts

Looking up data root for jeffg
Data root is bafybeigcrrjp5vnr77tuh3eo45xhvtv2rsxhwvylxrtxzozp6ncmg7pmra
Loading IPFS...
Connected to local ipfs node, version 0.13.0
Connected to the Fission IPFS Cluster
Looking up your filesystem version (https://jeffg.files.fission.name/version)
Your filesystem currently is at version 1.0.0
Processing public/Apps
....
Processing private/Documents
Finished migration: bafybeia54qqjhbzgp32wzo6phj4olt3fgybn3npzhbnli6cdvil6ulh4zy
? Are you sure you want to overwrite your filesystem with a migrated version? Yes

After agreeing to overwrite, the terminal sits for a very long time (1+ hour and still going) with no feedback.

icidasset commented 2 years ago

I haven't found a solution for this yet, but I did add an extra message about what it's doing at that point: creating a UCAN based on your root proof, which is retrieved via IPFS (IPFS is the slow part). The message is present in 2.0.6.

jeffgca commented 2 years ago

I added some extra debugging to my local clone of the script. In the second part of the process, the script stalls in the function here:

https://github.com/fission-suite/wnfs-migration/blame/47034941687eca74906dd6315fe050ee019abe96/src/cli/index.ts#L107

I split out the inline calls in this statement: the call to `ipfs.cat` seems to return something of type `Object [AsyncGenerator] {}`, and the subsequent call to `itAll()` with that result never returns.

jeffgca commented 2 years ago

@icidasset / @matheus23 Is there a way to make the ipfs connection provide verbose logging?

matheus23 commented 2 years ago

That's really hard, unfortunately. If it takes more than an instant, that means that, for whatever reason, what you're requesting isn't on your machine. But it should be! So we should tell js-ipfs to time out, or even force it to only look locally, because that's where it's supposed to find it.
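A minimal TypeScript sketch of the timeout idea: it drains an async iterable (such as the `AsyncGenerator` that `ipfs.cat` returns) but gives up after an overall deadline instead of waiting forever. `collectWithTimeout` is a hypothetical helper written for illustration; it is not part of js-ipfs or wnfs-migration.

```typescript
// Hypothetical helper: collect every item from an async iterable,
// rejecting if the whole operation takes longer than `ms` milliseconds.
async function collectWithTimeout<T>(source: AsyncIterable<T>, ms: number): Promise<T[]> {
  const iterator = source[Symbol.asyncIterator]();
  const items: T[] = [];
  let timer: ReturnType<typeof setTimeout> | undefined;

  // A promise that only ever rejects, once the deadline passes.
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });

  try {
    while (true) {
      // Whichever settles first wins: the next chunk or the deadline.
      const result = await Promise.race([iterator.next(), deadline]);
      if (result.done) return items;
      items.push(result.value);
    }
  } finally {
    if (timer !== undefined) clearTimeout(timer);
    // Ask the source to stop. Not awaited: a stalled generator may never settle.
    iterator.return?.().catch(() => {});
  }
}
```

With a helper like this, a stalled lookup surfaces as an error that the script can report, rather than a silent hang.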

I'm not sure why it can't find it.

So what exactly is happening is this: You locally have a UCAN that authenticates you against your account's root DID. A CID to that UCAN can be found in your local fission config file at ~/.config/fission/config.yml in the root_proof key. If you run ~/.config/fission/bin/fission-ipfs -c ~/.config/fission/ipfs/ cat /ipfs/<your_root_proof>/bearer.jwt you'll see that UCAN. However, exactly that IPFS command isn't resolving in your case.
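The lookup described above can be sketched in TypeScript as follows. This assumes `root_proof` is a top-level, single-line scalar in `config.yml`; both helper names are hypothetical, and a real implementation would more likely use a proper YAML parser.

```typescript
// Hypothetical helper: pull the root_proof CID out of the config file's text.
// Assumption: the key is top-level and its value sits on the same line.
function extractRootProof(configYaml: string): string | undefined {
  const match = configYaml.match(
    /^root_proof:[ \t]*["']?([^"'\s#]+)["']?[ \t]*(?:#.*)?$/m
  );
  return match?.[1];
}

// Hypothetical helper: the IPFS path that holds the UCAN, as described above.
function bearerPath(rootProof: string): string {
  return `/ipfs/${rootProof}/bearer.jwt`;
}
```

The resulting path is what gets passed to `fission-ipfs ... cat` in the command above.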

I've talked to Brooke about this a while ago, we agreed that it would be better to store that UCAN somewhere else: https://talk.fission.codes/t/webnative-on-nodejs/2240/5

What I'm saying is that we've been planning to do something that would've mitigated this issue for you! I've created a GitHub issue that captures this: https://github.com/fission-suite/fission/issues/578

ngeojiajun commented 2 years ago

I performed some local testing based on the information in this comment and found that the CID never resolves; a DHT search revealed that no peer is providing it. Before running this test, I manually established connections to all the addresses in the config.yaml that I believe belong to the Fission IPFS Cluster.

C:\Users\xxx>ipfs dht findprovs bafy...li

C:\Users\xxx>ipfs dht findprovs bafy...li

C:\Users\xxx>ipfs dht findprovs bafy...li

C:\Users\xxx>ipfs dht findprovs bafy...li

C:\Users\xxx>ipfs dht findprovs bafy...li

C:\Users\xxx>ipfs dht findprovs bafy...li

C:\Users\xxx>

Besides the local daemon, I also tried to access the CID directly through ipfs.io; however, that also ends with a 504 Gateway Timed Out.

matheus23 commented 2 years ago

Yeah thanks for confirming that @ngeojiajun. To be clear: It shouldn't be pinned on any fission IPFS infra and it's unlikely to be served by any IPFS instance in the network unless you're running the fission CLI's IPFS on your machine. It should "just" be in your local IPFS blockstore. Keep in mind that the fission CLI executable downloads its own ipfs executable and configures that with its own blockstore. So if you've installed ipfs on your system additionally, that'll use a different blockstore.

matheus23 commented 2 years ago

I'm thinking the root cause of this issue was switching from re-using the fission CLI's go-ipfs executable in wnfs-migration to running its own js-ipfs instance (Introduced here: https://github.com/fission-suite/wnfs-migration/pull/3/commits/40d54c8baa5fef809b2b61c290ca8e6895deea20).

This causes this issue, because then js-ipfs doesn't use the same blockstore that go-ipfs uses. I think it's not possible for js-ipfs to reuse that blockstore, because they're in different formats.

We can fix this by specifically calling go-ipfs for resolving the root_proof and nothing else. I'll do that.
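A rough sketch of what shelling out to the go-ipfs executable could look like from Node.js, with a timeout so a missing block cannot hang the script. This is not the code from the actual fix: `runWithTimeout` and `readUcanViaGoIpfs` are hypothetical names, and the binary path, repo path, and IPFS path are parameters the caller would take from the fission CLI's configuration.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Hypothetical helper: run a binary and return its trimmed stdout.
// Node kills the child and rejects the promise if it exceeds timeoutMs.
async function runWithTimeout(binary: string, args: string[], timeoutMs: number): Promise<string> {
  const { stdout } = await execFileAsync(binary, args, {
    timeout: timeoutMs,
    encoding: "utf8",
  });
  return stdout.trim();
}

// Hypothetical wrapper: `<ipfsBinary> -c <ipfsRepo> cat <ipfsPath>`,
// mirroring the manual command quoted earlier in this thread.
function readUcanViaGoIpfs(
  ipfsBinary: string,
  ipfsRepo: string,
  ipfsPath: string,
  timeoutMs = 10_000
): Promise<string> {
  return runWithTimeout(ipfsBinary, ["-c", ipfsRepo, "cat", ipfsPath], timeoutMs);
}
```

Because the executable reads from the fission CLI's own blockstore, this avoids the blockstore mismatch described above for this one lookup.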

ngeojiajun commented 2 years ago

In my situation, fission-ipfs does not resolve it either. Is it pinned by default during the fission setup?

matheus23 commented 2 years ago

Yeah exactly, that's what should've happened @ngeojiajun :thinking:

matheus23 commented 2 years ago

@ngeojiajun can you please try out wnfs-migration at this branch: #10 :pray:

ngeojiajun commented 2 years ago

@matheus23 In progress.... and the fission setup does do its job by pinning it

ngeojiajun commented 2 years ago

@matheus23 Update: the UCAN is created and the actual request seems to have been sent. It's just my network that is bottlenecking the actual upload.

.....
Processing private/マンガ/......pdf
Finished migration: bafybeicfmoflskgdnoohikppuknovsb5jc2dz33b2jxspu7a7q5x2mk6q4
? Are you sure you want to overwrite your filesystem with a migrated version? Yes
Creating authorization UCAN.
Created authorization UCAN. Updating data root...

matheus23 commented 2 years ago

@ngeojiajun Okay, this process can time out. However, it makes progress even if it's interrupted, so I've uploaded a new version which re-tries to update the data root in case it timed out. Please try it with the updated branch :pray:
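The retry behaviour can be sketched roughly as below. This is an illustration only, not the code on the branch: `retry`, its parameters, and the default of retrying on every error are assumptions. A caller could pass a `shouldRetry` predicate that matches only timeout errors.

```typescript
// Hypothetical helper: re-run an async operation until it succeeds,
// up to maxAttempts times, optionally pausing between attempts.
async function retry<T>(
  operation: () => Promise<T>,
  maxAttempts: number,
  delayMs = 0,
  shouldRetry: (error: unknown) => boolean = () => true
): Promise<T> {
  if (maxAttempts < 1) throw new RangeError("maxAttempts must be at least 1");
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Give up immediately on errors the caller does not consider retryable.
      if (!shouldRetry(error)) break;
      if (attempt < maxAttempts && delayMs > 0) {
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

Since the update makes progress even when interrupted, each further attempt should have less work left to do.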

ngeojiajun commented 2 years ago

OK, I will try it if it times out later on.

ngeojiajun commented 2 years ago

@matheus23 Can I know how long it should take before it times out? The updating process took over an hour without finishing, and iftop shows that there is no network activity. However, querying the URL https://<username>.files.fission.name/ reports that it is updated. And when I try to open Fission Drive, it complains about a CBOR error and this:

index.js:148 Error: Could not parse a valid private tree using the given key
    at Function.fromInfo (index.umd.min.js:sourcemap:70:76015)
    at Function.fromBareNameFilter (index.umd.min.js:sourcemap:70:75764)
    at async index.umd.min.js:sourcemap:70:84035
    at async Function.fromCID (index.umd.min.js:sourcemap:70:80862)
    at async Function.fromCID (index.umd.min.js:sourcemap:70:93750)
    at async Object.bc (index.umd.min.js:sourcemap:70:103254)
    at async index.js:128:9

walkah commented 2 years ago

@therealjeffg can you make sure this issue is resolved?

jeffgca commented 2 years ago

@matheus23 @walkah AFAICT this is not resolved or at least the migration does not complete. See #14