ipfs / kubo

An IPFS implementation in Go
https://docs.ipfs.tech/how-to/command-line-quick-start/

Long hierarchical IPFS paths fail to resolve (HAMT, object links) #4431

Closed pengowray closed 5 years ago

pengowray commented 6 years ago
  1. Say you have an image of a blue whale from Wikipedia:

> ipfs cat /ipfs/QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco/I/m/Anim1754_-_Flickr_-_NOAA_Photo_Library.jpg > whale.jpg #OK

  2. But you want to resolve the address down to a plain hash (without the directory structure):

> ipfs resolve -r /ipfs/QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco/I/m/Anim1754_-_Flickr_-_NOAA_Photo_Library.jpg #FAILS

  3. Then you get the following error:

Error: no link named "I" under QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco


ipfs.io web interface

Same issue, different interface

  1. Working image:

https://ipfs.io/ipfs/QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco/I/m/Anim1754_-_Flickr_-_NOAA_Photo_Library.jpg

  2. Attempt to resolve:

https://ipfs.io/api/v0/resolve?arg=/ipfs/QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco/I/m/Anim1754_-_Flickr_-_NOAA_Photo_Library.jpg&r=true

  3. Error message:
    {"Message":"no link named \"I\" under QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco","Code":0}
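
For anyone scripting this reproduction, here is a minimal Go sketch that builds the same resolve URL and decodes the error body shown above. The helper names (`buildResolveURL`, `parseAPIError`, `apiError`) are made up for illustration and are not part of go-ipfs; only the `/api/v0/resolve` path, the `arg` and `r` parameters, and the `Message`/`Code` fields come from this report. It makes no network request.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// apiError mirrors the two fields visible in the error body quoted in this
// report. The struct itself is a hypothetical helper, not a go-ipfs type.
type apiError struct {
	Message string `json:"Message"`
	Code    int    `json:"Code"`
}

// buildResolveURL assembles the resolve request used in the report.
// Slashes in the path are percent-encoded, which is equivalent to the
// unescaped form pasted above.
func buildResolveURL(base, ipfsPath string, recursive bool) string {
	q := url.Values{}
	q.Set("arg", ipfsPath)
	if recursive {
		q.Set("r", "true")
	}
	return base + "/api/v0/resolve?" + q.Encode()
}

// parseAPIError decodes an error body of the shape shown in the report.
func parseAPIError(body []byte) (apiError, error) {
	var e apiError
	err := json.Unmarshal(body, &e)
	return e, err
}

func main() {
	p := "/ipfs/QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco/I/m/Anim1754_-_Flickr_-_NOAA_Photo_Library.jpg"
	fmt.Println(buildResolveURL("https://ipfs.io", p, true))

	body := []byte(`{"Message":"no link named \"I\" under QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco","Code":0}`)
	e, err := parseAPIError(body)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println("API error:", e.Message)
}
```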

Version information:

go-ipfs version: 0.4.12-be03fc4
Repo version: 6
System version: amd64/linux
Golang version: go1.9.2

Type: Bug

Severity: High?

kevina commented 6 years ago

The problem is that ipfs resolve uses core.Node.Resolver, which as far as I can tell uses ResolveSingle. ipfs cat constructs its own path.Resolver, which uses unixfs/io.ResolveUnixfsOnce. I also notice that ipfs get uses the same resolver that ipfs resolve uses and hence also fails.

kevina commented 6 years ago

The real problem is that the IPFS Wikipedia archive uses HAMT shard directories. To traverse HAMT shard directories correctly, a special-purpose resolver is needed. All ResolveSingle does is call ResolveLink from the Node interface. This will not work with HAMT shard directories, because the links are not all in a single block.
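
To illustrate why a per-block link lookup misses entries in a sharded directory, here is a toy Go model. It is only a sketch: `Node`, `flatLookup`, `shardLookup`, and `bucket` are hypothetical names, the hash is FNV rather than whatever the real HAMT uses, and the layout (bucket prefix plus entry name, with bare bucket names pointing at sub-shards) is a simplification.

```go
package main

import (
	"errors"
	"fmt"
	"hash/fnv"
)

// Node is a hypothetical stand-in for one directory block.
type Node struct {
	Sharded bool             // true if this block is one shard of a larger directory
	Links   map[string]*Node // link names exactly as stored in this block
}

var errNoLink = errors.New("no link")

// flatLookup only inspects link names stored in the current block,
// which is roughly what a plain ResolveLink-style lookup does.
func flatLookup(n *Node, name string) (*Node, error) {
	if child, ok := n.Links[name]; ok {
		return child, nil
	}
	return nil, fmt.Errorf("%w named %q", errNoLink, name)
}

// bucket picks a two-hex-digit slot for name at the given shard depth.
// Toy hash only; assumed for illustration.
func bucket(name string, depth int) string {
	h := fnv.New32a()
	h.Write([]byte(name))
	return fmt.Sprintf("%02X", (h.Sum32()>>(8*uint(depth)))&0xFF)
}

// shardLookup follows bucket links across blocks until it finds the entry.
func shardLookup(n *Node, name string, depth int) (*Node, error) {
	if !n.Sharded {
		return flatLookup(n, name)
	}
	b := bucket(name, depth)
	if child, ok := n.Links[b+name]; ok { // entry stored directly in this shard
		return child, nil
	}
	if sub, ok := n.Links[b]; ok { // entry lives in a deeper shard block
		return shardLookup(sub, name, depth+1)
	}
	return nil, fmt.Errorf("%w named %q", errNoLink, name)
}

// buildExample places the entry "I" one shard level below the root.
func buildExample() (root, target *Node) {
	target = &Node{}
	sub := &Node{Sharded: true, Links: map[string]*Node{bucket("I", 1) + "I": target}}
	root = &Node{Sharded: true, Links: map[string]*Node{bucket("I", 0): sub}}
	return root, target
}

func main() {
	root, _ := buildExample()
	_, err := flatLookup(root, "I")
	fmt.Println("flat lookup error:", err)
	_, err = shardLookup(root, "I", 0)
	fmt.Println("shard-aware lookup error:", err)
}
```

In this model the root block has no link literally named "I", so the flat lookup fails in the same way as the error in the report, while the shard-aware walk finds the entry.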

I am not sure of the best way to fix it. Should ResolveLink resolve the abstract link and not the concrete link inside the current block? Or should we define another method to do so?

Another fix is to make the default resolver for the node use ResolveUnixfsOnce which I am attempting to do in #4444. I am waiting to see if those tests pass.

@Stebalien @whyrusleeping thoughts?

whyrusleeping commented 6 years ago

Hrm... this is tricky indeed. It's almost like we should select the resolver based on the context of the root hash.

pengowray commented 6 years ago

As a UI suggestion, perhaps you should add something like a --trace option to resolve -r (name inspired by Unix's traceroute). Something that would enumerate each of the steps/nodes along the way to resolving an IPFS address, and which resolver was required for each step. It could help newbies with debugging and with understanding the protocol.

Without -r the trace would be the same but just give the first step.

And it sounds like you also need to add something like a --resolver:auto or --resolver:hamt type of setting too to let the user force the type of resolving they'd like done, unless it's always meant to be invisible to the user.

Stebalien commented 6 years ago

As I noted in #4444, we should be using /ipld and /ipfs. That is, we should have a dispatching resolver that takes /ns/path, picks a sub-resolver based on ns, and passes that sub-resolver the path.
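
A rough Go sketch of what such a dispatching resolver could look like. The `Dispatcher` and `SubResolver` types and the stub sub-resolvers are hypothetical, not the go-ipfs API; real sub-resolvers would walk the DAG instead of returning a tagged string.

```go
package main

import (
	"fmt"
	"strings"
)

// SubResolver resolves the part of the path after the namespace.
// Hypothetical type for illustration only.
type SubResolver func(rest string) (string, error)

// Dispatcher maps a namespace such as "ipfs" or "ipld" to a sub-resolver.
type Dispatcher struct {
	byNS map[string]SubResolver
}

// Resolve splits /ns/path, picks the sub-resolver registered for ns,
// and hands it the remainder of the path.
func (d *Dispatcher) Resolve(p string) (string, error) {
	if !strings.HasPrefix(p, "/") {
		return "", fmt.Errorf("path %q must look like /ns/path", p)
	}
	parts := strings.SplitN(strings.TrimPrefix(p, "/"), "/", 2)
	if parts[0] == "" {
		return "", fmt.Errorf("path %q has an empty namespace", p)
	}
	r, ok := d.byNS[parts[0]]
	if !ok {
		return "", fmt.Errorf("no resolver registered for namespace %q", parts[0])
	}
	rest := ""
	if len(parts) == 2 {
		rest = parts[1]
	}
	return r(rest)
}

// newExampleDispatcher registers two stub sub-resolvers that only tag
// the path with the resolver that would have handled it.
func newExampleDispatcher() *Dispatcher {
	return &Dispatcher{byNS: map[string]SubResolver{
		"ipfs": func(rest string) (string, error) { return "unixfs:" + rest, nil },
		"ipld": func(rest string) (string, error) { return "ipld:" + rest, nil },
	}}
}

func main() {
	d := newExampleDispatcher()
	for _, p := range []string{"/ipfs/QmRoot/I/m", "/ipld/QmRoot/Links/0", "/ipns/example"} {
		out, err := d.Resolve(p)
		fmt.Println(p, "->", out, err)
	}
}
```

With this shape, the /ipfs sub-resolver could be the unixfs-aware one (and so handle HAMT shards), while /ipld would keep the plain per-block link semantics.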

Stebalien commented 5 years ago

Fixed by the CoreAPI refactor.