ipld / js-ipld-dag-pb

JavaScript Implementation of the MerkleDAG Nodes with Protobuf.
MIT License
69 stars 20 forks source link

DAGNodes resolved from a v1 CID have a v0 multihash property. #84

Closed olizilla closed 4 years ago

olizilla commented 6 years ago

When resolving a dag-pb node via ipld, we call deserialize() with the raw block data. That block may have been retrieved via a v1 CID.

To construct a DagNode instance we call DAGNode.create which builds the properties and then uses multihashing to calculate the v0 CID which is then all passed to the DagNode constructor.

https://github.com/ipld/js-ipld-dag-pb/blob/aa6274c14443003662423dcefb26573c2480f3aa/src/dag-node/create.js#L48-L53

IPLD Explorer tests for this mismatch and throws and error, as I figured it'd be a weird edge case we'd want to know about if it ever happened and it turned out, it does!

To create the issue, add a file wrapped in a directory to ipfs with cid-version 1 and try and fetch it via explore.ipld.io

$ ipfs add ~/Pictures/olizilla-rawr.png  --cid-version 1 --wrap-with-directory
added zb2rhhAD3rf4rLfajbuCH6FswmeJSD31W2qr81BCEQ1dSbUQX olizilla-rawr.png
added zdj7Wh6HAviMSd7A7EodS3xRsb3SfghZnZR3Nmtt4PoMBZqwk

Then visit https://explore.ipld.io/#/explore/zdj7Wh6HAviMSd7A7EodS3xRsb3SfghZnZR3Nmtt4PoMBZqwk

And see (if the cid resolves) screenshot 2018-08-22 17 09 54

...because the cid used to resolve the block zdj7Wh6HAviMSd7A7EodS3xRsb3SfghZnZR3Nmtt4PoMBZqwk does not match the DAGNode's multihash property. Qma12oVeVy9WHPvsth6LZVkUpTVFrF9HB4NNDG18Xjzwmp.

I can dodge this in IPLD explorer as the code has both the requested CID and the DagNode instance, so I can work around it by using the requested CID in place of the value in the DagNode, but this feels surprising and something we should fix.

In DagNode and DagLink, the 'mutlihash' property name is misleading, as it should support CIDs and the value should be available via a cid property getter. The property should return a CID object instance, as discussed in #81

It is not clear how to ensure that the correct recipe for CID construction could be passed through to DagNode.create from ipld resolve, as the API deals with blocks data rather than CIDs.

olizilla commented 6 years ago

This came up in https://github.com/ipfs-shipyard/ipld-explorer/issues/25

olizilla commented 6 years ago

Gotta be careful to only use the requested cid rather than the internal representation or your dags will get all disconnected

screenshot 2018-08-23 09 30 27

olizilla commented 6 years ago

Exposing a cid property that returned a cid object as suggested here https://github.com/ipld/js-ipld-dag-pb/pull/81#issuecomment-409887421 would be a step in the right direct. The cid instance would still internally set it's version to 0 if it's initialised with a multihash, but at least we'd be returning an object wrapper that the user can get the version 1 cid from if needed.

achingbrain commented 6 years ago

Related: https://github.com/ipld/js-ipld/issues/157

achingbrain commented 4 years ago

DAGNodes haven't exposed a cid or multihash property since v0.15.0, this can probably be closed?

olizilla commented 4 years ago

Is it no longer possible to ask a DAGNode what it's CID is?

achingbrain commented 4 years ago

Not since November 2018..

olizilla commented 4 years ago

let's close it then! For posteritry and my curioisty, if I do end up with a DAGNode of unknown origin, what is the recommended way to determine it's CID?

achingbrain commented 4 years ago

How do you end up with a DAGNode of unknown origin? Haven't you just used a CID to fetch it?

olizilla commented 4 years ago

in the guts of the IPLD explorer as per this issue

achingbrain commented 4 years ago

I'm not familiar with the guts of the IPLD explorer - do you discard the CID you used to fetch a DAGNode?

olizilla commented 4 years ago

The specific case was not my point. It is concivable that a code base passes around a DAGNode, and loses the original request context. Can you drop a code fragment here that shows a way to get a CID from a DAGNode, for any folks that end up here from a search? If there is no good way to do so, or its just not recommended then that's fine too.

achingbrain commented 4 years ago

You cannot get a CID from a DAGNode, it was removed in v0.15.0 to bring dag-pb more into line with the rest of ipld in that data doesn't tend to know how it should be addressed, and because multiple CIDs can resolve to the same DAGNode so it's hard to have a canonical CID.

If you want to know a CID for the node, assuming you know what type of node it is you can pass it to ipld.put or the util.cid function exported from the module that defines the type:

const {
  util
} = require('ipld-dag-pb')

async function calculateCid (node) {
  return util.cid(util.serialize(node))
}

This does not guarantee that you'll end up with the same CID that you used to request it though, due to potential differences in CID versions and hash algorithms so doing that won't solve the issue in ipfs-shipyard/ipld-explorer#25

The way to solve that is to not lose the CID used to fetch the node in the first place.