ipld / js-codec-interface

IPLD Codec Interface

Status of this interface? #5

Closed oed closed 3 years ago

oed commented 4 years ago

Hey, I'm creating a new codec and I'm wondering what interface my codec should conform to. It looks like the current js-ipld does not use this interface (both dag-pb and dag-cbor conform to a different interface). dag-json seems to be using this, but it doesn't look like it's supported in js-ipld. Is this interface simply deprecated, or is it just a WIP that will be used in the future?

mikeal commented 4 years ago

We’re right in the middle of migrating interfaces, and there are actually 3+ interfaces right now, but I can guarantee you will have a good time if you do the following:

Write a module that exports:

{ name, // string name of the codec
  code, // integer for the multiformat entry of the codec
  encode, // function that takes a JS object and returns a Buffer
  decode // function that takes a Buffer and returns a JS object
}

I’m in the middle of a very large multiformats re-write which will consume this interface, among other things. This is the signature codecs should produce for the new interface and within the next week I’ll have code that you can use to turn this interface into the old interface that js-ipld expects. That way, you’ll be compatible with the new multiformats interface, the new Block interface, and the old js-ipld interface, and you’ll need to write way less code than the old js-ipld interface requires (we can just generate all the other functions and properties).
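
A minimal sketch of a module with the four exports listed above. The codec name and the integer code here are placeholders invented for illustration (not registered multiformat entries), and JSON is used only to keep the example short:

```javascript
// Hypothetical codec: `example-json` and the code value are placeholders.
const name = 'example-json'
const code = 0x300000

// Takes a JS object and returns a Buffer
const encode = obj => Buffer.from(JSON.stringify(obj))

// Takes a Buffer and returns a JS object
const decode = buf => JSON.parse(buf.toString('utf8'))

const codec = { name, code, encode, decode }
module.exports = codec
```

A round trip such as `codec.decode(codec.encode({ hello: 'world' }))` should give back an object equal to the input.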

oed commented 4 years ago

Ok, that's helpful. The main question I have is how best to expose additional methods on the JS object. For example, dag-pb exposes the toJSON method on the DAGNode class. I want to do something similar, but for verify, which is a method that verifies a signature encoded within the block, given a public key.

rvagg commented 4 years ago

One way taken by a couple of codecs is to expose functions but make those properties not enumerable (Object.defineProperty()), so they're there but not really. If your codec has very strictly defined properties that don't vary then you may not even have to do this because you can just define in your documentation what to expect and what not. For the codecs where properties are flexible and could be named almost anything (and hence involve clashes) then it's much trickier and involves tradeoffs. You could also consider forcing your API users to go back to some central factory to perform those methods:

const obj = codec.decode(block)

// then for some property that's not strictly part of the deserialization, say a calculation:

const calculatedWeight = obj.calculatedWeight // maybe this is not enumerable to highlight its ephemeral nature
// OR
const calculatedWeight = obj.calculateWeight()
// OR
const calculatedWeight = codec.calculateWeight(obj)
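
As a rough illustration of the non-enumerable approach, here is a sketch using Object.defineProperty(). The node shape (`items`, `weight`) and the `calculateWeight` method are hypothetical and only serve the example:

```javascript
// Hypothetical decoded node; the `items`/`weight` shape is illustrative only.
const node = { items: [{ weight: 2 }, { weight: 3 }] }

// Attach a helper that is callable but hidden from property enumeration
Object.defineProperty(node, 'calculateWeight', {
  value: function () {
    return this.items.reduce((sum, item) => sum + item.weight, 0)
  },
  enumerable: false,
  writable: false,
  configurable: false
})

console.log(Object.keys(node))      // [ 'items' ]
console.log(node.calculateWeight()) // 5
```

Code that enumerates the node's properties (for example when re-encoding it) only sees `items`, while callers who know about the method can still invoke it.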

Overall though, this is one key problem with our pattern of instantiating full JavaScript objects from our codecs: we're trapped within the JavaScript object paradigm and can't introduce novelty around it. My personal suspicion is that we'll eventually move away from this pattern to something more like jQuery, but for IPLD node navigation and manipulation rather than DOM navigation. @mikeal started experimenting with this at one point (on hold for now), and it's roughly how go-ipld-prime handles IPLD data. There are other parts of the stack that need to be tinkered with to make this work nicely (all the way down to our codecs, JSON, CBOR, and more). We're not there yet, and for now you have to do your best to fit the pattern.

mikeal commented 4 years ago

I would caution against exposing any functions at all, and I wouldn’t use the dag-pb codec as a blueprint as it’s quite old and will probably have to take a large breaking change in the not too distant future (we’ll need to move to something that follows the new IPLD Schema representation if we want to support selectors).

Codecs are meant to decode binary into “IPLD Data Model” and vice versa. Any additional behavior like functions or additional mutations need to live in separate libraries.

In other words, the encode and decode should only return and accept basic types (JSON types + CID and Buffer). The DAGNode class in dag-pb is legacy and will eventually be deprecated entirely.
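
A rough sketch of what "basic types only" could look like as a check on a decoded value. The helper name `isPlainData` is made up for this example, and CID instances are deliberately left out, since recognising them depends on which CID class the surrounding stack uses:

```javascript
// Hypothetical helper: returns true if `value` holds only JSON types and
// binary data. CIDs are NOT handled here (assumption: add that check separately).
const isPlainData = value => {
  if (value === null) return true
  const type = typeof value
  if (type === 'boolean' || type === 'number' || type === 'string') return true
  if (value instanceof Uint8Array) return true // Buffer is a Uint8Array subclass
  if (Array.isArray(value)) return value.every(isPlainData)
  if (type === 'object' && Object.getPrototypeOf(value) === Object.prototype) {
    return Object.values(value).every(isPlainData)
  }
  return false // functions, class instances, undefined, symbols, ...
}

console.log(isPlainData({ a: [1, 'x', null, Buffer.from('hi')] })) // true
console.log(isPlainData({ verify: () => true }))                   // false
```

Under this view, something like a verify function would live in a separate library that takes the decoded value as an argument, rather than being attached to it.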

oed commented 4 years ago

Ok, I think that makes sense. Will have to think a bit more about how to design my codec with this in mind.

> Codecs are meant to decode binary into “IPLD Data Model” and vice versa. Any additional behavior like functions or additional mutations need to live in separate libraries.

Curious how you think about decryption in this context. Could that be handled by the encode/decode functions in some way?

mikeal commented 4 years ago

> Curious how you think about decryption in this context. Could that be handled by the encode/decode functions in some way?

These shouldn’t be handled at the codec layer, they should be a layer above, but there are ways to make that easier the more aware you are of how interactions with blocks will work.

For instance, the decryption function should take decoded data rather than encoded data. That way, decryption for both dag-jose and dag-cose would look similar.

const getBlock = require('./my-storage-layer')
const decrypt = require('dag-cose/decrypt')

// Fetch a block, decode it, and decrypt only when the codec carries encrypted data
const getData = async someCID => {
  const block = await getBlock(someCID)
  let decoded = block.decode()
  if (block.codec === 'dag-cose' || block.codec === 'dag-jose') {
    decoded = await decrypt(decoded)
  }
  return decoded
}

oed commented 4 years ago

Cool, right now I have a hard time understanding how this bubbles up to the ipfs api. Your example is really clear for this layer though!

mikeal commented 4 years ago

IPFS has a lot of APIs so we should be explicit about which ones can and cannot easily leverage this.

The IPFS DAG API will be able to leverage this pretty easily. But, replication and pinning will be problematic because sharing and use of the decryption key would be necessary in order to parse the graph.

IPFS’s File APIs, as they are now, are going to have a hard time leveraging any kind of encryption support. If you look at the folks who have done encryption to date (peergos, textile, etc) they’ve done it at the IPLD layer and then re-built the File APIs on top.

Looking into the future though, the UnixFSv2 migration could introduce some much nicer/easier patterns that integrate dag-cose, and IPFS could natively support encryption in a much more seamless way.

mikeal commented 4 years ago

Just finished the legacy() implementation for the new multiformats interface. https://github.com/mikeal/js-multiformats/commit/27de151bb81d1ed22e5495b24fdf66c754054b23

I’ve gotta write some docs before it’ll be published, but figured y’all might want to take a look.

oed commented 4 years ago

Thanks for breaking the API down @mikeal, very helpful! The use of the DAG API is exactly what I'm looking for. Then the key could be shared with peers that should be able to access the data. The point about the File APIs totally makes sense. UnixFSv2 sounds exciting!

Legacy function makes sense 👍

oed commented 4 years ago

Would love to get your feedback on my initial take on the dag-jose codec @mikeal. You can find the code here: https://github.com/oed/js-ipld-dag-jose/blob/master/src/index.ts

Some notes:

Right now it only supports signed objects.

There are a few extra functions for creating, verifying, and decrypting JOSENodes.

A JOSENode has a few extra properties starting with _dagJOSE***. This allows us to keep the payload of the JOSE object at the root of the node.

An alternative to this would be to put the payload in a payload property, but this would force users to use <CID>/payload/path/ to navigate which might be undesirable. Still not sure what the best solution is for this.

Right now these extra properties are enumerable, but they maybe shouldn't be?


When can I start using the legacy wrapper that you created to play around with this new codec in ipld/ipfs btw?

lidel commented 3 years ago

Closing due to age and inactivity. In the future, try https://discuss.ipfs.io, which is a better place for asking questions (keeping GitHub limited to actionable issues/bug reports).