Open adraffy opened 3 months ago
I think this seems reasonable, though novel. I'm not so sure about introducing a new tag, data, for this though. Would namespace be OK for that as well? Even that doesn't map super cleanly onto what you're doing here.
Do you think you'll want more of these in the future? I wonder if we can't figure out a better tag, or whether this should just be an entirely new classification.
@vmx, what do you think?
I wonder if URI could use a Multiaddress instead. Would that be an option? (I know too little about the Eth/ENS ecosystem.)
namespace works. I'd be happy to change it to whatever you suggest.
IMO, the closest codec is json, which oddly uses tag:ipld.
I picked tag:data as, unlike most codecs, data-uri is both a codec and the data itself.
I think tag:multiaddr for uri suggests too much internal encoding, as we want something maximally general (a literal UTF-8 string) where the content is ultimately validated by the client (since URL standards are ever-evolving).
Keeping it simple makes sense.
IIUC this is related to https://discuss.ens.domains/t/draft-ensip-17-datauri-format-in-contenthash/18048/28 and https://github.com/ensdomains/docs/pull/165.
Apologies for the long text, I'm going to be OOO for a couple days and wanted to make sure to leave some context. cc @lidel who has been involved in the ENS work and interop here since long before me 😅.
TLDR:
Some thoughts:
I wonder if URI could use a Multiaddress instead
Probably not multiaddress itself, but harmonization with something like multipath https://github.com/multiformats/multiformats/pull/55 would likely make this work and be pretty sensible. It would likely also let us use the 0x2f as an escape hatch for people generally wanting to use/experiment with strings rather than code numbers, which is roughly what this does (otherwise, codes like the one for http could potentially be used instead).
FWIW libp2p has recently proposed going the other way as well (i.e. representing multiaddrs as URIs https://github.com/multiformats/multiaddr/pull/171).
I don't in principle have an objection to a URI-based namespace. The two-byte range is probably fine, although URIs could probably tolerate even three bytes given the size of the data.
Perhaps more of an ENS-related comment, but want to call out:
Seems fine, although maybe the three byte range (along with arweave, skynet, etc.) makes more sense here given these will likely be larger anyhow.
A few comments / thoughts:
contenthash records that are IPFS-based this seems like something we could/should fix or the hack within ENS (whether in ENS or the "contenthash" namespace could fix either)
IMO, the closest codec is json which oddly uses tag:ipld.
everything is IPLD 😄
🙏 everyone, I'm one of the authors of the data:uri ENSIP draft proposal, https://discuss.ens.domains/t/draft-ensip-17-datauri-format-in-contenthash/18048, which uses a simple namespace hex("data:") format.
We did our homework before sending the draft to the ENS forum, asking for an exception for the hex("data:") prefix for the reasons below.
a) MIME/content-type support in CIDv1 has been pending for a long time (wen CIDv2?)
https://github.com/multiformats/multicodec/pull/159 https://github.com/multiformats/multicodec/issues/4
b) ENS already supports the string(data:uri) format in avatar records, so a contenthash with plaintext bytes(data:uri) under the hex("data:") namespace is fully RFC 2397 compliant and won't collide with CIDv1 namespaces. https://datatracker.ietf.org/doc/html/rfc2397
if (contenthash.startsWith("e301")) {
  // ipfs
} else if (contenthash.startsWith("e501")) {
  // ipns
}
// else... other contenthash namespaces...
else if (contenthash.startsWith(hex("data:"))) {
  // datauri; hex("data:") is 646174613a
}
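The prefix check above can be sketched as a small runnable helper. This is only an illustration: `classifyContenthash` and `DATA_PREFIX_HEX` are hypothetical names, and the literal `646174613a` is just the UTF-8 bytes of "data:" written as hex, standing in for the `hex("data:")` call.

```javascript
// Hypothetical helper: classify a contenthash hex string by its leading bytes.
const DATA_PREFIX_HEX = "646174613a"; // UTF-8 bytes of "data:" as hex

function classifyContenthash(contenthash) {
  // Accept both "0x..." and bare hex, in any letter case.
  const hex = contenthash.toLowerCase().replace(/^0x/, "");
  if (hex.startsWith("e301")) return "ipfs";
  if (hex.startsWith("e501")) return "ipns";
  if (hex.startsWith(DATA_PREFIX_HEX)) return "datauri";
  return "unknown"; // other contenthash namespaces would be handled here
}

console.log(classifyContenthash("0xe301015500143c68313e48656c6c6f20576f726c643c2f68313e"));
console.log(classifyContenthash("646174613a746578742f68746d6c2c3c68746d6c3e68656c6c6f3c2f68746d6c3e"));
```

Returning "unknown" instead of throwing is a design choice of this sketch, so a client could fall through to other handlers.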
ENS is not ready for such changes with new ENSIP specs: all contenthash records MUST follow the namespace+CIDv1 format. So we're back to square one, using raw data in CIDv1 with the IPFS namespace.
Our current working spec for an on-chain raw IPFS+CIDv1 generator without content/MIME types:
import { encode, decode } from "@ensdomains/content-hash";
import { CID } from 'multiformats/cid'
import { identity } from 'multiformats/hashes/identity'
//import * as cbor from '@ipld/dag-cbor'
import * as json from 'multiformats/codecs/json'
import * as raw from 'multiformats/codecs/raw'
const utf8 = new TextEncoder()
const json_data = {"hello":"world"}
const json_cid = CID.create(1, json.code, identity.digest(json.encode(json_data)))
JSON/cidv1 >> 01800400117b2268656c6c6f223a22776f726c64227d
https://ipfs.io/ipfs/bagaaiaarpmrgqzlmnrxseorco5xxe3deej6q
ENS contenthash with IPFS namespace: 0xe30101800400117b2268656c6c6f223a22776f726c64227d
eth.limo tests:
https://e3010180040011.7b2268656c6c6f223a22776f726c64227d.ipfs2.eth.limo
https://bagaaiaarpmrgqzlmnrxseorco5xxe3deej6q.ipfs2.eth.limo/
const html_data = "<h1>Hello World</h1>";
const html_cid = CID.create(1, raw.code, identity.digest(utf8.encode(html_data)))
HTML/cidv1 >> 015500143c68313e48656c6c6f20576f726c643c2f68313e
https://ipfs.io/ipfs/bafkqafb4nayt4sdfnrwg6icxn5zgyzb4f5udcpq
ENS contenthash with IPFS namespace: 0xe301015500143c68313e48656c6c6f20576f726c643c2f68313e
eth.limo tests:
https://e30101550014.3c68313e48656c6c6f20576f726c643c2f68313e.ipfs2.eth.limo/
https://bafkqafb4nayt4sdfnrwg6icxn5zgyzb4f5udcpq.ipfs2.eth.limo/
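To make the byte layout of those hex strings easier to follow, here is a dependency-free sketch that rebuilds them by hand instead of going through the multiformats library. The helper names (`varint`, `toHex`, `identityCidHex`) are made up for this illustration. The codec numbers are the multicodec table values for json (0x0200), raw (0x55), identity (0x00), and the ipfs namespace (0xe3).

```javascript
const utf8Encoder = new TextEncoder();

// Unsigned varint (7 bits per byte, high bit = "more bytes follow").
function varint(n) {
  const out = [];
  while (n >= 0x80) {
    out.push((n & 0x7f) | 0x80);
    n >>>= 7;
  }
  out.push(n);
  return out;
}

const toHex = (bytes) =>
  Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");

// CIDv1 with an identity "hash": <version><codec><0x00><length><payload>
function identityCidHex(codec, payload) {
  return toHex([
    ...varint(1),
    ...varint(codec),
    ...varint(0x00),
    ...varint(payload.length),
    ...payload,
  ]);
}

const jsonCidHex = identityCidHex(0x0200, utf8Encoder.encode('{"hello":"world"}'));
const htmlCidHex = identityCidHex(0x55, utf8Encoder.encode("<h1>Hello World</h1>"));

// ENS contenthash = varint(0xe3) for the ipfs namespace, then the CID bytes.
const ensJsonContenthash = "0x" + toHex(varint(0xe3)) + jsonCidHex;
const ensHtmlContenthash = "0x" + toHex(varint(0xe3)) + htmlCidHex;

console.log(jsonCidHex, htmlCidHex, ensJsonContenthash, ensHtmlContenthash);
```

The two-byte `8004` in the JSON CID is just the varint form of 0x0200, and `e301` is the varint form of 0xe3.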
This all works OK using json/raw data. The only downside: there's no content type in CIDv1, so we have to parse/guess magic bytes in the raw data on the client side OR ask IPFS gateways to resolve that.
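As a rough idea of what "guessing magic bytes" on the client could look like, here is a minimal sketch. `guessContentType` is a hypothetical helper, the signature list is deliberately tiny, and the text heuristics are crude (anything starting with `<` is treated as HTML), so treat it as an illustration and not a real sniffer.

```javascript
// Hypothetical client-side sniffer; expects a Uint8Array of raw content.
function guessContentType(bytes) {
  const hasPrefix = (sig) => sig.every((b, i) => bytes[i] === b);

  // A few well-known binary signatures.
  if (hasPrefix([0x89, 0x50, 0x4e, 0x47])) return "image/png";
  if (hasPrefix([0xff, 0xd8, 0xff])) return "image/jpeg";
  if (hasPrefix([0x47, 0x49, 0x46, 0x38])) return "image/gif"; // "GIF8"
  if (hasPrefix([0x25, 0x50, 0x44, 0x46])) return "application/pdf"; // "%PDF"

  // Crude text heuristics on the first few characters.
  const head = new TextDecoder().decode(bytes.subarray(0, 32)).trimStart();
  if (head.startsWith("<")) return "text/html";
  if (head.startsWith("{") || head.startsWith("[")) return "application/json";

  return "application/octet-stream";
}

const sniffEncoder = new TextEncoder();
console.log(guessContentType(sniffEncoder.encode("<h1>Hello World</h1>")));
console.log(guessContentType(sniffEncoder.encode('{"hello":"world"}')));
```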
We can even use dag-cbor to link multiple files/IPFS CIDs, but public IPFS gateways support neither an index file nor ipfs __redirect for this, so we have to happily decode that on our "smart" clients for now.
// needs: import * as cbor from '@ipld/dag-cbor' (commented out above)
const blog = CID.parse("bafybeidnycldkehcy6xixzqg72vad6pitav4lk5np3ev6tr6titlkvfpvi")
const link = {
  json: json_cid,
  "/": html_cid,
  "index.html": html_cid,
  blog: blog
}
const cbor_link = CID.create(1, cbor.code, identity.digest(cbor.encode(link)))
Back to @adraffy's f3 namespace, I'd suggest this format:
const data_uri = "data:text/html,<html>hello</html>";
const data_cid = CID.create(1, raw.code, identity.digest(utf8.encode(data_uri)))
01 - 55 - 00 - 21 - 646174613a746578742f68746d6c2c3c68746d6c3e68656c6c6f3c2f68746d6c3e
v1 - codec/raw - hash/none - varint.encode(datauri.length) - utf8 datauri
https://ipfs.io/ipfs/bafkqailemf2gcotumv4hil3iorwwylb4nb2g23b6nbswy3dphqxwq5dnnq7a
ENS contenthash with data-uri "f3" namespace: 0xf30101550021646174613a746578742f68746d6c2c3c68746d6c3e68656c6c6f3c2f68746d6c3e
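Going the other way, a client could unpack that contenthash field by field. The sketch below assumes the layout suggested here (varint namespace 0xf3, then CIDv1, raw codec, identity hash, varint length, UTF-8 data URI). The 0xf3 code is only the proposal under discussion, and all function names are hypothetical.

```javascript
function hexToBytes(hex) {
  const clean = hex.replace(/^0x/, "");
  const bytes = new Uint8Array(clean.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(clean.slice(2 * i, 2 * i + 2), 16);
  }
  return bytes;
}

// Reads one unsigned varint at `offset`; returns [value, nextOffset].
function readVarint(bytes, offset) {
  let value = 0;
  let shift = 0;
  while (true) {
    if (offset >= bytes.length) throw new Error("truncated varint");
    const b = bytes[offset++];
    value += (b & 0x7f) * 2 ** shift;
    if ((b & 0x80) === 0) return [value, offset];
    shift += 7;
  }
}

function decodeDataUriContenthash(contenthash) {
  const bytes = hexToBytes(contenthash);
  let pos = 0;
  let namespace, version, codec, hashFn, length;
  [namespace, pos] = readVarint(bytes, pos);
  [version, pos] = readVarint(bytes, pos);
  [codec, pos] = readVarint(bytes, pos);
  [hashFn, pos] = readVarint(bytes, pos);
  [length, pos] = readVarint(bytes, pos);

  if (namespace !== 0xf3) throw new Error("not the proposed f3 namespace");
  if (version !== 1 || codec !== 0x55 || hashFn !== 0x00) {
    throw new Error("expected CIDv1 + raw codec + identity hash");
  }
  if (pos + length !== bytes.length) throw new Error("length mismatch");

  const dataUri = new TextDecoder().decode(bytes.subarray(pos));
  return { namespace, version, codec, hashFn, length, dataUri };
}

console.log(
  decodeDataUriContenthash(
    "0xf30101550021646174613a746578742f68746d6c2c3c68746d6c3e68656c6c6f3c2f68746d6c3e"
  ).dataUri
);
```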
@aschmahmann and @0xc0de4c0ffee thanks for the feedback.
As for codec numbers, I'd be happy with any assignment. I initially picked lower numbers since these two codecs seem useful beyond ENS.
Yes, you could put both ipfs://... and data:... into uri; however, there is a difference w/r/t how they are handled and interpreted. These details were not included as they are ENS application-specific, but possibly the codec names should reflect that, e.g. Redirect URI.
From the ENS + web content perspective:
- ipfs is that the content is on IPFS and the server would know how to decipher the CID and serve directory-like dags from a single root hash, using whatever IPFS gateway (likely their own node) to fetch the content
- url is that the server would blindly HTTP 307 with no processing
  - the original URL (https://raffy.eth.limo/) would disappear
  - ipfs: would fail without a specific handler for that scheme
  - https://ipfs.io/ipfs/... would work but force an explicit gateway
  - itms-apps:, spotify:, etc.
- data-uri is that the server would serve the content as a static file
  - text/html with an embedded <script> can parse the window.location
  - application/pdf with #page=7 can jump
You are correct about the base64 overhead concern, but there are also URL length limits (vs body)
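The difference between the url and data-uri handling described above could be sketched roughly like this. It is not how any actual gateway is implemented. `planResponse` and `parseDataUrl` are hypothetical names, the ipfs case is left out because it needs a real IPFS node, and the non-base64 branch assumes percent-encoded UTF-8 text.

```javascript
// Minimal RFC 2397 split: data:[<mediatype>][;base64],<data>
function parseDataUrl(dataUrl) {
  if (!dataUrl.startsWith("data:")) throw new Error("not a data: URL");
  const comma = dataUrl.indexOf(",");
  if (comma < 0) throw new Error("data: URL has no comma");

  let mediaType = dataUrl.slice(5, comma);
  const payload = dataUrl.slice(comma + 1);
  const isBase64 = mediaType.toLowerCase().endsWith(";base64");
  if (isBase64) mediaType = mediaType.slice(0, -";base64".length);

  const body = isBase64
    ? Uint8Array.from(atob(payload), (c) => c.charCodeAt(0))
    : new TextEncoder().encode(decodeURIComponent(payload)); // assumes UTF-8 text

  // RFC 2397 default when the media type is omitted.
  return { mediaType: mediaType || "text/plain;charset=US-ASCII", body };
}

// What a gateway might answer for each record kind.
function planResponse(kind, value) {
  switch (kind) {
    case "url":
      // Blind redirect, no processing of the target.
      return { status: 307, headers: { Location: value } };
    case "data-uri": {
      // Served as a static file with the embedded media type.
      const { mediaType, body } = parseDataUrl(value);
      return { status: 200, headers: { "Content-Type": mediaType }, body };
    }
    default:
      throw new Error(`unsupported kind in this sketch: ${kind}`);
  }
}

console.log(planResponse("url", "https://ens.domains/"));
console.log(planResponse("data-uri", "data:text/html,<html>hello</html>").headers);
```

Because the data-uri case is served in place, the page stays on the original origin, which is what lets an embedded script read `window.location`.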
Coffee, I put your response on the ENS forum.
ENS (Ethereum Name Service) encodes contenthash() using multicodec. The purpose of a contenthash() is to describe the web contents for a corresponding ENS name.
Currently, ENS supports IPFS, IPNS, Swarm, Arweave, Onion, etc.
Example using IPFS:
- vitalik.eth
- https://vitalik.eth.limo/
- contenthash() = 0xe30101701220484da2f7f497cac307e2026282263630b8dd4c448c3436470f5b850b432ba868
- ipfs://k2jmtxt5zh5vu5y8r7em2che3d4ghyftfr6h1yofdhibxai88k1wj5uw
- 0xE3 corresponds to multicodec ipfs (varint-encoded as e301); the following bytes are a CIDv1 (01 = version, 70 = dag-pb, 1220 = sha2-256 with a 32-byte digest)
We would like to support the following (2) new codecs:
- 0xF2: URI
  - 0xf268747470733a2f2f656e732e646f6d61696e732f
  - <codec><uri: utf8-string>
  - https://ens.domains/
- 0xF3: Data URL
  - 0xf309746578742f68746d6c3c68746d6c3e68656c6c6f3c2f68746d6c3e
  - <codec><len(mime): uint8><mime: ascii-string><data: uint8[]>
  - mime = text/html (9ch)
  - data = <html>hello</html> (encoding depends on mime)
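For illustration, the two proposed layouts can be reproduced with a few lines of JavaScript. This sketch follows the examples literally, including the codec written as a single byte (f2 / f3); whether a final assignment would use a varint-encoded prefix instead is not something this sketch settles. `encodeUriRecord` and `encodeDataUrlRecord` are hypothetical names.

```javascript
const recordEncoder = new TextEncoder();

const bytesToHex = (bytes) =>
  Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");

// <codec><uri: utf8-string>
function encodeUriRecord(uri) {
  return "0x" + bytesToHex([0xf2, ...recordEncoder.encode(uri)]);
}

// <codec><len(mime): uint8><mime: ascii-string><data: uint8[]>
function encodeDataUrlRecord(mime, data) {
  if (!/^[\x20-\x7e]*$/.test(mime)) throw new Error("mime must be printable ASCII");
  const mimeBytes = recordEncoder.encode(mime);
  if (mimeBytes.length > 255) throw new RangeError("mime length must fit in a uint8");
  return "0x" + bytesToHex([0xf3, mimeBytes.length, ...mimeBytes, ...data]);
}

console.log(encodeUriRecord("https://ens.domains/"));
console.log(encodeDataUrlRecord("text/html", recordEncoder.encode("<html>hello</html>")));
```

The uint8 length prefix caps the MIME string at 255 bytes, which is why the sketch rejects anything longer.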