Closed Stebalien closed 6 years ago
Note: this also looks like a nice place to plugin a real namespaced resolver system (yay).
The resolver seems to assume that paths always resolve to CIDs (can't end up in the middle of an object).
Related to: https://github.com/ipld/specs/issues/3
Somewhat related https://github.com/ipfs/go-ipfs/pull/4643#discussion_r165768700
I haven't touched paths in CoreAPI yet, I'll try to make a draft PR tomorrow, addressing these and some other points.
Awesome!
Response to TODOs in #4672 but contains a lot if discussion on future work so I figured I'd put it here.
Figure out default namespaces
Getting this right is really important. I can see a few solutions but we need to discuss this thoroughly.
That is, we could introduce a new /cid/
namespace that defines no path resolution algorithm. /cid/QmId
would be valid but /cid/QmId/path
would not. I think this is fine as I don't think QmId/path
(no namespace) is permitted.
Not sure if I'm a fan of this...
Again, given that we're not doing any path resolution, the namespace doesn't really matter (that much). IPLD is the most general purpose as all valid UnixFS objects are IPLD objects.
This would mean that the namespace is only concerned with path resolution and shouldn't be used to interpret the target value.
/ipld/QmId/ipld/path/to/thing
. ipfs cat /ipld/QmId/path/to/thing
to return a CAR of the target dag. However, I'm fine with this not working./ipld/QmId
to the user when they expect /ipfs/QmId
(nasty).We could make the default namespace an option to ParsePath
, yielding an error if no default namespace was specified and ParsePath
was passed a raw "thing".
Alternatively, we could combine the IPLD namespace solution and this solution. If no default is passed, we default to IPLD.
We could even make this option a function with the signature func(string) Path
. That would allow maximum flexibility (and allow the function to, possibly, spit out a warning).
Personally, I'd vote for this option (with the fallback to IPLD) as it has all the advantages of the IPLD option but is slightly more flexible.
We may just want to set a "resolved" bit somewhere and use that to determine if the path has been resolved. We also need to address https://github.com/ipld/specs/issues/3
We may, eventually, have other mutable namespaces (maybe?). I'd like to just provide a method Mutable()
. Internally everything may just return false except IPNS but I don't think it'll cost us much and may make extending things easier in the future.
I'd really just like a global path resolution registry that maps namespaces to resolvers. But yeah, this can come later.
Ideally, I'd like to introduce the concept of "mounting" all of our namespaces inside MFS (although I haven't thought through all of the ramifications). Then, we'd just hand the path off to MFS and let it do the lookup. Although, we could also just have MFS "mount" all namespaces (i.e., mount the resolver).
(If I could break things, I would introduce /local/{ipfs,ipld}/
namespaces (or maybe just /local
) to scope off all non-globally addressable stuff but... oh well.)
So... Let the debate begin!
Having the command choose is something that should work fine - we can still give the user an option to choose which resolver to use if we add another Resolve[..]
function, eliminating the need for /ipns/ipld
namespace:
ParsePath(string)
just parses a string to a generic pathResolveFile(Path)
resolves a path using unixfs resolverResolveData(Path)
resolves a path using ipld resolverParse(unixfspath)
-> ResolveFile
-> AppendPath(Path, strpath)
-> ResolveData
Pros:
A con here is that paths can represent 2 different things depending on the resolver/command used.
/ipld
namespace without support for ipns(edit: cleaned up some wordings)
ResolveFile(Path)
resolves a path using unixfs resolver.
Having two separate functions, one for IPFS and one for IPLD, both of them taking a path seems a bit odd. Won't the path have the namespace so shouldn't the choice be unambiguous at this point? Also, I'd like the resolver to be fully pluggable. Ideally, the top-level resolver wouldn't understand any of our pathing schemes, just the concept of pathing/namespaces (and maybe CIDs because they'll be baked into most of our pathing schemes).
eliminating the need for
/ipns
namespace.
I must be missing something here. We do not want to get rid of the /ipns
namespace.
Don't need more namespaces.
I agree that more namespaces isn't the way to go. I mentioned the /cid
namespace more for completeness. IMO, that's the "technically correct" approach but not really a usable approach in practice.
No hacks needed
I don't consider any of the solutions I provided to be hacks (well, the /cid
namespace one is a bit funky but it's still consistent and well defined).
I see this as a less of as issue than having separate
/ipld
namespace which support with ipns
I'm not sure if I understand your point here. We plan on having a /ipld
namespace regardless.
However, I guess I never really explained my concern with regard to IPNS... The issue with IPNS is that, given ParsePath("QmId...")
, it's impossible to determine if QmId
is a peer ID (IPNS key) or a v0 CID. All of my proposed schemes support resolving IPNS, they just (except for the last one) run into issues when asked to resolve QmId...
(due to the lack of a namespace).
eliminating the need for /ipns namespace.
I must be missing something here. We do not want to get rid of the /ipns namespace.
Oh, I meant /ipld
, sorry for the confusion, typo..
The point is was that with /ipld
we can't easily resolve IPNS names using IPLD resolver.
So I dug into IPNS code to find that it works on path level, not on CIDs, this changes my perspective a bit.
However, I guess I never really explained my concern with regard to IPNS... The issue with IPNS is that, given ParsePath("QmId..."), it's impossible to determine if QmId is a peer ID (IPNS key) or a v0 CID. All of my proposed schemes support resolving IPNS, they just (except for the last one) run into issues when asked to resolve QmId... (due to the lack of a namespace).
The current default is to use /ipfs
namespace by default. I think this is fine as long as this is documented. This is the behaviour of most of the commands (except ipfs name resolve
..).
With the ability to publish /ipld
paths to IPNS I think it's fine to go that way. To address some concerns from the comment above:
Could end up leaking /ipld/QmId to the user when they expect /ipfs/QmId (nasty).
The only API returning /ipld
prefix would be the Dag API. If we keep the boundary between (linked-)data and files clearly defined, this shouldn't become a problem.
Con: Doesn't handle IPNS.
Well, IPNS can handle /ipld
paths (it will need some changes to make it work, but it can on the protocol level). Only problem I can see here is that /ipns
paths can get resolved in 2 different ways depending on what they resolve to, though I can't think of any example use-case that would be affected by that.
So... I omitted a ton of context in that proposal.
The current default is to use /ipfs namespace by default. I think this is fine as long as this is documented.
This becomes a problem with all the dag
and block
commands that usually operate over IPLD dags. This is actually a really nasty issue that has bitten us multiple times (mostly because we haven't yet implemented the /ipld
namespace).
However, as the "IPLD namespace" proposal pointed out, we never have paths of the form QmId/path
, only /namespace/QmId/path
and QmId
(no path). Therefore, we don't care about the path resolution algorithm except where IPNS
is concerned (because, in this case, the QmId
is not a CID). Therefore, the choice of IPLD/IPFS isn't really important when unspecified (as long as we're not dealing with IPNS). That proposal suggested /ipld
because /ipld
is more general purpose. However, at the end of the day, it doesn't matter because. Internally, the resolver won't "walk the path" because there will be no path (so either way will work).
Could end up leaking /ipld/QmId to the user when they expect /ipfs/QmId (nasty).
The only API returning /ipld prefix would be the Dag API. If we keep the boundary between (linked-)data and files clearly defined, this shouldn't become a problem.
My concern, in this case, was that some error message might state /ipld/QmId
and confuse the user (even though the choice of /ipld
or /ipfs
doesn't really matter in this case). That is, in the "no path so we don't care the which path resolution algorithm we use" case, if we choose /ipld
when the user expects /ipfs
, some error message may tell the user that we tried to resolve /ipld/QmId
and confuse them.
Note: Simply choosing /ipfs/QmId
doesn't actually fix this as we'll have the reverse problems for commands like ipfs dag
etc...
Con: Doesn't handle IPNS.
Well, IPNS can handle /ipld paths (it will need some changes to make it work, but it can on the protocol level).
All my comments about "doesn't handle IPNS" were referring to the fact that we can't determine id QmId...
is an IPNS key or a CID and the correct interpretation depends on the command. The two proposals with the "doesn't handle IPNS" caveat didn't provide a way to specify a default namespce (no way for the command to tell the resolver "if you're passed a path in the form QmId...
, use namespace X".
All my comments about "doesn't handle IPNS" were referring to the fact that we can't determine id
QmId...
is an IPNS key or a CID and the correct interpretation depends on the command. The two proposals with the "doesn't handle IPNS" caveat didn't provide a way to specify a default namespce (no way for the command to tell the resolver "if you're passed a path in the formQmId...
, use namespace X".
Currently you can't create a path without a prefix using CoreApi [1].
name.Resolve
, which is afaik the only function which by-default prefixes things with /ipns/
, currently is passed a string instead of a Path (on a second thought I'm not sure if it was a right decision).
You can't pass unprefixed path to any coreapi function. When porting the commands to use coreapi, the default prefix will have to be attached in the command layer. If we enforce this, then the QmId...
shouldn't be a problem.
Currently you can't create a path without a prefix using CoreApi [1].
But you can call ParsePath("QmId...")
(no prefix, just a string).
But you can call ParsePath("QmId...") (no prefix, just a string).
Yes, the current default is to parse this as a CID and prefix with /ipfs/
. We can make it return errors for unprefixed strings. This would give us the flexibility to, in future, interpret the paths in context of MFS without breaking the compatibility too badly.
To push this somewhere:
This is how I see the /ipld
variant:
/ipfs
/ipld
paths/ipns
paths/ipfs
pathsBy explicitly disallowing unprefixed paths we don't have the ParsePath("QmId...") ambiguity. As this is an CoreAPI-only thing, we can still maintain current command behaviour quite easily.
Problems: MFS is weird in this case. We can either - add /mfs
namespace (imo ugly) - or - pass string paths to files
api (even uglier?)
But - what if we treated all unprefixed paths as MFS paths? In this case MFS would provide a resolver. We still can have /ipld
paths.
Downsides:
/
and if it's not, prepend some default. This is a bit hacky so if we decide to do this, it should be there for a limited time (we should deprecate unprefixed paths in commands as-well)files
api, at least not anytime soonParseCid prefixes paths with
/ipfs
I'm not really comfortable with this as it implies that IPFS is somehow the "correct" namespace. Maybe rename the function to IpfsPath(cid)
or NewPath(ns string, cid *cid.Cid)
?
But - what if we treated all unprefixed paths as MFS paths? In this case MFS would provide a resolver. We still can have /ipld paths.
I'd almost just go with the "pass strings" option. For MFS, paths really are just strings. Internally, the files API would parse the string into a path and do any necessary resolution.
Personally, I'd have liked to have had a separate /local
prefix for mfs but...
So I did a bit of poking in https://github.com/ipfs/go-ipfs/pull/4672 locally. I think that /local
for non-global (mfs) stuff might actually be fine. We aren't breaking anything since coreapi doesn't have files api yet, and when in future the files
commands will be rewritten to copeapi we can wrap them to use the /local
namespace.
I think I'll go with this and see how it turns out. Also, note that if we would want to use mfs as the global namespace, we should reserve /ipld
and /local
namespaces asap (it would be a breaking change)
coreapi.ParsePath
doesn't actually extract the root.