ipfs / specs

Technical specifications for the IPFS protocol stack
https://specs.ipfs.tech

IPIP-0445: Option to Skip Raw Blocks in Gateway Responses #445

Open Jorropo opened 11 months ago

Jorropo commented 11 months ago

For #444

rvagg commented 11 months ago

I can see the rationale for this; although it's quite similar to dag-scope=entity, so I wouldn't mind a treatment in the documentation why entity can't properly cover the desired use case, and if it can't, why dag-scope=skip-leaves (or similar) wouldn't be a better approach?

I think we should have a conversation about evolution of trustless now that it's deployed and being actively used in a number of scopes. We have multiple implementations of trustless now, and this adds an extension that is not necessarily in the scope or interests of classes of users or servers, or even if it is, implementation may be delayed or deferred for various reasons. It might be helpful to have some method of feature sniffing built into the spec, perhaps with an OPTIONS pattern so that clients that know they are using newer, less widely deployed, features of trustless, can first sniff before getting a rejection. Alternatively, we could just do a reject with 415 and send a pre-defined header, such as X-Supported-Content-Types, that lists variants of Accept that won't error. If that were in the spec it would be easy to implement; I'd probably take existing Accept parsing that's lax and make it strict, because currently the 2 trustless server implementations I'm responsible for would ignore this new parameter and give you the whole thing; I wouldn't mind tightening that up but we need a better signalling mechanism.
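A minimal sketch of the reject-and-advertise idea from the comment above, written from the client's side. It assumes a server that answers 415 and lists acceptable Accept variants, comma-separated, in the proposed X-Supported-Content-Types header. That header is only a suggestion in this thread and is not part of any spec, and the skip-leaves Accept parameter used in the usage note is likewise hypothetical.

```python
from typing import Dict, List, Optional


def normalize(media_type: str) -> str:
    """Lower-case a media type and sort its parameters so variants compare equal."""
    parts = [p.strip().lower() for p in media_type.split(";")]
    head = parts[0]
    params = sorted(p.replace(" ", "") for p in parts[1:] if p)
    return "; ".join([head, *params])


def parse_supported(header_value: str) -> List[str]:
    """Split the (hypothetical) header into normalized media types.

    Quoted commas inside parameter values are not handled in this sketch.
    """
    return [normalize(item) for item in header_value.split(",") if item.strip()]


def pick_fallback(status: int, headers: Dict[str, str],
                  preferences: List[str]) -> Optional[str]:
    """Return the first preferred Accept value the server says it supports.

    Returns None when the response is not a 415 or nothing matches.
    """
    if status != 415:
        return None
    lowered = {k.lower(): v for k, v in headers.items()}
    supported = set(parse_supported(lowered.get("x-supported-content-types", "")))
    for candidate in preferences:
        if normalize(candidate) in supported:
            return candidate
    return None
```

A client would list its Accept values from most to least specific, for example "application/vnd.ipld.car; skip-leaves=y" first and plain "application/vnd.ipld.car" second, and retry with whatever pick_fallback returns.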

Jorropo commented 11 months ago

> I can see the rationale for this; although it's quite similar to dag-scope=entity, so I wouldn't mind a treatment in the documentation why entity can't properly cover the desired use case, and if it can't, why dag-scope=skip-leaves (or similar) wouldn't be a better approach?

Because you need to be able to express a request that is both dag-scope=entity&entity-bytes=42:1337 and has skip-leaves set. entity-bytes implies dag-scope=entity, so you can't do dag-scope=skip-leaves&entity-bytes=42:1337.

For example, let's say I have a .webm file stored in deserialized form on a non-IPFS HTTP server, and I am implementing a client that downloads the proofs from a gateway and the leaves from the non-IPFS server. Even in this situation, features like seeking and ReadAt need to work if the user wants to seek in the video.
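As a rough illustration of the combined request described above, the sketch below builds a gateway URL with dag-scope=entity and entity-bytes, and carries the skip flag separately. Sending the flag as a skip-leaves=y parameter on the Accept header is an assumption made for this example; the thread has not settled on a name or a location for it. The gateway host and CID are placeholders.

```python
from urllib.parse import urlencode


def build_proof_request(gateway, cid, path="", byte_range=None, skip_leaves=True):
    """Return (url, headers) for a ranged CAR request that omits raw leaves."""
    query = {"dag-scope": "entity"}
    if byte_range is not None:
        start, end = byte_range
        # entity-bytes takes a from:to byte range within the entity.
        query["entity-bytes"] = f"{start}:{end}"
    # Keep ':' unescaped so the range reads the same as in the discussion.
    url = f"{gateway.rstrip('/')}/ipfs/{cid}{path}?{urlencode(query, safe=':')}"
    accept = "application/vnd.ipld.car"
    if skip_leaves:
        accept += "; skip-leaves=y"  # hypothetical parameter name and placement
    return url, {"Accept": accept}


url, headers = build_proof_request(
    "https://gateway.example", "bafyexample", byte_range=(42, 1337)
)
print(url)
print(headers["Accept"])
```

Keeping the skip flag independent of dag-scope is what lets it combine with entity-bytes, which is the point made in the reply.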

This is great feedback, I'll add that to the docs later, thx.


I agree that the multiplicity of options could become an issue and that we need a way to do signaling. I think #425 is a good place to have this discussion. IMO this IPIP is fine to do without #425 because it's generally useful, the implementation difficulty ranges from trivial to easy, and we are still at a point where the number of implementations is low. It is also not very hard for clients to work around a server that does not implement this feature by doing block-by-block requests. In my webseed use case this is less bad than it sounds, since the bulk of the traffic won't go through the gateway and thus won't be done block by block.
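The block-by-block workaround mentioned above could look roughly like this on the client. The sketch walks the DAG from the root, fetches only non-raw blocks, and never requests a CID whose codec is raw (0x55). The fetch_block and links_of callables are hypothetical and supplied by the caller, for example a single-block gateway request and a dag-pb link decoder. Only base32 CIDv1 strings are handled.

```python
import base64

RAW_CODEC = 0x55  # multicodec code for raw blocks


def read_varint(data, pos):
    """Decode an unsigned LEB128 varint, returning (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def cid_codec(cid):
    """Return the multicodec code of a base32-encoded CIDv1 string."""
    if not cid.startswith("b"):
        raise ValueError("only base32 CIDv1 strings are handled in this sketch")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # restore the padding multibase strips
    raw = base64.b32decode(body)
    version, pos = read_varint(raw, 0)
    if version != 1:
        raise ValueError("expected CIDv1")
    codec, _ = read_varint(raw, pos)
    return codec


def fetch_proof_blocks(root, fetch_block, links_of):
    """Fetch every non-raw block reachable from root, one request per block."""
    proofs = {}
    stack = [root]
    while stack:
        cid = stack.pop()
        if cid in proofs or cid_codec(cid) == RAW_CODEC:
            continue  # already fetched, or a raw leaf served from elsewhere
        block = fetch_block(cid)
        proofs[cid] = block
        stack.extend(links_of(block))
    return proofs
```

This costs one round trip per proof block, which is the overhead the reply accepts because the leaf data itself travels over a different channel.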

hannahhoward commented 11 months ago

FWIW, this webseed-like functionality would be really useful for the Saturn case.

If we can let the browser make a request for, say, an image in flat format, but then verify it with this sort of request, that would be a big win for us.

I do agree about adding new parameters with expectation to serve everywhere being a heavy lift. Love to get DAGHouse folks in here early -- cc: @alanshaw

I would push for something less dag-structure specific -- I'd name the parameter something like "proof-only" -- 'cause that's what you're asking for, the proof part of the dag, without the data. That makes it more applicable to both Blake3 and eventual non-UnixFS data.
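To make the flat-format-plus-proof idea above concrete, here is a small sketch of the final verification step. It assumes the client has already walked the proof blocks and extracted, in file order, the length and sha2-256 digest of each raw leaf. That (length, digest) layout is an assumption for illustration, not something defined in the thread. Because a raw block's bytes are the file bytes themselves, hashing each slice of the flat download is enough.

```python
import hashlib


def verify_flat_bytes(data, leaves):
    """Check flat file bytes against (length, sha256_digest) pairs in file order.

    Returns True only if every slice hashes to its expected digest and the
    leaves cover the data exactly, with no missing or trailing bytes.
    """
    offset = 0
    for length, digest in leaves:
        chunk = data[offset:offset + length]
        if len(chunk) != length or hashlib.sha256(chunk).digest() != digest:
            return False
        offset += length
    return offset == len(data)


# Example with two leaves standing in for the chunks of a small file.
leaves = [
    (5, hashlib.sha256(b"hello").digest()),
    (6, hashlib.sha256(b" world").digest()),
]
print(verify_flat_bytes(b"hello world", leaves))
print(verify_flat_bytes(b"hello w0rld", leaves))
```

The intermediate proof blocks still have to be checked against their own CIDs up to the root; this sketch covers only the leaf data that arrived outside the gateway response.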

hannahhoward commented 11 months ago

All that said, I'm fundamentally a yes on this IPIP, with the suggestion to change the parameter name to "proof-only" instead of "skip-leaves"

Jorropo commented 11 months ago

> All that said, I'm fundamentally a yes on this IPIP, with the suggestion to change the parameter name to "proof-only" instead of "skip-leaves"

Many formats store data in non-raw leaves, so "proof-only" would make this really specific to UnixFS with --raw-leaves enabled; the name would be wrong for a dag-cbor HAMT, for example.

What do people think about skip-raw instead? It's self-descriptive and correct.

kernelogic commented 10 months ago

Would like to see this implemented as well (in lassie). This can be useful to quickly inspect (and validate, for Fil+) the files in a sector without downloading the whole 32GB sector.

The current dag-scope=entity only covers the first level, so you currently have to drill down recursively to get more information, which involves a lot more requests.

hannahhoward commented 9 months ago

One other thing to consider, though probably not in this IPIP -- option to not traverse files while recursively following a UnixFS Directory tree.

rvagg commented 9 months ago

Well, that could be included here because it's quite similar: skip=none|files|leaves. Being able to ls -alR and only get the directory entries would be pretty neat.
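One possible reading of the skip=none|files|leaves idea, sketched as a server-side filter over a toy tree. The Node type, the kind labels, and the exact semantics of each mode are all assumptions made for illustration; nothing here is specified in the thread beyond the three mode names.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    kind: str  # "dir", "file" (non-raw file node), or "leaf" (raw data block)
    children: list = field(default_factory=list)


def blocks_to_send(node, skip="none"):
    """Yield the names of blocks a response would include under a skip mode."""
    if skip not in ("none", "files", "leaves"):
        raise ValueError(f"unknown skip mode: {skip}")
    if skip == "leaves" and node.kind == "leaf":
        return  # drop raw leaves, keep directory and file structure
    if skip == "files" and node.kind != "dir":
        return  # keep directories only, like a recursive listing
    yield node.name
    for child in node.children:
        yield from blocks_to_send(child, skip)


tree = Node("root", "dir", [
    Node("a.txt", "file", [Node("a.txt#0", "leaf")]),
    Node("sub", "dir", [Node("b.bin", "leaf")]),
])
for mode in ("none", "leaves", "files"):
    print(mode, list(blocks_to_send(tree, mode)))
```

Under this reading, skip=files returns just the directory blocks, which already carry the entry names needed for an ls -alR style listing.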