ipfs / specs

Technical specifications for the IPFS protocol stack
https://specs.ipfs.tech

Manifest files for unixfs loaded via HTTP Gateway #257

Open lidel opened 2 years ago

lidel commented 2 years ago

This is a placeholder issue for creating a spec for "content root's manifest files" which would be part of the data published on IPFS (eg. /ipfs/CID/_something or /ipfs/CID/.well-known/ipfs/foo) that provides hints to HTTP Gateway around things like redirects, content types and custom HTTP headers.

Related reading:

Possible reuse of:

Feel free to discuss/post ideas below, or open PRs with IPIP drafts that could be reviewed independently.

BigLep commented 2 years ago

Related are notes from the 2021-11-16 IPFS operators meeting: https://hackmd.io/MiYYUCR-RWqhAKfJ_B15wA

justindotpub commented 2 years ago

Hi @lidel. I started working at Fission this year and have recently been tasked with implementing redirects support in go-ipfs. @cbrake started working on this in this PR 🙏 and I plan on continuing the work in this PR.

If you don't mind, I'd like to make sure I understand the process here as far as specs and issue management go so I don't cause any annoying friction.

From the issues I've read through it appears that it was decided to split out redirects support from manifests support when Cloudflare announced Cloudflare Pages with _redirects support, similar to Netlify's _redirects support. This issue I'm commenting on appears to be specific to spec'ing out manifest support. Should redirects support get its own spec and therefore spec issue or were you thinking it would be included in the same spec?

Regardless of whether or not manifests and redirects are separate specs, my assumption is that the redirects portion of the spec will both inform and be informed by the go-ipfs redirects support tentative implementation. Does that sound right? Is there any preferred order to tackling this (spec vs implementation first)? I'm currently assuming I should start with essentially implementing what Cloudflare supports and see what feedback others provide, adjust for feedback, and then eventually work backward from there into text for the spec.

Assuming I'm on the right track, I was planning on submitting 1) an issue in go-ipfs to track this implementation work vs just my draft PR, and 2) possibly an issue in this specs repo to track the redirects portion of the spec, unless it should all be included in this spec issue.

Thank you in advance for any feedback and guidance. If you have any other tips on the code base or any plans you had for how the code should be reorged (I see you've been doing some refactoring recently) I'd love to hear them.

Thank you! 🙏

BigLep commented 2 years ago

Hi @justincjohnson. Thanks for reaching out here. @lidel is great at seeing GitHub notifications and processing them. I don't know how many others are though. If you don't get a response in the next day or so, feel free to start a thread in #ipfs-operators in FIL Slack or #ipfs-dev in IPFS Discord.

lidel commented 2 years ago

Hi @justincjohnson, really appreciate you reaching out!

I believe the tl;dr from the prior discussions is that we translated the requirements into two sibling features that will improve the devexp around unixfs website hosting using an IPFS gateway:

Feel free to work on both, but focusing on _redirects first is a good idea. I am happy to guide / help with reviews. :+1:

> the redirects portion of the spec will both inform and be informed by the go-ipfs redirects support tentative implementation. Does that sound right? Is there any preferred order to tackling this (spec vs implementation first)? I'm currently assuming I should start with essentially implementing what Cloudflare supports and see what feedback others provide, adjust for feedback, and then eventually work backward from there into text for the spec. [..] I was planning on submitting 1) an issue in go-ipfs to track this implementation work vs just my draft PR, and 2) possibly an issue in this specs repo to track the redirects portion of the spec, unless it should all be included in this spec issue.

Yes, this plan sounds good.

There are caveats around both headers and redirects that we will fully understand while writing sharness tests for them, so specs may change multiple times before we finish the mvp implementation. It is fine to keep discussion around go-ipfs PR and wait with specs until the dust settles.

> Should redirects support get its own spec and therefore spec issue or were you thinking it would be included in the same spec?

We have various gateway improvements in flight, so it is better to have separate specs for now. When you feel like writing the spec, just open a PR against ipfs/specs that adds it as a standalone file under gateways/REDIRECTS_FILE.md – this way we can get it done without being blocked on anything.

ps. I left some quick comments in https://github.com/ipfs/go-ipfs/pull/8816

justindotpub commented 2 years ago

Fabulous! Thanks for the guidance @lidel. I'll get started.

justindotpub commented 2 years ago

For the benefit of anyone coming here looking for information on redirects support, see https://github.com/ipfs/go-ipfs/pull/8890.
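For readers unfamiliar with the `_redirects` convention discussed above, here is a minimal sketch of how such a file could be parsed. It assumes the simplest Netlify/Cloudflare-style line format (`from to [status]`, with `#` comments and a default status of 301) and exact-path matching only. The names `RedirectRule`, `parse_redirects`, and `match_redirect` are hypothetical, and nothing here reflects what go-ipfs actually implements or what the spec will require.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RedirectRule:
    source: str  # path to match, e.g. "/old"
    target: str  # path or URL to redirect to
    status: int  # HTTP status code to respond with


def parse_redirects(text: str, default_status: int = 301) -> List[RedirectRule]:
    """Parse `from to [status]` lines; blank lines and `#` comments are skipped."""
    rules = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) not in (2, 3):
            raise ValueError(f"line {lineno}: expected 'from to [status]', got {raw!r}")
        # Assumption: a missing status falls back to a permanent redirect.
        status = int(fields[2]) if len(fields) == 3 else default_status
        rules.append(RedirectRule(fields[0], fields[1], status))
    return rules


def match_redirect(rules: List[RedirectRule], path: str) -> Optional[RedirectRule]:
    """Return the first rule whose source equals the path (first match wins)."""
    for rule in rules:
        if rule.source == path:
            return rule
    return None
```

Wildcards, placeholders, and other features that hosted platforms support are deliberately left out; see the linked PR for the behavior that was actually proposed.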

eth-limo commented 3 months ago

@lidel a _headers file would be fantastic. At eth.limo, we've had multiple users request the addition/removal/modification of the following (user defined) headers:

User defined headers

| Header | Rationale |
| --- | --- |
| content-security-policy | Useful not only for iframing protection but also for better dependency/resource security and XSS defense. Ideally dApps/dWebsites should have whitelisted domains and inline script hashes within the HTML bundle. Encouraging better version pinning could help defend against things like the Ledger JS library compromise from last year. |
| cross-origin-resource-policy | Best practices for cross-origin isolation. Additionally, advanced features such as SharedArrayBuffer require one or more cross-origin-* headers to be properly configured. See: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/SharedArrayBuffer |
| cross-origin-opener-policy | Same as cross-origin-resource-policy. |
| cross-origin-embedder-policy | Same as cross-origin-resource-policy. |
| permissions-policy | Should be customizable by the content publisher, depending on what features are required. |
| x-frame-options | Configurable depending upon the need for iframing. However, this can also be addressed via the content-security-policy frame-ancestors directive. |
| x-xss-protection | Legacy, but can provide a "belt and suspenders" approach for XSS protections in conjunction with a well-defined content-security-policy. At a minimum, only the 1; mode=block directive should be configurable due to potential vulnerabilities arising from botched filtering: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-XSS-Protection#vulnerabilities_caused_by_xss_filtering |
| referrer-policy | Useful for dApps/dWebsites interested in some privacy mitigations for public gateways. |

This would greatly improve the functionality of static dWebsites delivered locally, or through a gateway.
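As a rough illustration of what a `_headers` file could look like to a gateway, the sketch below parses the layout used by Netlify and Cloudflare Pages: an unindented path pattern followed by indented `Name: value` lines. This layout is an assumption, since no syntax has been agreed for IPFS gateways, and `parse_headers_file` is a hypothetical name.

```python
from typing import Dict


def parse_headers_file(text: str) -> Dict[str, Dict[str, str]]:
    """Map each path pattern to the headers listed (indented) beneath it."""
    rules: Dict[str, Dict[str, str]] = {}
    current = None
    for lineno, raw in enumerate(text.splitlines(), start=1):
        if not raw.strip() or raw.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        if raw[0] in " \t":
            # Indented line: a header belonging to the most recent path.
            if current is None:
                raise ValueError(f"line {lineno}: header before any path")
            name, sep, value = raw.strip().partition(":")
            if not sep:
                raise ValueError(f"line {lineno}: expected 'Name: value'")
            # Header names are case-insensitive, so store them lowercased.
            rules[current][name.strip().lower()] = value.strip()
        else:
            # Unindented line: starts a new path pattern.
            current = raw.strip()
            rules.setdefault(current, {})
    return rules
```

For example, a file containing `/*` followed by an indented `Referrer-Policy: no-referrer` line yields `{"/*": {"referrer-policy": "no-referrer"}}`. Pattern matching against request paths is not covered here.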

Gateway operator defined headers

There are many more, but this is a good starting point. These should not be configurable by the end user as they can globally impact the operation and quality of the service.

| Header | Rationale |
| --- | --- |
| strict-transport-security | As mentioned elsewhere, this is really designed for the gateway operator to configure. It has global ramifications and should not be configurable by content owners. |
| clear-site-data | Another header that can potentially alter top-level origin settings. Ideally the gateway operator would configure this to automatically wipe cookies issued by other subdomains in the absence of inclusion on the Public Suffix List (PSL). |
| cache-control | Likely something the gateway operator should control since caching is directly related to resource usage (bandwidth, etc.). |

lidel commented 3 months ago

Thank you, useful feedback. Agree, allowing arbitrary headers is risky, especially in the future when new things are added to the web platform.

To get this moving forward, we could start with a safelist of allowed headers. Ones that allow website owners to adjust security for their Origin sgtm.
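A safelist check of this kind could be sketched roughly as follows. The set below simply reuses the user-defined headers from the comment above as a starting assumption; the real list would have to be decided in the spec, and `PUBLISHER_SAFELIST` and `filter_publisher_headers` are hypothetical names.

```python
from typing import Dict, List, Tuple

# Assumption: headers a content publisher may set, taken from the
# "User defined headers" list above. Everything else is rejected,
# including operator-only headers and anything added to the web
# platform in the future.
PUBLISHER_SAFELIST = frozenset({
    "content-security-policy",
    "cross-origin-resource-policy",
    "cross-origin-opener-policy",
    "cross-origin-embedder-policy",
    "permissions-policy",
    "x-frame-options",
    "x-xss-protection",
    "referrer-policy",
})


def filter_publisher_headers(
    requested: Dict[str, str],
) -> Tuple[Dict[str, str], List[str]]:
    """Split requested headers into (allowed, rejected names)."""
    allowed: Dict[str, str] = {}
    rejected: List[str] = []
    for name, value in requested.items():
        key = name.strip().lower()  # header names are case-insensitive
        if key in PUBLISHER_SAFELIST:
            allowed[key] = value
        else:
            rejected.append(key)
    return allowed, sorted(rejected)
```

Because the default is to reject, a header such as strict-transport-security never reaches the response unless the gateway operator sets it through their own configuration.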

For next steps here, interested parties could

If anyone is interested in kicking this off and providing implementation and specs, I am willing to lend necessary review time. :+1: Sponsoring this work via https://ipshipyard.com/contact-us is also an option.