ipfs / helia

An implementation of IPFS in JavaScript
https://helia.io

[🏆 Golden path scenario] Browsers can reliably retrieve content from any modern Kubo node providing content #255

Open BigLep opened 10 months ago

BigLep commented 10 months ago

Done Criteria

A user can reliably author/provide content in a local Kubo node behind a NAT, advertise the content in the public IPFS DHT or an IPNI like cid.contact, and have it retrievable via any modern browser (desktop or mobile) via Helia running on a different local network without relying on pinning services.

Why Important

This is a common use case that users hit. Failure here feeds the narrative that "IPFS doesn't just work".

Content Routing

  1. For content routing (for both public IPFS DHT and IPNI), it’s acceptable to rely on delegated HTTP /routing/v1 from a public endpoint like routing.delegate.ipfs.tbd, cid.contact, etc.
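A minimal sketch of what the delegated lookup could look like from a browser, assuming the endpoint implements `GET /routing/v1/providers/{cid}` and answers with a JSON body containing a `Providers` array. The function name `findProviders`, the `FetchLike` type, and the endpoint URL passed in are illustrative assumptions, not Helia API.

```typescript
// Minimal structural type so the sketch does not depend on DOM or Node typings.
type FetchLike = (
  url: string,
  init?: { headers?: Record<string, string> }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>

// Shape of a "peer" record as described by the /routing/v1 spec.
interface PeerRecord {
  Schema: string
  ID: string
  Addrs?: string[]
  Protocols?: string[]
}

// Ask a delegated routing endpoint which peers provide `cid`.
// `endpoint` is an assumption: any server that implements /routing/v1.
async function findProviders(
  endpoint: string,
  cid: string,
  fetchImpl: FetchLike
): Promise<PeerRecord[]> {
  const base = endpoint.replace(/\/+$/, '') // tolerate a trailing slash
  const url = `${base}/routing/v1/providers/${cid}`
  const res = await fetchImpl(url, { headers: { Accept: 'application/json' } })
  // Assumption: 404 means "no matching records" rather than a hard failure.
  if (res.status === 404) return []
  if (!res.ok) throw new Error(`routing request failed: HTTP ${res.status}`)
  const body = (await res.json()) as { Providers?: PeerRecord[] | null }
  // Keep only records this sketch knows how to interpret.
  return (body.Providers ?? []).filter((p) => p.Schema === 'peer')
}
```

In a browser the global `fetch` can be passed as `fetchImpl`; injecting it keeps the function easy to exercise against a stub.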

Node Connectivity

  1. A local Kubo node behind a NAT has to be dialable from any modern browser (desktop or mobile). That means we can’t rely on WebTransport here: WebTransport can’t be used to dial private nodes, and Safari (necessary for iOS) doesn’t have WebTransport support yet (but will at some point in the future).
    1. That said, using WebTransport as much as possible is encouraged and will undoubtedly be a stepping stone as this end-to-end use case is fleshed out.
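To make the constraint concrete, here is a rough sketch of how a browser client might pick which advertised addresses it can even attempt. It works on multiaddr strings by plain string matching, which is an intentional simplification; the function names and the `webTransport` option are hypothetical, and real code would use a multiaddr parser.

```typescript
type BrowserTransport = 'webtransport' | 'webrtc-direct' | 'webrtc' | 'secure-websocket'

// Classify a multiaddr string by the browser-usable transport it names, if any.
function browserTransport(addr: string): BrowserTransport | undefined {
  const parts = addr.split('/').filter((p) => p.length > 0)
  const has = (name: string): boolean => parts.indexOf(name) !== -1
  if (has('webtransport')) return 'webtransport'
  if (has('webrtc-direct')) return 'webrtc-direct'
  if (has('webrtc')) return 'webrtc'
  // Only secure WebSockets: a page served over HTTPS cannot open plain ws://.
  if (has('wss') || (has('tls') && has('ws'))) return 'secure-websocket'
  return undefined // e.g. plain TCP or QUIC, which browsers cannot dial
}

// Keep only addresses the current browser could try.
// `webTransport: false` models a browser without WebTransport support.
function dialableFromBrowser(
  addrs: string[],
  opts: { webTransport: boolean }
): string[] {
  return addrs.filter((addr) => {
    const transport = browserTransport(addr)
    if (transport === undefined) return false
    if (transport === 'webtransport' && !opts.webTransport) return false
    return true
  })
}
```

With `webTransport: false`, a provider that only advertises WebTransport addresses ends up with nothing dialable, which is the gap the WebRTC tasks are meant to close.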

Reliability Notes

Reliability is critical here. We need to move beyond demos we scrape together. As an example, we want to get to a point where we could hand some instructions to Juan and not be praying in the audience hoping that it will work. We want to flush out the bugs that happen in a user’s browser under normal loads: multiple tabs, retrieving a range of file sizes, etc. (We know from putting together the IPFS Thing 2023 demos that there are reliability issues.) A key aspect is determining how we’re going to “stress test” this.

To really guarantee reliable retrieval, we should leverage trustless block-by-block fetch over HTTP as a feature, and make it the ultimate fallback enabled by default.

If a Helia node is unable to find a CID via delegated routing or fetch it from a discovered provider peer, it should attempt a raw block fetch from the trustless gateways defined in config (or discovered via /routing/v1 and announcing a recursive flag via HTTP OPTIONS).
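A sketch of the raw block fetch, assuming each gateway accepts `GET /ipfs/{cid}?format=raw` with an `Accept: application/vnd.ipld.raw` header. The `verify` callback is a stand-in for hashing the bytes and comparing against the multihash in the CID (for example with the `multiformats` package); it is injected here so the sketch stays dependency-free. Gateway discovery via HTTP OPTIONS is not covered, and all names are illustrative.

```typescript
// Minimal structural type so the sketch does not depend on DOM or Node typings.
type RawFetch = (
  url: string,
  init?: { headers?: Record<string, string> }
) => Promise<{ ok: boolean; status: number; arrayBuffer(): Promise<ArrayBuffer> }>

// Hypothetical verifier: resolves to true when `bytes` hash to the digest in `cid`.
type VerifyBlock = (cid: string, bytes: Uint8Array) => Promise<boolean>

// Try each configured gateway in order and return the first verified block.
async function fetchRawBlock(
  gateways: string[],
  cid: string,
  verify: VerifyBlock,
  fetchImpl: RawFetch
): Promise<Uint8Array> {
  const errors: string[] = []
  for (const gateway of gateways) {
    try {
      const url = `${gateway.replace(/\/+$/, '')}/ipfs/${cid}?format=raw`
      const res = await fetchImpl(url, {
        headers: { Accept: 'application/vnd.ipld.raw' }
      })
      if (!res.ok) throw new Error(`HTTP ${res.status}`)
      const bytes = new Uint8Array(await res.arrayBuffer())
      // "Trustless" only holds if the client checks the bytes itself.
      if (!(await verify(cid, bytes))) throw new Error('hash mismatch')
      return bytes
    } catch (err) {
      errors.push(`${gateway}: ${err instanceof Error ? err.message : String(err)}`)
    }
  }
  throw new Error(`no gateway returned a valid block for ${cid}: ${errors.join('; ')}`)
}
```

Because every response is verified, a misbehaving gateway costs a retry against the next one rather than returning bad data.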

How is this better than what we have with preloads? It uses plain HTTP fetch and a plain block gateway: no special configuration, no need for a libp2p stack, and it is very easy to deploy Kubo or implement your own backend, if needed.

Isn’t this a step back? We want p2p. The idea is to define this as a “trustless gateway fallback feature”. It could be a separate project that wraps a Helia instance and has additional config. Helia would prefer p2p, but if that fails, it would ask a trustless gateway as a last resort. We’ve already done something like this for the SW gateway; this would be productizing it.

This allows us to iterate on making % of p2p retrievals higher, while giving developers an escape hatch, a way to fall back to self-hosted gateway instead of hard fail for their users.
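The "prefer p2p, fall back to a gateway" ordering could be sketched as a thin wrapper like the one below. This is not how Helia is structured; `createRetriever` and both block sources are hypothetical, and the p2p source is assumed to reject on failure or timeout on its own. The counters are there only to show how the share of p2p retrievals could be tracked over time.

```typescript
// Any function that resolves to the bytes of a block, or rejects on failure.
type BlockSource = (cid: string) => Promise<Uint8Array>

interface RetrievalStats {
  p2p: number
  gateway: number
}

// Wrap a p2p source with a gateway fallback and count which path served each block.
function createRetriever(p2p: BlockSource, gatewayFallback: BlockSource) {
  const stats: RetrievalStats = { p2p: 0, gateway: 0 }

  async function getBlock(cid: string): Promise<Uint8Array> {
    try {
      const bytes = await p2p(cid)
      stats.p2p++
      return bytes
    } catch {
      // Last resort: if the gateway also fails, the error reaches the caller.
      const bytes = await gatewayFallback(cid)
      stats.gateway++
      return bytes
    }
  }

  // Fraction of successful retrievals that were served over p2p.
  function p2pShare(): number {
    const total = stats.p2p + stats.gateway
    return total === 0 ? 0 : stats.p2p / total
  }

  return { getBlock, p2pShare, stats }
}
```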

Caveat (1): to make sure “it just works”, the implicit default would be a trustless gateway provided by PL, like we do with bootstrappers. Users MUST be able to override it via config with their own list of gateways, allowing self-hosting and scaling independent of PL infra.

Caveat (2): This one is tedious but also easy to do: we’d need to set up a trustless-raw-block-only gateway under a hostname other than ipfs.io, to avoid safebrowsing errors. This would be 1 line in nginx config similar to this, but for application/vnd.ipld.raw.

Testing

Below are some testing ideas (but this warrants its own discussion/doc). We should have a pulse on how much worse loading content via Helia is, in both performance and reliability, than loading it via the trusted ipfs.io gateway. (We should pretend to wear the ipfs.io gateway hat. If we owned that and wanted to shed some of the traffic to be more p2p, what would the user impact be?)

  1. Hook into the existing Probelab website testing. Currently it uses headless Chromium to compare Kubo (via http://localhost:8080/ipns/<website>) vs default HTTP (via https://<website>) (read more). We could add an additional scenario that hits a service worker gateway.
  2. Get a service worker gateway hooked into an experimental Companion build that we start using ourselves and observe anecdotally how things work.
  3. Get a sample of ipfs.io gateway request CIDs and feed them to Helia to load in the browser.
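For the third idea, a small harness along these lines could turn a list of sampled CIDs into a success rate and a median latency. It is only a sketch: `measureRetrievals` is a hypothetical name, `retrieve` is whatever function loads a CID (via Helia, or via a gateway for comparison), and CIDs are fetched one at a time rather than under realistic concurrent load.

```typescript
interface Outcome {
  cid: string
  ok: boolean
  ms: number
  error?: string
}

interface Summary {
  outcomes: Outcome[]
  successRate: number
  medianMs: number | undefined
}

// Retrieve each CID in turn, recording whether it succeeded and how long it took.
async function measureRetrievals(
  cids: string[],
  retrieve: (cid: string) => Promise<unknown>
): Promise<Summary> {
  const outcomes: Outcome[] = []
  for (const cid of cids) {
    const start = Date.now()
    try {
      await retrieve(cid)
      outcomes.push({ cid, ok: true, ms: Date.now() - start })
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err)
      outcomes.push({ cid, ok: false, ms: Date.now() - start, error })
    }
  }
  // Median latency over successful retrievals only.
  const okTimes = outcomes
    .filter((o) => o.ok)
    .map((o) => o.ms)
    .sort((a, b) => a - b)
  const mid = Math.floor(okTimes.length / 2)
  let medianMs: number | undefined
  if (okTimes.length > 0) {
    medianMs = okTimes.length % 2 === 1 ? okTimes[mid] : (okTimes[mid - 1] + okTimes[mid]) / 2
  }
  const successRate = cids.length === 0 ? 0 : okTimes.length / cids.length
  return { outcomes, successRate, medianMs }
}
```

Running it twice over the same CID sample, once with a Helia-backed `retrieve` and once with a gateway-backed one, would give the side-by-side comparison described above.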

Getting Started

While there are some specific Go tasks to fully wrap up this task listed below, there isn’t anything stopping the JS side from validating and hardening the browser happy path sooner (e.g., testing content retrieval from public WSS / WebTransport multiaddrs).

General Notes

  1. Per above, this isn't a pure Helia issue. Tracking the use case needs to go somewhere though, so I'm putting it in Helia for now so we can link against it.

Tasks

### Retrieval guarantees with using the trustless HTTP gateway spec
- [ ] https://github.com/ipfs/helia/issues/272
- [ ] https://github.com/ipfs/helia/issues/274
- [x] determine/secure domain for trustless-raw-block-only gateway: https://github.com/protocol/bifrost-community/issues/1
- [x] Bifrost nginx config for trustless-raw-block-only HTTP gateway domain: https://github.com/protocol/bifrost-community/issues/1
### Tasks for testing reliability
- [ ] https://github.com/ipfs/helia/issues/275
### Tasks for browser-accessible delegated routing
- [x] Exposing /routing/v1 support in Kubo (or an alternative binary)
- [ ] https://github.com/ipfs/helia-routing-v1-http-api/pull/41
- [ ] https://github.com/libp2p/js-http-v1-content-routing/issues/24
- [ ] https://github.com/libp2p/js-http-v1-content-routing/issues/25
- [x] routing.delegate.ipfs.tbd deployed: <https://github.com/protocol/bifrost-infra/issues/2142>
### Tasks for browser to private Kubo retrievability
- [ ] go-libp2p with WebRTC support: https://github.com/libp2p/go-libp2p/issues/2009
- [x] go-libp2p release with WebRTC private-to-private support: https://github.com/libp2p/go-libp2p/issues/2523
- [ ] Enable WebRTC in Kubo: <https://github.com/ipfs/kubo/issues/9724>
- [ ] https://github.com/ipfs/in-web-browsers/issues/211

SgtPooki commented 10 months ago

There is still an issue with nodes in the network not providing public WebTransport addresses. We should have kubo/boxo/go-libp2p look into this more seriously. See https://github.com/libp2p/js-libp2p/issues/2040 & https://github.com/libp2p/go-libp2p/issues/2568#issuecomment-1715422596

BigLep commented 9 months ago

For anyone watching this issue, progress is being made. Please see the linked issues from the task lists.