ipshipyard / waterworks-community

Discussion and documentation concerning the operation of the IPFS HTTP Gateway at https://ipfs.io/ipfs.
MIT License

Expose trustless gateway and delegated routing under new domains with certain policies #1

Closed BigLep closed 2 months ago

BigLep commented 8 months ago

Background

As part of https://github.com/ipfs/helia/issues/255, there is a need for:

  1. (trustless gateway) an endpoint that only exposes the trustless gateway functionality
  2. (delegated routing) an endpoint that only exposes /routing/v1

Both of these enable reliable retrieval from the browser in different ways:

  1. (trustless gateway) Falling back to a trustless gateway whose domain isn't (and can't be) soiled the way ipfs.io is, thus avoiding the browser's red warning screen
  2. (delegated routing) Empowering a browser node to do p2p retrieval

Kubo 0.23+ supports all of this functionality, which means the existing "ipfs.io" fleet can be used.

These efforts involve similar work, so they are being grouped together for efficiency:

  1. Secure a domain
  2. Add TLS cert to nginx LB
  3. Set up proper nginx config concerning paths/headers/caching

Trustless Gateway requirements/suggestions:

nginx config that only allows responses with one of these content types:

  1. https://www.iana.org/assignments/media-types/application/vnd.ipld.raw
  2. https://www.iana.org/assignments/media-types/application/vnd.ipld.car
  3. https://www.iana.org/assignments/media-types/application/vnd.ipfs.ipns-record
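To illustrate the allow-list above, here is a minimal sketch in Python of the check that a filtering layer would apply to a response's Content-Type header. The function and constant names are hypothetical, and this is not the actual nginx config; it only assumes that media type parameters (such as those that can follow `application/vnd.ipld.car`) should be ignored when matching.

```python
# Hypothetical allow-list mirroring the three content types listed above.
ALLOWED_CONTENT_TYPES = frozenset({
    "application/vnd.ipld.raw",
    "application/vnd.ipld.car",
    "application/vnd.ipfs.ipns-record",
})


def is_trustless_content_type(header_value: str) -> bool:
    """Return True if a Content-Type header value is on the allow-list.

    Parameters after the first ';' are ignored, and matching is
    case-insensitive.
    """
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type in ALLOWED_CONTENT_TYPES
```

With this check, a CAR response carrying parameters passes, while a `text/html` response would be rejected.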

Caching: ???

Delegated routing needs/suggestions

Some more context is in https://github.com/protocol/bifrost-infra/issues/2142

Specific suggestions are in https://github.com/protocol/bifrost-infra/issues/2758#issuecomment-1716761794

Tasks

BigLep commented 8 months ago

Concerning domain names, we had said trustless-gateway.link

That said, I'm wondering if we instead want an overarching domain for IPFS-related network utilities that we can subdomain underneath. The parent domain can function as a landing page (or redirect to the right docs).

Ideas:

(I went with .org, but I'm fine with another TLD)

@lidel: do you have feedback/suggestions here?

lidel commented 8 months ago

For the trustless gateway, the same caching config we use for the existing ipfs.io gateway will suffice. (Responses already have correct Cache-Control HTTP headers, which the nginx cache can leverage.)

For delegated routing at /routing/v1 we should set up an nginx cache with different expirations for hits and misses, but I'm not sure yet what the ideal values are. We can start with values from https://github.com/protocol/bifrost-infra/issues/2758#issuecomment-1716761794 and see how it goes.

> IPFS-related network utilities that we can subdomain underneath

Putting all eggs in one DNS basket runs into a risk similar to the one ipfs.io was under: ISPs or "antivirus" software blocking the entire domain kills unrelated services that are on subdomains. There is also a general risk of human error shutting down the entire infrastructure due to a DNS mishap, like we've seen in Saturn.

The main risk is around hosting third-party data. It is much lower when things are limited to trustless responses, but given enough time, it is non-zero.

:point_right: @BigLep Because of this, we should keep a separate trustless-gateway.link and, just to be safe, also a separate delegated-ipfs.dev for routing when we hard-code it as the implicit default in Helia and other places. (Requests for delegated-ipfs.dev/routing/v1 in dev tools will be self-explanatory: the URL contains "delegated", "ipfs" and "routing", which is good UX.)

ps. I really like "IPFS Waterworks" as potential nucleation name for dev/infra team (ipfs-waterworks.dev?) :) but it feels separate from trustless gateway and delegated routing needs we discuss in this issue

BigLep commented 8 months ago

2023-10-26 maintainer conversation: We need to provide curl or CLI commands so that @ns4plabs can verify. This conversation will happen in #bifrost-community.

BigLep commented 7 months ago

2023-10-31 maintainer discussion about docs

(@BigLep will flesh this out further)

General delegated routing

BigLep commented 7 months ago

I cleaned up the minimum done criteria around docs.

A better treatment of delegated content routing docs is started in https://github.com/ipfs/ipfs-docs/issues/1752

cewood commented 5 months ago

Per the planning sync in 2024-01-04-IPFS-Shipyard-Kubo-0-26-planning we decided to repurpose the preload staging nodes to run someguy, and point the delegated routing domains to these nodes. If appropriate, we can split that into a separate ticket, or just handle it here.

2color commented 5 months ago

Update from @ns4plabs:

The following branch of Someguy is deployed to a new host (not replacing any of the preload nodes) and is available on https://delegated-ipfs.dev/routing/v1

2color commented 5 months ago

The main challenge right now is that the endpoint is slow and there's no caching in place. For example, the following request takes 8 seconds: https://delegated-ipfs.dev/routing/v1/providers/bafkreia2xtwwdys4dxonlzjod5yxdz7tkiut5l2sgrdrh4d52d3qpstrpy

2color commented 5 months ago

@ns4plabs added the Cache-Control header with max-age=60 to the reverse proxy for the hosted version of this. I think this is fine for now, even though it also applies to empty responses, which we'd want cached for only 15 seconds.

2color commented 5 months ago

For us to intelligently set the Cache-Control header (depending on whether the response is empty or not), we'd need to either:

More info here https://github.com/ipfs-shipyard/someguy/issues/26
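As a rough illustration of what "intelligently" could mean here, the sketch below picks a Cache-Control value from a /routing/v1/providers JSON body. It is not how someguy or the reverse proxy is implemented. The function name is hypothetical, the 60 and 15 second values are the ones mentioned earlier in this thread, and it assumes a non-streaming JSON response with a top-level `Providers` array.

```python
import json

# Values taken from this thread: 60 s for responses with providers,
# 15 s for empty responses. Treat them as starting points, not tuned values.
MAX_AGE_WITH_RESULTS = 60
MAX_AGE_EMPTY = 15


def cache_control_for(body: bytes) -> str:
    """Pick a Cache-Control value for a /routing/v1/providers JSON body.

    Assumes a body shaped like {"Providers": [...]}. Bodies that fail to
    parse are treated as empty so that they expire sooner.
    """
    try:
        payload = json.loads(body)
    except ValueError:
        return f"max-age={MAX_AGE_EMPTY}"
    providers = payload.get("Providers") if isinstance(payload, dict) else None
    if providers:
        return f"max-age={MAX_AGE_WITH_RESULTS}"
    return f"max-age={MAX_AGE_EMPTY}"
```

The trade-off is that whichever layer sets the header has to buffer and inspect the body, which is part of why doing this in the routing server itself rather than the proxy may be simpler.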

2color commented 4 months ago

trustless-gateway.link still doesn't work when passing the format GET parameter. It only works when passing the Accept header.

This has two implications:

Example:

curl -i "https://trustless-gateway.link/ipfs/bafybeicklkqcnlvtiscr2hzkubjwnwjinvskffn4xorqeduft3wq7vm5u4?format=raw"
HTTP/2 400
server: openresty
date: Wed, 31 Jan 2024 14:44:49 GMT
content-type: text/html
content-length: 154
x-ipfs-datasize: 154
x-ipfs-lb-pop: gateway-bank2-fr2
x-bfid: 117b28ca94345718171f0e92c856146d
strict-transport-security: max-age=31536000; includeSubDomains; preload

<html>
<head><title>400 Bad Request</title></head>
<body>
<center><h1>400 Bad Request</h1></center>
<hr><center>openresty</center>
</body>
</html>

I think it'd be quite important to add support for passing the format GET parameter.
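Until the parameter is supported, a client could translate `?format=` into the equivalent Accept header itself. The Python sketch below shows that translation; it is a client-side stopgap under stated assumptions, not a fix for the gateway. The helper name is hypothetical, and the mapping assumes that `raw`, `car` and `ipns-record` correspond to the three content types listed at the top of this issue.

```python
from typing import Dict, Tuple
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed mapping from ?format= values to trustless gateway media types.
FORMAT_TO_ACCEPT = {
    "raw": "application/vnd.ipld.raw",
    "car": "application/vnd.ipld.car",
    "ipns-record": "application/vnd.ipfs.ipns-record",
}


def format_param_to_accept(url: str) -> Tuple[str, Dict[str, str]]:
    """Move a known ?format= value out of the URL and into an Accept header.

    Returns the rewritten URL and the headers to send with it. Unknown
    format values and all other query parameters are left in the URL.
    """
    parts = urlsplit(url)
    headers: Dict[str, str] = {}
    remaining = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == "format" and value in FORMAT_TO_ACCEPT:
            headers["Accept"] = FORMAT_TO_ACCEPT[value]
        else:
            remaining.append((key, value))
    new_url = urlunsplit(parts._replace(query=urlencode(remaining)))
    return new_url, headers
```

Applied to the failing request above, this would yield the same path without the query string plus an `Accept: application/vnd.ipld.raw` header, which is the form reported to work.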

lidel commented 4 months ago

We should move filtering from Nginx to rainbow. Proposed solution in https://github.com/ipfs/rainbow/issues/71 and https://github.com/ipshipyard/waterworks-infra/issues/9