ipshipyard / waterworks-community

Discussion and documentation concerning the operation of the Public Goods for IPFS and Libp2p.
https://docs.ipfs.tech/concepts/public-utilities/
MIT License

Expose trustless gateway and delegated routing under new domains with certain policies #1

Closed BigLep closed 6 months ago

BigLep commented 1 year ago

Background

As part of https://github.com/ipfs/helia/issues/255, there is the need for:

  1. (trustless gateway) an endpoint that only exposes the trustless gateway functionality
  2. (delegated routing) an endpoint that only exposes /routing/v1

Both of these enable reliable retrieval from the browser in different ways:

  1. (trustless gateway) Fallback to a trustless gateway that isn't (and can't be) soiled like ipfs.io is (thus avoiding browser red screen)
  2. (delegated routing) Empowering a browser node to do p2p retrieval

Kubo 0.23+ supports all of this functionality, which means the existing "ipfs.io" fleet can be used.

These efforts have similar work and so are being grouped together for efficiency:

  1. Secure a domain
  2. Add TLS cert to nginx LB
  3. Set up proper nginx config concerning paths/headers/caching

Trustless Gateway requirements/suggestions:

nginx config that only allows responses with one of these content types:

  1. https://www.iana.org/assignments/media-types/application/vnd.ipld.raw
  2. https://www.iana.org/assignments/media-types/application/vnd.ipld.car
  3. https://www.iana.org/assignments/media-types/application/vnd.ipfs.ipns-record
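The allowlist above can be illustrated with a minimal sketch. It is written in Python rather than nginx config, purely to show the check a filter would need to make; the function name `is_trustless_content_type` is hypothetical and not part of any deployed config.

```python
# Media types taken from the list above; anything else would be rejected.
ALLOWED_TYPES = {
    "application/vnd.ipld.raw",
    "application/vnd.ipld.car",
    "application/vnd.ipfs.ipns-record",
}


def is_trustless_content_type(content_type: str) -> bool:
    """Return True if a Content-Type header value names an allowed media type.

    Parameters after ";" (for example "; version=1") are ignored, and the
    comparison is case-insensitive, as media type names are.
    """
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in ALLOWED_TYPES
```

For example, `is_trustless_content_type("application/vnd.ipld.car; version=1")` passes, while `text/html` does not.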

Caching: ???

Delegated routing needs/suggestions:

Some more context is in https://github.com/protocol/bifrost-infra/issues/2142

Specific suggestions are in https://github.com/protocol/bifrost-infra/issues/2758#issuecomment-1716761794

Tasks

BigLep commented 1 year ago

Concerning domain names, we had said trustless-gateway.link

That said, I'm wondering if we instead want an overarching domain for IPFS-related network utilities that we can subdomain underneath. The parent domain can function as a landing page (or redirect to the right docs).

Ideas:

(I went with .org but good to go with another tld)

@lidel: do you have feedback/suggestions here?

lidel commented 1 year ago

For trustless gateway, the same caching config as we use for existing ipfs.io gateway will suffice. (Responses already have correct Cache-Control HTTP headers, nginx cache can leverage them)

For delegated routing at /routing/v1 we should set up nginx cache with different expiration for hits and misses, but I'm not sure yet what the ideal values are. We can start with values from https://github.com/protocol/bifrost-infra/issues/2758#issuecomment-1716761794 and see how it goes.

> IPFS-related network utilities that we can subdomain underneath

Putting all eggs in one DNS basket runs into a risk similar to the one ipfs.io faced: ISPs or "antivirus" software blocking the entire domain kills unrelated services that live on subdomains. There is also a general risk of human error taking down the entire infrastructure through a DNS mishap, as we've seen in Saturn.

The main risk is around hosting third-party data. Limiting things to trustless responses makes it far lower, but given enough time it is still non-zero.

:point_right: @BigLep Because of this, we should keep trustless-gateway.link separate and, just to be safe, also use a separate delegated-ipfs.dev for routing when we hard-code it as the implicit default in Helia and other places. Requests for delegated-ipfs.dev/routing/v1 in dev tools will be self-explanatory: we have "delegated", "ipfs" and "routing" in the URL, which is good UX.

ps. I really like "IPFS Waterworks" as a potential nucleation name for the dev/infra team (ipfs-waterworks.dev?) :) but it feels separate from the trustless gateway and delegated routing needs we discuss in this issue

BigLep commented 1 year ago

2023-10-26 maintainer conversation: We need to give curl or CLI commands so that @ns4plabs can verify. This conversation will happen in #bifrost-community.

BigLep commented 1 year ago

2023-10-31 maintainer discussion about docs

(@BigLep will flesh this out further)

General delegated routing

BigLep commented 1 year ago

I cleaned up the minimum done criteria around docs.

A better treatment of delegated content routing docs is started in https://github.com/ipfs/ipfs-docs/issues/1752

cewood commented 10 months ago

Per the planning sync in 2024-01-04-IPFS-Shipyard-Kubo-0-26-planning we decided to repurpose the preload staging nodes to run someguy, and point the delegated routing domains to these nodes. If appropriate, we can split that into a separate ticket, or just handle it here.

2color commented 10 months ago

Update from @ns4plabs:

The following branch of Someguy is deployed to a new host (not replacing any of the preload nodes) and is available on https://delegated-ipfs.dev/routing/v1

2color commented 10 months ago

The main challenge right now is that the endpoint is slow (e.g. 8 seconds for the following request: https://delegated-ipfs.dev/routing/v1/providers/bafkreia2xtwwdys4dxonlzjod5yxdz7tkiut5l2sgrdrh4d52d3qpstrpy) and there's no caching in place.
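A latency figure like this can be reproduced with a small timing sketch. This is a rough illustration, not a benchmark: the helper names `providers_url` and `time_providers_lookup` are hypothetical, and the measured time includes DNS, TLS and the local network, so results will vary.

```python
import time
from urllib.request import urlopen

# Base URL of the delegated routing endpoint mentioned in this thread.
ROUTING_BASE = "https://delegated-ipfs.dev/routing/v1"


def providers_url(cid: str) -> str:
    """Build the /routing/v1/providers URL for a CID."""
    return f"{ROUTING_BASE}/providers/{cid}"


def time_providers_lookup(cid: str, timeout: float = 30.0) -> float:
    """Return the wall-clock seconds taken to fetch and read the full response."""
    start = time.perf_counter()
    with urlopen(providers_url(cid), timeout=timeout) as response:
        response.read()
    return time.perf_counter() - start
```

Calling `time_providers_lookup(...)` twice in a row with the same CID should also show whether a cache in front of the endpoint is taking effect.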

2color commented 10 months ago

@ns4plabs added the Cache-Control header with max-age=60 to the reverse proxy for the hosted version of this. I think this is fine for now, though it also applies to empty responses, which we'd want cached for only 15 seconds.

2color commented 10 months ago

For us to intelligently set the Cache-Control header (depending on whether the response is empty or not), we'd need to either:

More info here https://github.com/ipfs-shipyard/someguy/issues/26
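The decision being discussed can be sketched as follows. It is an illustration only, assuming the response body is JSON with a top-level `Providers` array (the shape of a `/routing/v1/providers` response) and reusing the 60 and 15 second values mentioned earlier in this thread. The function name `cache_control_for` is hypothetical, and treating an unparsable body as empty is a choice made here, not something agreed above.

```python
import json

NON_EMPTY_MAX_AGE = 60  # seconds, the value currently set on the reverse proxy
EMPTY_MAX_AGE = 15      # seconds, the shorter value wanted for empty responses


def cache_control_for(body: bytes) -> str:
    """Pick a Cache-Control value based on whether any providers were returned."""
    try:
        providers = json.loads(body).get("Providers")
    except (ValueError, AttributeError):
        # Not JSON, or not a JSON object: treat it like an empty result (assumption).
        providers = None
    max_age = NON_EMPTY_MAX_AGE if providers else EMPTY_MAX_AGE
    return f"max-age={max_age}"
```

Wherever this logic ends up living, it needs access to the response body (or a signal derived from it), which a plain header rule on the reverse proxy does not have.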

2color commented 9 months ago

trustless-gateway.link still doesn't work when passing the format GET parameter. It only works when passing the Accept header.

This has two implications:

Example:

curl -i "https://trustless-gateway.link/ipfs/bafybeicklkqcnlvtiscr2hzkubjwnwjinvskffn4xorqeduft3wq7vm5u4?format=raw"
HTTP/2 400
server: openresty
date: Wed, 31 Jan 2024 14:44:49 GMT
content-type: text/html
content-length: 154
x-ipfs-datasize: 154
x-ipfs-lb-pop: gateway-bank2-fr2
x-bfid: 117b28ca94345718171f0e92c856146d
strict-transport-security: max-age=31536000; includeSubDomains; preload

<html>
<head><title>400 Bad Request</title></head>
<body>
<center><h1>400 Bad Request</h1></center>
<hr><center>openresty</center>
</body>
</html>

I think it'd be quite important to add support for passing the format GET parameter.
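Until the format parameter is supported, a client can work around this by translating `?format=` into the equivalent Accept header. The sketch below shows one way to do that with the Python standard library; the helper name `with_accept_header` is hypothetical, and the mapping covers only the three response types discussed in this issue.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import Request

# format query values and the Accept media types they correspond to.
FORMAT_TO_ACCEPT = {
    "raw": "application/vnd.ipld.raw",
    "car": "application/vnd.ipld.car",
    "ipns-record": "application/vnd.ipfs.ipns-record",
}


def with_accept_header(url: str) -> Request:
    """Move a known ?format= value out of the URL and into an Accept header."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    fmt = next((value for key, value in query if key == "format"), None)
    if fmt not in FORMAT_TO_ACCEPT:
        # No format parameter, or one this sketch does not know: leave the URL as is.
        return Request(url)
    remaining = [(key, value) for key, value in query if key != "format"]
    clean_url = urlunsplit(parts._replace(query=urlencode(remaining)))
    return Request(clean_url, headers={"Accept": FORMAT_TO_ACCEPT[fmt]})
```

The returned `Request` can be passed to `urllib.request.urlopen`. This only helps clients that can set headers, so it is not a substitute for server-side support of the format parameter.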

lidel commented 9 months ago

We should move filtering from Nginx to rainbow. Proposed solution in https://github.com/ipfs/rainbow/issues/71 and https://github.com/ipshipyard/waterworks-infra/issues/9