Open ion1 opened 9 years ago
It might be useful to assume some reasonable non-zero max-age for content on IPNS. A minute? Fifteen seconds?
It could be determined by the IPNS record -- some will have a calculable TTL.
Until then, we could set must-revalidate
and a quasi infinite max-age
I’m not sure if I have understood this correctly, but I’m under the impression that must-revalidate
only applies if the content has already expired. http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.4
You're right, thanks -- so that means must-revalidate
+ max-age=0
= cache-indefinitely-unless-updated
Note that for /ipfs/
, we can still set a quasi-infinite max-age.
It seems to me that must-revalidate
means that given an object that expired according to max-age
, a cache must return an error instead of a stale copy if it fails to connect to the origin server. I’m not sure must-revalidate
is of any use here. Unless I’m misinterpreting it, of course.
DNS lets you say explicitly for how long a record can be cached. Perhaps we should do that with IPNS names? ipfs name publish --ttl=3h
which would be taken advantage of by IPFS nodes who have resolved the name as well as by the gateway in the Cache-Control header.
It was mentioned on IRC that resolve
does not provide TTL information to be used for max-age
.
resolveOnce
could be changed to return a TTL value. resolve
could start at infinity-ish and compute the minimum of the previous TTL and what resolveOnce
returned at each iteration.
Given that the DNS lookup functions in the net package for Go do not seem to provide TTL information at the moment, the DNS resolveOnce
could return a reasonable guess like one minute. If a DNS record points to /ipns/<keyhash>
, that IPNS record can still decrement the TTL according to the iteration above if a low cache time is desired.
If resolve
returned a TTL, the if strings.HasPrefix(urlPath, ipfsPathPrefix)
check in gateway_handler.go
could go away and the header be set unconditionally.
Would this be a reasonable starting point?
diff --git a/namesys/interface.go b/namesys/interface.go
index 09c296c..b0b740d 100644
--- a/namesys/interface.go
+++ b/namesys/interface.go
@@ -47,6 +47,14 @@ const (
// trust resolution to eventually complete and can't put an upper
// limit on how many steps it will take.
UnlimitedDepth = 0
+
+ // ImmutableTTL is the time-to-live value to be reported for immutable
+ // objects.
+ ImmutableTTL = 10 * 365 * 24 * time.Hour
+
+ // UnknownTTL is the time-to-live value to be reported for mutable
+ // paths when accurate information is not available.
+ UnknownTTL = time.Minute
)
// ErrResolveFailed signals an error when attempting to resolve.
@@ -88,7 +96,10 @@ type Resolver interface {
// There is a default depth-limit to avoid infinite recursion. Most
// users will be fine with this default limit, but if you need to
// adjust the limit you can use ResolveN.
- Resolve(ctx context.Context, name string) (value path.Path, err error)
+ //
+ // Resolve also returns a time-to-live value which indicates the
+ // maximum amount of time the result may be cached.
+ Resolve(ctx context.Context, name string) (value path.Path, ttl time.Duration, err error)
// ResolveN performs a recursive lookup, returning the dereferenced
// path. The only difference from Resolve is that the depth limit
@@ -97,7 +108,7 @@ type Resolver interface {
//
// Most users should use Resolve, since the default limit works well
// in most real-world situations.
- ResolveN(ctx context.Context, name string, depth int) (value path.Path, err error)
+ ResolveN(ctx context.Context, name string, depth int) (value path.Path, ttl time.Duration, err error)
}
// Publisher is an object capable of publishing particular names.
diff --git a/namesys/base.go b/namesys/base.go
index e552fce..29ef997 100644
--- a/namesys/base.go
+++ b/namesys/base.go
@@ -2,6 +2,7 @@ package namesys
import (
"strings"
+ "time"
context "github.com/ipfs/go-ipfs/Godeps/_workspace/src/golang.org/x/net/context"
@@ -9,27 +10,36 @@ import (
)
type resolver interface {
- // resolveOnce looks up a name once (without recursion).
- resolveOnce(ctx context.Context, name string) (value path.Path, err error)
+ // resolveOnce looks up a name once (without recursion). It also
+ // returns a time-to-live value which indicates the maximum amount of
+ // time the result may be cached.
+ resolveOnce(ctx context.Context, name string) (value path.Path, ttl time.Duration, err error)
}
// resolve is a helper for implementing Resolver.ResolveN using resolveOnce.
-func resolve(ctx context.Context, r resolver, name string, depth int, prefixes ...string) (path.Path, error) {
+func resolve(ctx context.Context, r resolver, name string, depth int, prefixes ...string) (path.Path, time.Duration, error) {
+ // Start with a long TTL.
+ ttl := ImmutableTTL
+
for {
- p, err := r.resolveOnce(ctx, name)
+ p, ttlOnce, err := r.resolveOnce(ctx, name)
+ // Use the lowest TTL reported by the resolveOnce invocations.
+ if ttlOnce < ttl {
+ ttl = ttlOnce
+ }
if err != nil {
log.Warningf("Could not resolve %s", name)
- return "", err
+ return "", ttl, err
}
- log.Debugf("Resolved %s to %s", name, p.String())
+ log.Debugf("Resolved %s to %s (TTL %v -> %v)", name, p.String(), ttlOnce, ttl)
if strings.HasPrefix(p.String(), "/ipfs/") {
// we've bottomed out with an IPFS path
- return p, nil
+ return p, ttl, nil
}
if depth == 1 {
- return p, ErrResolveRecursion
+ return p, ttl, ErrResolveRecursion
}
matched := false
@@ -44,7 +54,7 @@ func resolve(ctx context.Context, r resolver, name string, depth int, prefixes .
}
if !matched {
- return p, nil
+ return p, ttl, nil
}
if depth > 1 {
@ion1 yeah, we could return a ttl with every resolution... I'm generally against more return types than 1+error, but this is likely a cleaner approach than others
<jbenet> […] i'm overall -1 on returning the TTL on every Resolve call. i think that should be a separate function call.
<jbenet> most uses of Resolve do not care.
<jbenet> and its important to keep the interface as clean as we can possible have it.
<ion> jbenet: Fair enough. So it would be better to add a new function to the Resolver interface?
<jbenet> ion: yeah a new function SGTM.
I’m going to do that then.
Just for the record, this is still an open issue for /ipns/
responses returned by go-ipfs v0.4.18.
Etag
is present, but we are missing an explicit IPNS expiration header.
AFAIK it could be either:
Cache-Control
header with max-age
Expires
header with a dateWhy?
There is a Last-Modified
header so most browsers will use "Heuristic Freshness" to determine how long to cache that asset for. This is a weak cache hint, as RFC7234 suggests something like (current time - last modified time) / 10
, however, there are subtle differences between Firefox and Chrome.
Due to this we should provide explicit cache expiration headers for /ipns/
to unify experience across all vendors.
This will sort-of depend on https://github.com/ipfs/go-ipfs/issues/5884, as ipns supports dns resolution too. 'Real' IPNS records support TTL too, but AFAIK it's currently not used (basically it's set to 0).
actionable task extracted from https://github.com/ipfs/go-ipfs/pull/8074#pullrequestreview-645196768, seems to be a natural continuation fo this old issue here
TLDR: we are not leveraging TTLs in Gateway responses, which means things are not cached for as long as they could be. This is getting critical for our gateway infra, without this we are unable to cache things on Gateways as efficiently as we could.
/ipns/*
should return Cache-Control
based on:
Last-Modified
header (right now returns now
timestamp to leverage browser heuristycs: https://github.com/ipfs/go-ipfs/pull/8074, but we can remove it only when proper Cache-Control
is in place)Etag
based on the content path from the record. Together with Cache-control
it enables cache at edge and user agents to revalidate cache instead of re-downloading data that did not change and which they already have. We already do this for files and dirs, just mentioning here for completeness.Path
(as suggested earlier) cache-control
header is returned /ipns/
content paths have proper cache controls on all gateway types: path, subdomain, and dnslinkcc https://github.com/ipfs/go-ipfs/issues/5884, https://github.com/ipfs/go-ipfs/issues/7564
Reopening. We have spec at https://specs.ipfs.tech/http-gateways/path-gateway/#cache-control-response-header, but boxo/gateway
does not implement it fully.
This is not fixed fully yet.
IPNS Records are handled correctly thanks to https://github.com/ipfs/boxo/pull/459 but we don't set Cache-Control
with max-age
based on TTL as noted in https://github.com/ipfs/boxo/pull/459#issuecomment-1705462860. I remember we did not have TTL there because Golang standard library is not exposing access to original TTL of DNS record when resolver from operating system is used.
Cache-Control
on /ipns/{dnslink}
responses. Either correctly wire up TTL from DNS TXT record of DNSLink, or use 5 minute as default fallback when it is not available.Cache-Control
implementation from https://github.com/ipfs/boxo/pull/584, where value is public, max-age=<ttl>, stale-while-revalidate=<dht-or-other-expiration-window>, stale-if-error=<dht-or-other-expiration-window>
cache-control
on DNSLink responseport Cache-Control implementation from https://github.com/ipfs/boxo/pull/584, where value is public, max-age=
, stale-while-revalidate= , stale-if-error=
For DNSLink, what would you recommend the stale-while-revalidate
be, since it has nothing to do with the DHT?
Good question. Unless there is a better value, I suggest using Amino DHT expiration window (48h) in cases like this, where we need a sensible hard ceiling on caching mutable things.
Rationale: DNSLink resolves to some CID. Returning a cached HTTP response body is ok as long the the data behind that CID could be (in theory) retrievable from the public swarm provider. So if we count from the moment we resolved dnslink and successfully retrieved it, 48h is the max window we can assume the retrieval of the same payload would be successful.
The gateway does not seem to provide a Cache-Control/max-age header or an ETag for
/ipns
URLs.It might be useful to assume some reasonable non-zero max-age for content on IPNS. A minute? Fifteen seconds?
It would also be good to serve an ETag corresponding to whatever IPFS hash the IPNS path resolved to.