mizosoft / methanol

⚗️ Lightweight HTTP extensions for Java
https://mizosoft.github.io/methanol
MIT License
241 stars 12 forks source link

Couldn't parse HTTP date #64

Closed Kukis13 closed 9 months ago

Kukis13 commented 11 months ago

Hi, I use your library exclusively because I need to cache (ideally forever) my HTTP requests. However anytime I hit it any reading from cache ends up with:

Dec 23, 2023 8:27:37 AM com.github.mizosoft.methanol.internal.cache.DateUtils toHttpDate0
WARNING: couldn't parse HTTP date: -1
java.time.format.DateTimeParseException: Text '-1' could not be parsed at index 2
    at java.base/java.time.format.DateTimeFormatter.parseResolved0(DateTimeFormatter.java:2108)
    at java.base/java.time.format.DateTimeFormatter.parse(DateTimeFormatter.java:1936)
    at com.github.mizosoft.methanol.internal.cache.DateUtils.toHttpDate0(DateUtils.java:55)
    at com.github.mizosoft.methanol.internal.cache.DateUtils.toHttpDate(DateUtils.java:45)
    at java.base/java.util.Optional.map(Optional.java:260)
    at com.github.mizosoft.methanol.internal.cache.FreshnessPolicy.<init>(FreshnessPolicy.java:47)
    at com.github.mizosoft.methanol.internal.cache.CacheResponse$CacheStrategy.<init>(CacheResponse.java:110)
    at com.github.mizosoft.methanol.internal.cache.CacheResponse.<init>(CacheResponse.java:42)
    at com.github.mizosoft.methanol.HttpCache$InternalCacheView.get(HttpCache.java:344)
    at com.github.mizosoft.methanol.internal.cache.CacheInterceptor$AsyncAdapter.lambda$get$1(CacheInterceptor.java:366)
    at com.github.mizosoft.methanol.internal.function.ThrowingSupplier.lambda$toUnchecked$0(ThrowingSupplier.java:14)
    at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)
    at java.base/java.util.concurrent.CompletableFuture.asyncSupplyStage(CompletableFuture.java:1782)
    at java.base/java.util.concurrent.CompletableFuture.supplyAsync(CompletableFuture.java:2005)
    at com.github.mizosoft.methanol.internal.function.Unchecked.supplyAsync(Unchecked.java:17)
    at com.github.mizosoft.methanol.internal.cache.CacheInterceptor$AsyncAdapter.get(CacheInterceptor.java:366)
    at com.github.mizosoft.methanol.internal.cache.CacheInterceptor.getCacheResponse(CacheInterceptor.java:142)
    at com.github.mizosoft.methanol.internal.cache.CacheInterceptor.exchange(CacheInterceptor.java:117)
    at com.github.mizosoft.methanol.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:98)
    at com.github.mizosoft.methanol.Methanol$InterceptorChain.forward(Methanol.java:833)
    at com.github.mizosoft.methanol.internal.cache.RedirectingInterceptor.intercept(RedirectingInterceptor.java:95)
    at com.github.mizosoft.methanol.Methanol$InterceptorChain.forward(Methanol.java:833)
    at com.github.mizosoft.methanol.Methanol$AutoDecompressingInterceptor.intercept(Methanol.java:937)
    at com.github.mizosoft.methanol.Methanol$InterceptorChain.forward(Methanol.java:833)
    at com.github.mizosoft.methanol.Methanol$RequestRewritingInterceptor.intercept(Methanol.java:881)
    at com.github.mizosoft.methanol.Methanol$InterceptorChain.forward(Methanol.java:833)
    at com.github.mizosoft.methanol.Methanol.send(Methanol.java:316)

Why does it happen? And when it happens does it mean that the request was successfully read from cache or not?

mizosoft commented 10 months ago

Hey @Kukis13, thanks for raising the issue. Apologies for the late reply.

You'll see this log if the value of a header that's needed by the cache is not a valid HTTP date when it's expected to be so. HTTP dates can come form different places. The stack trace shows that the invalid date comes from the Expires header. It seems that some servers set the value of this header to -1, 0, or any invalid date when they want the cache to regard the response as already expired, which means they don't want caches to serve them without contacting the server. This is already covered by the spec, section 5.3:

   A cache recipient MUST interpret invalid date formats, especially the
   value "0", as representing a time in the past (i.e., "already
   expired").

Whether the response was saved into cache or not can only be determined by seeing the values of other headers. For instance, the value of the max-age directive overrides whatever set by Expires, as per the spec (same section):

   If a response includes a Cache-Control field with the max-age
   directive (Section 5.2.2.8), a recipient MUST ignore the Expires
   field.  Likewise, if a response includes the s-maxage directive
   ([Section 5.2.2.9), a shared cache recipient MUST ignore the Expires
   field.  In both these cases, the value in Expires is only intended
   for recipients that have not yet implemented the Cache-Control field.

BTW, in order to know programmatically whether the response was saved/read into/from cache, you can use an HttpCache.Listener like so:

var cache = HttpCache.newBuilder()
        .cacheOnMemory(8 * 1024)
        .listener(new HttpCache.Listener() {
            @Override
            public void onWriteSuccess(HttpRequest request) {
                // Called when the response body is successfully written to cache.
                System.out.println("Written to cache: " + request.uri());
            }

            @Override
            public void onResponse(HttpRequest request, CacheAwareResponse<?> response) {
                // Called when the response is served. If the status is HIT or CONDITIONAL_HIT then it's been served form cache.
                System.out.println(request.uri() + " " + response.cacheStatus());
            }
        })
        .build();

Having said that, it seems that currently Methanol doesn't properly conform to the spec when the Expires header is invalid, because it assumes that it's just not present in that case, which is not the same thing. I will keep this issue open till this is fixed.