aliostad / CacheCow

An implementation of HTTP Caching in .NET Core and 4.5.2+ for both the client and the server
MIT License
847 stars 172 forks source link

no-cache problem in Net Core application #255

Closed balsamiqLuca closed 4 years ago

balsamiqLuca commented 4 years ago

Hi,

I need to use caching with Etag and I noticed that the check on the no-cache header is performed after the check on the Expires header, so it always returns not-cacheable. When the strategy is no-cache, it should not be required that the Expires header is set, so maybe the check on the no-cache should go before the check on the Expires header:

https://github.com/aliostad/CacheCow/blob/b2b2e2b988d079e5e9305d7ac4d1f7c1908c6b7f/src/CacheCow.Client/CachingHandler.cs#L94

This is an example of a response that does not get cached:

HTTP/1.1 200 OK
Date: Mon, 24 Aug 2020 09:02:57 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 159715
Connection: keep-alive
Set-Cookie: AWSALB=MfAWemuHVk7WV4gahD7pZR4lRzE1cqbjiNmFtMiUK00d0c62aLuTYU2VmvKUsp6+aJ4+fmucz7rRAhj+fDPc9lmPcvWRfF+8xYjzJoStKOu5cpBZr/H60yCYm936; Expires=Mon, 31 Aug 2020 09:02:57 GMT; Path=/
Set-Cookie: AWSALBCORS=MfAWemuHVk7WV4gahD7pZR4lRzE1cqbjiNmFtMiUK00d0c62aLuTYU2VmvKUsp6+aJ4+fmucz7rRAhj+fDPc9lmPcvWRfF+8xYjzJoStKOu5cpBZr/H60yCYm936; Expires=Mon, 31 Aug 2020 09:02:57 GMT; Path=/; SameSite=None; Secure
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: x-balsamiq-archive-revision
Strict-Transport-Security: max-age=31536000; includeSubDomains
ETag: "19eb1c071d7e8166c785780f123c775d05aaf2cb"
X-Balsamiq-Archive-Revision: 770
Cache-Control: no-cache, private, must-revalidate
Vary: Accept-Encoding
aliostad commented 4 years ago

Hi sorry, was on holiday, just back. Will have a look.

aliostad commented 4 years ago

It does not say it cannot be cached, it says it must be re-validated. Normally these would be cached but they would still be validated every time. With no-cache (and especially must-revalidate), expiry does not matter. So what is the CacheCow header you get back?

balsamiqLuca commented 4 years ago

Hi, thanks for the answer! The CacheCow headers I get back say: "not-cacheable=true;did-not-exist=true", due to the fact (I think) that in the CachingHandler ResponseValidator function it's checking the Expires before the "no-cache":

       if (response.Headers.CacheControl.MaxAge == null &&
                    response.Headers.CacheControl.SharedMaxAge == null &&
                    response.Content.Headers.Expires == null)
                    return ResponseValidationResult.NotCacheable;

                if (response.Headers.CacheControl.NoCache)
                    return ResponseValidationResult.MustRevalidate;
aliostad commented 4 years ago

I think the "NotCacheable" is misleading. Can you check on subsequent requests whether it validates it?

balsamiqLuca commented 4 years ago

It keeps saying: not-cacheable=true,did-not-exist=true

aliostad commented 4 years ago

Well, if a response has Set-Cookie header, it is not cacheable. I was not sure if I had implemented it but just double checked and I have. If you remove Set-Cookie headers it should work.

https://github.com/aliostad/CacheCow/blob/master/src/CacheCow.Server/Cacheability/DefaultCacheabilityValidator.cs#L89

Let me know if you need have more questions, otherwise will be closing the issue.

balsamiqLuca commented 4 years ago

Hi, we removed the Set-Cookie from the response but it still says not-cacheable. These are the new response headers:

HTTP/1.1 200 OK
Date: Tue, 25 Aug 2020 08:47:29 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 387855
Connection: keep-alive
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: x-balsamiq-archive-revision
Strict-Transport-Security: max-age=31536000; includeSubDomains
ETag: "b4fbbd330bf15d643e06eafd6e4bf570d8d899d3"
X-Balsamiq-Archive-Revision: 770
Cache-Control: no-cache, private, must-revalidate
Vary: Accept-Encoding
aliostad commented 4 years ago

OK, the issue obviously is that since it does not know how long it should cache it, it deems it as non-cacheable. This is technically not correct: I think ideally in case of a no-cahe, it can simply assume the response as stale (or having max-age=0 or Expires=-1, etc).

Having said that, it is an edge case and I believe a cacheable resource should have an age or expires header. Frankly, if a resource does not have expiry or age, the sever is not serious about cacheablity of the resource.

So I prefer to keep the behaviour the same since there could be other users of the library impacted if I return ResponseValidationResult.MustRevalidate: items that they are not really meant for caching to be validated which is extra work.

As for your case, you have an easy solution. I have made the property public for the same reason so can be overriden:

myhandler.ResponseValidator = (response) =>
            {
                if (!response.StatusCode.IsIn(_cacheableStatuses))
                    return ResponseValidationResult.NotCacheable;

                if (!response.IsSuccessStatusCode || response.Headers.CacheControl == null ||
                    response.Headers.CacheControl.NoStore) //  || response.Headers.CacheControl.NoCache was removed. See issue
                    return ResponseValidationResult.NotCacheable;

                if (response.Headers.Date == null)
                    TraceWriter.WriteLine("Response date is NULL", TraceLevel.Warning);

                response.Headers.Date = response.Headers.Date ?? DateTimeOffset.UtcNow; // this also helps in cache creation
                var dateTimeOffset = response.Headers.Date;

                TraceWriter.WriteLine(
                    String.Format("CachedResponse date was => {0} - compared to UTC.Now => {1}", dateTimeOffset, DateTimeOffset.UtcNow), TraceLevel.Verbose);

                if (response.Content == null)
                    return ResponseValidationResult.NotCacheable;

                if (response.Headers.CacheControl.NoCache)
                    return ResponseValidationResult.MustRevalidate;

                if (response.Headers.CacheControl.MaxAge == null &&
                    response.Headers.CacheControl.SharedMaxAge == null &&
                    response.Content.Headers.Expires == null)
                    return ResponseValidationResult.MustRevalidate;

                // here we use
                if (response.Content.Headers.Expires != null &&
                    response.Content.Headers.Expires < DateTimeOffset.UtcNow)
                    return response.Headers.CacheControl.ShouldRevalidate(MustRevalidateByDefault)
                        ? ResponseValidationResult.MustRevalidate : ResponseValidationResult.Stale;

                if (response.Headers.CacheControl.MaxAge != null &&
                    DateTimeOffset.UtcNow > response.Headers.Date.Value.Add(response.Headers.CacheControl.MaxAge.Value))
                    return response.Headers.CacheControl.ShouldRevalidate(MustRevalidateByDefault)
                        ? ResponseValidationResult.MustRevalidate : ResponseValidationResult.Stale;

                if (response.Headers.CacheControl.SharedMaxAge != null &&
                    DateTimeOffset.UtcNow > response.Headers.Date.Value.Add(response.Headers.CacheControl.SharedMaxAge.Value))
                    return response.Headers.CacheControl.ShouldRevalidate(MustRevalidateByDefault)
                        ? ResponseValidationResult.MustRevalidate : ResponseValidationResult.Stale;

                return ResponseValidationResult.OK;
            };

So create your handler and then change the property to above where it treats your case as MustRevalidate. In fact I brought no-cache further up since it is more telling in your case.

Hope this helps.

balsamiqLuca commented 4 years ago

Hi! Ok, we'll override the function as you said. Thanks for your support!

aliostad commented 4 years ago

Anytime! Thanks for using CacheCow.