Open chw-1 opened 7 years ago
@chw-1 can you please check against the latest staging whether this issue is still valid?
@franz-wohlkoenig I installed version 3.8.2 with sample data and enabled the caching plugin.
The ETag header is now a hexadecimal string. During my tests I observed the following behavior:
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0, no-cache
...
ETag: e272f1ada1f3e78b4deceedb7e37d90e\r\n
The client should never cache the result and my browser correctly handles this.
However, if I understand the spec correctly, it requires the ETag value to change whenever the resource changes. In my tests it always stays the same, even when the content of the RSS file changes after I add a new article to my page and clear the cache. So Joomla still sends a header in its HTTP response that does not respect the spec. Furthermore, the string must be quoted.
If the Cache-Control header instructed the client to cache the response, clients using the ETag for caching would fail, because the ETag is a constant and does not change when the content is updated.
At the time I opened the issue, we observed this behavior with several Outlook clients. I don't know why they cached the RSS file, but every new request contained the `If-None-Match` header with the constant ETag value, and the server replied with a `304 Not Modified` status although the content had been updated in the meantime. So the Outlook clients never updated their RSS file until we manually deleted their cache. If the server had computed the ETag value correctly, the ETag would not have matched and the server would have sent the new RSS file to the client.
As most clients handle this correctly because of the Cache-Control header, this issue might not have a big impact in daily business. But I think that as long as the ETag value is computed from the URL and not from the content of the RSS file, this issue is still valid. The current behavior might also prevent client caching of RSS files, as there is always a `Cache-Control: no-store, no-cache, ...` header in the response.
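To illustrate the behavior described above, here is a minimal sketch of a conditional GET with a content-derived ETag. It is not Joomla's code: the function names `contentEtag` and `conditionalStatus` and the sample feed strings are assumptions made for illustration, and splitting the header on commas is a simplification that works for hex digests only.

```php
<?php
// Hypothetical helper: build a strong, quoted entity-tag from the response body.
// md5 serves only as a cheap fingerprint here, not as a security measure.
function contentEtag(string $body): string
{
    return '"' . md5($body) . '"';
}

// Hypothetical helper: decide the status code of a conditional GET.
// $ifNoneMatch is the raw If-None-Match header value, or null if absent.
function conditionalStatus(string $body, ?string $ifNoneMatch): int
{
    if ($ifNoneMatch === null) {
        return 200;
    }

    $current = contentEtag($body);

    // The header may carry several comma-separated entity-tags, or "*".
    foreach (explode(',', $ifNoneMatch) as $candidate) {
        $candidate = trim($candidate);

        // If-None-Match uses weak comparison, so a W/ prefix is ignored.
        if (strncmp($candidate, 'W/', 2) === 0) {
            $candidate = substr($candidate, 2);
        }

        if ($candidate === '*' || $candidate === $current) {
            return 304;
        }
    }

    return 200;
}

// Sample data (made up): the feed before and after a new article is added.
$old = '<rss><item>First article</item></rss>';
$new = '<rss><item>First article</item><item>Second article</item></rss>';

// The client cached the old feed and sends its ETag back.
$clientTag = contentEtag($old);

echo conditionalStatus($old, $clientTag), PHP_EOL; // unchanged feed
echo conditionalStatus($new, $clientTag), PHP_EOL; // updated feed
```

With a URL-derived ETag both requests would be answered with 304, which is the stale-feed situation the Outlook clients ran into.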
any Comment by Release Lead @wilsonge?
Reminder for @wilsonge
Seems bad. That's also my understanding of ETag. If content versioning is enabled, it's easy to append the version id to the ETag hash. Unfortunately, I'm not sure there's a good way to solve this when versioning is disabled, short of regenerating the ETag on every page load and assuming the content has changed in some way. Also tagging @mbabker, as there's no reason a fix for this can't land in the 3.8 series.
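A rough sketch of the version-id idea might look like the following. The helper name `versionedEtag` is hypothetical, `$cacheId` stands for the existing URL-derived hash, and how the version id would be looked up early in the request is left open here.

```php
<?php
// Hypothetical helper: combine the URL-derived cache id with a content
// version id so that the entity-tag changes whenever a new version is saved.
function versionedEtag(string $cacheId, int $versionId): string
{
    // The surrounding double quotes are part of the entity-tag syntax.
    return '"' . $cacheId . '-' . $versionId . '"';
}

// Same page, two consecutive (made-up) version ids.
$cacheId = 'e272f1ada1f3e78b4deceedb7e37d90e';

echo versionedEtag($cacheId, 3), PHP_EOL;
echo versionedEtag($cacheId, 4), PHP_EOL;
```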
https://github.com/joomla/joomla-cms/pull/19591 will quote the value of the `ETag` header.
As for using a value that changes when the page contents change, there needs to be a change in how the cache key is built to make that work, and it has to be usable at the `onAfterInitialise` system event (so even before the request has been routed). We can't base the cache key on a hash of the rendered contents because we would have to render the contents to get the item out of the cache, and at that point it's too late for the page cache to be of any use.
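One way to think about the constraint described above is to keep the request-derived cache key but record a fingerprint of the body when the entry is written, so the ETag can be read back without rendering. The class below is a toy in-memory sketch under that assumption; `SketchPageCache` and its methods are hypothetical and are not part of Joomla's cache API.

```php
<?php
// Toy in-memory page cache keyed by a request-derived id (hypothetical class).
final class SketchPageCache
{
    /** @var array<string, array{etag: string, body: string}> */
    private $entries = [];

    // Called after rendering: store the body together with a hash of it.
    public function store(string $id, string $body): void
    {
        $this->entries[$id] = [
            'etag' => '"' . sha1($body) . '"',
            'body' => $body,
        ];
    }

    // Called early in the request: the ETag is available without rendering.
    public function etagFor(string $id): ?string
    {
        return $this->entries[$id]['etag'] ?? null;
    }

    // Called when the cache is cleared for this id.
    public function clear(string $id): void
    {
        unset($this->entries[$id]);
    }
}

$cache = new SketchPageCache();
$id    = md5('/index.php?format=feed&type=rss'); // key still built from the URL

$cache->store($id, '<rss><item>First article</item></rss>');
$before = $cache->etagFor($id);

// Cache cleared and page re-rendered after a new article was added.
$cache->clear($id);
$cache->store($id, '<rss><item>First article</item><item>Second article</item></rss>');
$after = $cache->etagFor($id);

echo $before === $after ? 'same ETag' : 'different ETag', PHP_EOL;
```

The lookup key stays stable across renders while the ETag follows the stored content, which is close in spirit to using the cache file's modification time.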
A possible solution is to derive the ETag from the modification time of the cache file.
protected function lastModified($id, $group)
{
    $app = Factory::getApplication();
    $filename = 'administrator/cache/' . $group . '/' . md5($app->get('secret')) . '-cache-' . $group . '-'
        . md5(md5(JPATH_CONFIGURATION) . '-' . $id . '-' . $app->getLanguage()->get('tag')) . '.php';
    if (file_exists($filename))
    {
        date_default_timezone_set('your server timezone...');
        return date("F d Y H:i:s", filemtime($filename));
    }
    else
    {
        return 'file_not_exists';
    }
}
public function get($id = false, $group = 'page')
{
    // If an id is not given, generate it from the request
    if (!$id)
    {
        $id = $this->_makeId();
    }
    // Hash of the cache file's modification time; this becomes the ETag value
    $lastModified = md5($this->lastModified($id, $group));
    //$lastModifiedGzip = $lastModified.'-gzip';
    //var_dump($lastModifiedGzip);
    // If the client's etag matches that hash ... set a no change header and exit : utilize browser cache
    if (!headers_sent() && isset($_SERVER['HTTP_IF_NONE_MATCH']))
    {
        $etag = stripslashes($_SERVER['HTTP_IF_NONE_MATCH']);
        $etag = str_replace('"', '', $etag);
        //var_dump($etag);
        if ($etag == $lastModified)
        {
            $browserCache = $this->options['browsercache'] ?? false;
            if ($browserCache)
            {
                $this->_noChange();
            }
        }
    }
    // We got a cache hit... set the etag header and echo the page data
    $data = $this->cache->get($id, $group);
    $this->_locktest = (object) array('locked' => null, 'locklooped' => null);
    if ($data === false)
    {
        $this->_locktest = $this->cache->lock($id, $group);
        // If locklooped is true try to get the cached data again; it could exist now.
        if ($this->_locktest->locked === true && $this->_locktest->locklooped === true)
        {
            $data = $this->cache->get($id, $group);
        }
    }
    if ($data !== false)
    {
        /*if ($this->_locktest->locked === true)
        {*/
        $this->cache->unlock($id, $group);
        /*}*/
        $data = unserialize(trim($data));
        $data = Cache::getWorkarounds($data);
        $this->_setEtag($lastModified);
        return $data;
    }
    // Set ID and group placeholders
    $this->_id = $id;
    $this->_group = $group;
    return false;
}
Steps to reproduce the issue
Expected result
Quote from RFC 7232:
Actual result
Wireshark displays the following ETag header:
It is the same for each version of the RSS feed, actually it is always the URL. Furthermore, it is not quoted as required by the spec.
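The quoting requirement comes from the entity-tag grammar in RFC 7232, section 2.3: an optional `W/` prefix followed by an opaque tag enclosed in double quotes. As a small sketch, the check below encodes that grammar as a regular expression; the function names `isValidEntityTag` and `quoteEtag` are made up for this example.

```php
<?php
// Hypothetical helper: check a value against the RFC 7232 entity-tag grammar:
//   entity-tag = [ "W/" ] DQUOTE *etagc DQUOTE
//   etagc      = %x21 / %x23-7E / obs-text (%x80-FF)
function isValidEntityTag(string $value): bool
{
    // The D modifier stops "$" from matching before a trailing newline.
    return preg_match('/^(?:W\/)?"[\x21\x23-\x7E\x80-\xFF]*"$/D', $value) === 1;
}

// Hypothetical helper: wrap an opaque value (e.g. a hex digest) in double quotes.
function quoteEtag(string $opaque): string
{
    return '"' . $opaque . '"';
}

$raw = 'e272f1ada1f3e78b4deceedb7e37d90e';

echo isValidEntityTag($raw) ? 'valid' : 'invalid', PHP_EOL;            // bare hex string
echo isValidEntityTag(quoteEtag($raw)) ? 'valid' : 'invalid', PHP_EOL; // quoted hex string
```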
System information
Occurs on Apache and IIS 8.5. Joomla version is 3.7.3.
Behavior observed in all major browsers, Internet Explorer, Firefox and Chrome, and with Outlook RSS feed reader.
Additional comments
If a client sends the ETag back using the `If-None-Match` HTTP header, we always get a 304 HTTP status back, regardless of the age of the resource. It seems that other people have had the same issue: "Use Browser Caching" gives a never changing etag
In class `PlgSystemCache`, we can indeed observe that the URL is used as the key for the cache lookup.
Line 43:
Line 73:
In class `JCacheControllerPage`, the cache instance used in `PlgSystemCache`, this URL is used to compute the ETag HTTP header. Line 103: