joomla / joomla-cms

Home of the Joomla! Content Management System
https://www.joomla.org
GNU General Public License v2.0
4.77k stars 3.65k forks source link

Joomla sending incorrect HTTP/HTTPS header #19292

Open severdia opened 6 years ago

severdia commented 6 years ago

Steps to reproduce the issue

I'm using a feed caching service for my Joomla RSS feed URLs and I discovered that they keep refreshing the feed because they think there's new content when there isn't. It's caused by the Last-Modified value in the header being reset every time the Joomla cache expires/is refreshed. The Last Modified value should be the date/time of the latest item in the feed, not the date/time the feed was cached.

My feed URL is here:

https://www.playshakespeare.com/news?format=feed&type=rss

If you check the header ( http://www.webconfs.com/http-header-check.php), it's:

Last-Modified => Thu, 04 Jan 2018 16:22:09 GMT

It will be different when you check it because it will probably have recached by then. But the latest item in the feed itself is:

<pubDate>Wed, 01 Feb 2017 00:00:00 -0500</pubDate>

So the Last-Modified value should be the same. There's a <lastBuildDate> in the feed which is correct and reflects the latest cached version of the feed:

<lastBuildDate>Thu, 04 Jan 2018 11:10:55 -0500</lastBuildDate>

Expected result

The Last-Modified header should be the date/time of the latest item in the feed. not the cache date/time.

Actual result

The Last-Modified header is currently showing (incorrectly) the date/time the feed file was created/cached.

System information (as much as possible)

Joomla 3.8.3, PHP 7.0.25, MySQL 5.6.23

Additional comments

mbabker commented 6 years ago

The <lastBuildDate> element is wrong too, after https://github.com/joomla/joomla-cms/issues/13008 things can be fixed in 4.0 due to the need for a B/C break with a public class property.

This still doesn't change the fact that nothing actually sets the build date or modified header correctly, which as noted in the linked issue is a bigger code change because it requires looping over all entries going into the feed to determine the correct date to set (versus using whatever time "now" is as is the current case).

brianteeman commented 6 years ago

Related issue #17192

severdia commented 6 years ago

@mbabker Doesn't it just require checking the first <item> node for the date/time and using that? The feed wouldn't be any newer than its latest item.

mbabker commented 6 years ago

This assumes every component view class pushes data into the JDocument object in a dated order. The reality is components should be setting these attributes themselves and not looking for magic behavior in the framework to do it, but because every non-HTML based aspect of Joomla is treated as a second class citizen the odds of things being fixed correctly are slim to none.

severdia commented 6 years ago

@mbabker Since it appears this may never be resolved in the way you describe, is it worth a simple patch just for RSS to fix the header in the meantime?

brianteeman commented 6 years ago

Its not just the rss feed that shows the current date as the last modified date. The same is true of the http headers

See https://itoctopus.com/how-to-change-the-last-modified-http-header-in-joomla-free-plugin and https://httpdot.org/howto/44-adding-last-modified-header-and-meta-tags-to-joomla-template

severdia commented 6 years ago

So if this issue, according to those sites, is solved with the code provided, is there any reason this hasn't been fixed on the core yet? Does that code impact anything else? It seems pretty straightforward so is a PR all that's missing here?

mbabker commented 6 years ago

It would need to be a PR updating all the frontend component view classes to set the right data in the Document instance or headers in the app (since we should do it all and not just fix com_content). Just needs someone to do it.

zero-24 commented 6 years ago

It would need to be a PR updating all the frontend component view classes to set the right data in the Document instance or headers in the app (since we should do it all and not just fix com_content). Just needs someone to do it.

hmm just to be sure. What do you mean by "all the frontend view classes"?

Isn't it just this files:

./components/com_contact/views/category/view.feed.php
./components/com_content/views/category/view.feed.php
./components/com_content/views/featured/view.feed.php
./components/com_finder/views/search/view.feed.php
./components/com_tags/views/tag/view.feed.php
./components/com_tags/views/tags/view.feed.php

Or I'm getting it wrong?

mbabker commented 6 years ago

If you aim to just fix the feed classes then sure. But shouldn't HTML documents be doing the same?

zero-24 commented 6 years ago

OK i have just found the code we need:

        // Pass the modified date to the document
        if ($this->item->modified)
        {
            $this->document->setModifiedDate($this->item->modified);
        }

But there is a problem. What is the expected behavior for multible items per page? Like the feed or the category list etc.

The code above also obviously just affects the header if the content is cachable at all.

severdia commented 6 years ago

I thunk it should be the date/time of the latest article publish/modification date/time.

brianteeman commented 6 years ago

I am deleting your advert @AdvanceShop

brianteeman commented 6 years ago

@zero-24 did you make any progress with this?

zero-24 commented 6 years ago

No. Feel free to take over if you wish. I currently don't have the time to dig deeper into this. Sorry