getgrav / grav-plugin-feed

Grav Feed Plugin
https://getgrav.org
MIT License

RSS feed encoding issue #70

Closed petira closed 1 year ago

petira commented 1 year ago

RSS feed currently does not correctly transmit characters encoded in UTF-8. Have you made any changes lately?

rhukster commented 1 year ago

Not that I can see in the changelog. Did you confirm it's a Grav issue? If so, which version started the issue? Also, please provide a sample we can test with.

petira commented 1 year ago

@rhukster Transfer this issue to https://github.com/getgrav/grav-plugin-feed, please.

No changes on my side (hosting, PHP, etc.).

Now I use Grav v1.7.35 and Feed v1.8.5. I think it worked in version v1.7.34 and older.

This is the same problem as https://github.com/getgrav/grav-plugin-feed/issues/53.

rhukster commented 1 year ago

So to confirm @petira just Atom feed for you? RSS is fine?

rhukster commented 1 year ago

So I can recreate the problem, i.e. some Cyrillic text such as Моя мама любит музыку in a title shows up garbled in both the RSS feed and the Atom feed. However, I am still not sure why. It appears to have the correct encoding, and it works fine in the page content itself. I'm not sure this ever worked? I went back and tried an older version of Grav, and it still didn't work. Also, the feed plugin has not changed in some time.

Are you sure this ever worked, guys? Or did you just notice it?

rhukster commented 1 year ago

I found a solution: the Content-Type header needed its charset set to UTF-8. Unfortunately, there's currently no way to do that in Grav, as the headers are set after the last useful event. To solve this I had to add an event to Grav, onPageHeaders(), that passes the page headers as an object. This allows the feed plugin to use the event to append ; charset=utf-8, which resolves the issue.
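For readers who want to see the header change in isolation, here is a minimal, language-neutral sketch in Python. It is not the Grav or feed plugin code (that lives in the commits linked in this comment); the function name `with_utf8_charset` and the plain-dict representation of the headers are assumptions made purely for illustration.

```python
def with_utf8_charset(headers):
    """Return a copy of `headers` whose Content-Type declares charset=utf-8.

    Hypothetical helper: it only illustrates the idea of appending
    "; charset=utf-8" to an existing Content-Type value.
    """
    updated = dict(headers)
    content_type = updated.get("Content-Type")
    # Only append when a Content-Type exists and no charset is declared yet.
    if content_type and "charset=" not in content_type.lower():
        updated["Content-Type"] = content_type + "; charset=utf-8"
    return updated


rss_headers = with_utf8_charset({"Content-Type": "application/rss+xml"})
print(rss_headers["Content-Type"])  # application/rss+xml; charset=utf-8
```

The check for an existing `charset=` keeps the helper idempotent, so running it twice does not append the parameter twice.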

This will require this commit in Grav to work: https://github.com/getgrav/grav/commit/de642df06e7e34508bce35f4dd96c7aa34a1d8d2

And this is the commit in feed: https://github.com/getgrav/grav-plugin-feed/commit/bff5a38d24722272cc06b0f7c248786a04c22d59

I think this was not such a huge problem: while the feed looks corrupted when you view the source, the encoding is correct if you save it, so it probably worked fine in RSS clients.
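A small Python sketch of why the saved file can be fine while the displayed source looks broken. It assumes the viewer falls back to Latin-1 when no charset is declared; the actual fallback depends on the client, so treat this as an illustration rather than a description of any particular browser.

```python
title = "Моя мама любит музыку"

# The bytes sent over the wire are valid UTF-8 either way.
raw = title.encode("utf-8")

# Assumed fallback: a viewer that is not told the charset decodes as Latin-1.
misread = raw.decode("latin-1")

# A client that decodes the same bytes as UTF-8 recovers the original title.
recovered = raw.decode("utf-8")

print(misread == title)    # False
print(recovered == title)  # True
```

The bytes never change; only the decoding chosen by the reader does, which is why declaring the charset in the header is enough to fix the display.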

petira commented 1 year ago

@rhukster Thanks for the solution, I will try it once there is a new version.

It is rather strange that it has worked without problems so far. I have been using Feed since the site started running, specifically since December 2020 and sure enough the feeds worked until July 2022, maybe August 2022. It certainly worked even after issue https://github.com/getgrav/grav-plugin-feed/issues/61#issuecomment-849904065 was resolved. I discovered the bug last week.

Here is a link to the feed: https://www.grav.cz/blog.rss

Here is a link to the output: http://www.petira.net/

petira commented 1 year ago

@rhukster After updating to Grav v1.7.37(.1) and Feed v1.9.0 everything works now. Thanks.

@lazybadger Does this solve your problem https://github.com/getgrav/grav-plugin-feed/issues/53 too?

lazybadger commented 1 year ago

@petira - Yes, I can confirm my issue is also fixed (it had the same origin) and can be closed as "Resolved".