rometools / rome

Java library for RSS and Atom feeds
https://rometools.github.io/rome
Apache License 2.0

SyndFeed for RSS Feed : getPublishedDate() returns null if <pubDate> contains GMT #539

Closed k-wilmeth closed 1 year ago

k-wilmeth commented 2 years ago

This might just be me not fully understanding your code base, but I am having an issue with the getPublishedDate() method, which retrieves the value of pubDate from an RSS feed (I am not able to modify the feed). It returns null if the feed's pubDate field contains the GMT time zone.

I have gone through past issues where others had this same problem. However, neither adding a custom mask nor recreating the previous workarounds has worked.

See these examples that come straight from a feed that reads the value correctly, and another that does not.

getPublishedDate() works as expected: Sat, 19 Mar 2022 22:45:00 +0000

getPublishedDate() returns null: Tue, 01 Mar 2022 16:03:00 GMT+0000

Any guidance you may have on this matter would be greatly appreciated.

Thank you!

antoniosanct commented 2 years ago

Hi, @KaipoWilmeth:

According to the SimpleDateFormat Javadoc, for parsing, "Z" is parsed as the UTC time zone designator. General time zones are not accepted. In this case, "GMT+0000" is rejected because it mixes both forms: the "GMT" prefix of a general time zone with the colon-less offset of an RFC 822 time zone. Theoretically, a valid time zone would be "+0000" (RFC 822) or "GMT+00:00" (general time zone).

ROME uses 'z' and 'Z' as its default time zone masks, so you need to add an extra mask in rome.properties that hardcodes the literal 'GMT', like this: EEE, dd MMM yy HH:mm:ss 'GMT'z
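To see why the quoted literal helps, here is a minimal sketch that uses plain SimpleDateFormat rather than ROME's own date parser. The class name PubDateMaskDemo and the helper tryParse are hypothetical, and the plain mask is only assumed to resemble one of ROME's default RFC 822 masks. With the 'GMT' text consumed as a literal, the trailing z only has to parse the "+0000" offset.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical demo class; it is not part of ROME.
class PubDateMaskDemo {

    // Returns the parsed date, or null when the mask does not match the input.
    static Date tryParse(String mask, String value) {
        // Locale.US so that English day and month names are recognized.
        SimpleDateFormat format = new SimpleDateFormat(mask, Locale.US);
        try {
            return format.parse(value);
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Assumption: similar to a default RFC 822 mask, not copied from ROME.
        String plainMask = "EEE, dd MMM yy HH:mm:ss z";
        // The extra mask suggested in this thread, with 'GMT' as a quoted literal.
        String extraMask = "EEE, dd MMM yy HH:mm:ss 'GMT'z";

        String working = "Sat, 19 Mar 2022 22:45:00 +0000";
        String failing = "Tue, 01 Mar 2022 16:03:00 GMT+0000";

        System.out.println("plain mask, +0000:    " + tryParse(plainMask, working));
        System.out.println("plain mask, GMT+0000: " + tryParse(plainMask, failing));
        System.out.println("extra mask, GMT+0000: " + tryParse(extraMask, failing));
    }
}
```

The second line is expected to print null, and the third a date equivalent to 16:03 UTC on 1 March 2022, shown in the JVM's default time zone. To apply the mask in ROME itself, the extra masks are, as far as I know, listed under the datetime.extra.masks key in rome.properties, separated by "|". Please treat that key name as an assumption and check it against the ROME documentation for your version.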

Please test this in your project and check the results!

Regards, Antonio.

PatrickGotthard commented 1 year ago

I'll close this due to no response