w3stling / rssreader

A simple Java library for reading RSS and Atom feeds
MIT License

java.time.format.DateTimeParseException #103

Closed. Moosheimer closed this issue 1 year ago

Moosheimer commented 1 year ago

When reading the URL https://www.nrdc.org/rss.xml I get the error: java.time.format.DateTimeParseException: Text '2023-08-07T10:06:05-0400' could not be parsed, unparsed text found at index 22

When reading the URL https://www.sciencedaily.com/rss/top.xml I get the error: java.io.IOException: java.util.concurrent.ExecutionException: java.io.IOException: Response http status code: 403

Both of them work fine with Thunderbird. Is this known or just my fate?

w3stling commented 1 year ago

For the second problem, it seems this particular website returns HTTP status code 403 unless the user agent is set to one used by a common web browser.

Set user agent:

var list = new RssReader()
        .setUserAgent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/115.0.0.0 Safari/537.36")
        .read("https://www.sciencedaily.com/rss/top.xml")
        .sorted()
        .collect(Collectors.toList());

Moosheimer commented 1 year ago

Wow. You are great! Thanks for the quick help and thanks for sharing the code with all of us. Will try the new version right away...

w3stling commented 1 year ago

The timestamp bug is fixed in the 3.4.6 release.
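
How the library handles this internally is not shown in this thread. As a rough sketch of the underlying problem: the feed writes the offset as -0400, while java.time's DateTimeFormatter.ISO_OFFSET_DATE_TIME expects -04:00, which is consistent with the "unparsed text found at index 22" message. The class name OffsetTimestampSketch and the fallback formatter below are made up for illustration and are not part of rssreader.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class OffsetTimestampSketch {
    // Hypothetical fallback formatter for this illustration: the "xx" pattern
    // accepts an offset written without a colon, such as -0400.
    static final DateTimeFormatter NO_COLON_OFFSET =
            DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ssxx");

    // Try the standard ISO format first, then fall back to the colon-less offset form.
    static OffsetDateTime parse(String text) {
        try {
            return OffsetDateTime.parse(text, DateTimeFormatter.ISO_OFFSET_DATE_TIME);
        } catch (DateTimeParseException e) {
            return OffsetDateTime.parse(text, NO_COLON_OFFSET);
        }
    }

    public static void main(String[] args) {
        // The timestamp from the nrdc.org feed, and the same instant in ISO form.
        System.out.println(parse("2023-08-07T10:06:05-0400"));
        System.out.println(parse("2023-08-07T10:06:05-04:00"));
    }
}
```

Both calls should yield the same OffsetDateTime, so a fallback like this lets colon-less offsets through without changing how well-formed ISO timestamps are parsed.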

Moosheimer commented 1 year ago

Just tried it out. Works perfectly. Thank you!

w3stling commented 1 year ago

Thanks for confirming.