jakartaee / jakartablogs.ee

Welcome to the blog home of open source, cloud native Java innovation! Read posts from our global community on Jakarta EE news, technologies, best practices, and compatible products.
https://jakartablogs.ee
Eclipse Public License 2.0

Improper Aquarium entries in JakartaBlogs aggregator #104

Closed edbratt closed 2 years ago

edbratt commented 2 years ago

Describe the bug: The newly introduced blog aggregator is pulling in very old Aquarium posts, attributing them to the wrong author, and attaching incorrect metadata (for example, very old posts show up as current/new). This is confusing at best.

To Reproduce: Steps to reproduce the behavior:
1. Go to Jakarta Blogs.
2. Look for the first post attributed to Will Lyons. In my view, it says "Remembering Felipe Gaucho."

This post was originally written by Eduardo Pelligeri-Lliopart in 2010. Eduardo no longer works at Oracle (not that I think he'd mind the legacy reference). The Aquarium blog is a community blog and cannot be attributed solely to Will. The date of this entry does not appear to be picked up correctly. The blog entry itself says it was written by 'guest author'.

The second entry, also attributed to Will, is "GlassFish in Espanol", about the multi-language release of GlassFish in Nov. 2010. That entry also says it was written by a guest author.

Some suggestions: If we want this to be 'Will Lyons' blog, consider filtering only for Will Lyons as the author. His most recent post to Aquarium was June 23, 2020 'Jakarta EE 9 Milestone Release'. I do see some additional posts by other authors, but Will is the most recent author. In any case, Aquarium is not a single-author blog. The post author must be included as part of the aggregation data.

Expected behavior: The posts in the Aquarium view seem to be showing in correct time order, here.


chrisguindon commented 2 years ago

Thanks for creating an issue about this. I did notice a problem last week.

Every post on jakartablogs.ee was from Will last week. To fix this, I updated our blog aggregator to limit the number of posts per author: https://github.com/jakartaee/jakartablogs.ee/commit/98501ddd020c5513cc9911bf1d05b36ccc64d8cb
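As a rough illustration of the idea of limiting posts per author, here is a minimal Python sketch. It is not the code from the linked commit; the function name `limit_per_author`, the dictionary keys, and the default limit are all hypothetical and chosen only for the example.

```python
from collections import defaultdict


def limit_per_author(posts, max_per_author=2):
    """Keep at most max_per_author posts per author, preserving input order.

    `posts` is assumed to be a list of dicts with an "author" key
    (a hypothetical shape, not the aggregator's real data model).
    """
    counts = defaultdict(int)
    kept = []
    for post in posts:
        author = post["author"]
        if counts[author] < max_per_author:
            counts[author] += 1
            kept.append(post)
    return kept


posts = [
    {"author": "Will", "title": "A"},
    {"author": "Will", "title": "B"},
    {"author": "Will", "title": "C"},
    {"author": "Other", "title": "D"},
]
print([p["title"] for p in limit_per_author(posts)])
```

With a cap like this, a single feed that publishes many items at once can no longer fill the whole front page, although it does nothing about wrong dates or attribution.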

I took a closer look at Will's RSS feed and it appears to be invalid. publishDate is not a valid RSS element; it should be pubDate: https://validator.w3.org/feed/docs/rss2.html

All the posts are dated 01 Dec 2021 because the pubDate of the RSS channel itself is set to Wed, 01 Dec 2021 18:32:56 +0000. The aggregator is using that date since each item is missing the pubDate element.
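To make the failure mode concrete, here is a small Python sketch that reproduces it on a made-up feed. The sample XML, the titles, and the `item_dates` helper are assumptions for illustration only; this is not the aggregator's actual code, and the real feed may differ in other ways.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical feed: the first item uses the invalid <publishDate> element,
# the second uses the valid RSS 2.0 <pubDate> element.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <pubDate>Wed, 01 Dec 2021 18:32:56 +0000</pubDate>
    <item>
      <title>Old post with the wrong element name</title>
      <publishDate>Mon, 01 Nov 2010 09:00:00 +0000</publishDate>
    </item>
    <item>
      <title>Post with a valid date</title>
      <pubDate>Tue, 23 Jun 2020 12:00:00 +0000</pubDate>
    </item>
  </channel>
</rss>"""


def item_dates(feed_xml):
    """Return (title, date, used_fallback) for each item in the feed.

    Items without <pubDate> fall back to the channel-level <pubDate>,
    which is the behavior described in the comment above.
    """
    channel = ET.fromstring(feed_xml).find("channel")
    channel_date = channel.findtext("pubDate")
    results = []
    for item in channel.findall("item"):
        raw = item.findtext("pubDate")  # <publishDate> is never consulted
        used_fallback = raw is None
        date = parsedate_to_datetime(raw if raw is not None else channel_date)
        results.append((item.findtext("title"), date, used_fallback))
    return results


for title, date, used_fallback in item_dates(SAMPLE_FEED):
    print(title, date.date(), "fallback" if used_fallback else "item date")
```

In this sketch the 2010 item is reported with the channel's 2021 date, which is why very old posts can appear to be new. The sketch assumes the channel itself has a pubDate; a feed with neither date would need separate handling.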

"If we want this to be 'Will Lyons' blog, consider filtering only for Will Lyons as the author. His most recent post to Aquarium was June 23, 2020 'Jakarta EE 9 Milestone Release'"

The RSS feed is the responsibility of the author. There is not much I can do about this. If a feed is problematic, we should probably remove it. However, before we do so, let's check with Will if he can fix the issue from his side.

@will-lyons is it possible for you to provide a valid RSS feed in order to address the pubDate issue?

chrisguindon commented 2 years ago

Actually, the date is now Dec 6. However, last week the date was set to Dec 01.

chrisguindon commented 2 years ago

@edbratt To address your second concern about every blog post being attributed to Will: we can change the author's name and picture to something more accurate.

If you provide both, I can make the change in this file: https://github.com/jakartaee/jakartablogs.ee/blob/master/planet/planet.ini#L146

edbratt commented 2 years ago

@will-lyons may have a different answer, but I doubt he can do anything other than pass the issue you've raised with the feed back to our blog dev team, which I will do. As I said, we are just content authors. This system isn't personally managed or operated by either of us. The 'guest author' attribution is now used because, in this case anyway, the author is no longer with Oracle, so the photo(s) aren't available.

Given that the date seems to keep changing and our most recent blog posts to Aquarium are from last year, why don't you just disable the aggregation from this blog until we can get someone here at Oracle to attend to this problem? This is not a single-author blog and it never will be, so the aggregation should not presume that any one person is going to author everything on Aquarium.

Can you provide snippets from the RSS feed that are incorrect? I will pass those back to our dev group and see if they can address the problem here.

edbratt commented 2 years ago

Our blog support team intends to correct the date field name in an update on Friday 12/17 (evening, PST). I'll submit a PR to adjust the blog name and image shortly. Hopefully, the feed update will solve the issue of very old blog content from the Aquarium blog showing up as recent.

chrisguindon commented 2 years ago

This is great news! Thanks @edbratt for looking into this!

edbratt commented 2 years ago

Please see https://github.com/jakartaee/jakartablogs.ee/pull/105 for attribution update.

chrisguindon commented 2 years ago

Closing this issue! I think we are done here!