GoogleChrome / webdev-infra

Apache License 2.0

Implement utility to fetch RSS feeds which can be used in external data jobs #71

Closed by matthiasrohmer 1 year ago

matthiasrohmer commented 1 year ago

Both web.dev and developer.chrome.com (d.c.c) fetch external data that is displayed on the sites. See the d.c.c jobs for reference: https://github.com/GoogleChrome/developer.chrome.com/tree/main/external

For the updated author pages we want to extend this flow to fetch RSS feeds from a configurable list of arbitrary sources. For example:

Another example is the d.c.c author RSS feeds, which should also be surfaced on web.dev. See https://github.com/GoogleChrome/developer.chrome.com/issues/5859.

Those posts need to be fetched and normalized into a JSON format that we can store and then later mix into the stream of posts we show on author pages like https://web.dev/authors/paulkinlan/.

Ideally, the utility exports a single method that takes a path to the authorsData.json file (src/site/_data/authorsData.json for web.dev, site/_data/authorsData.json for d.c.c), reads this file, loops over all authors, and looks for a to-be-introduced external key like so:

  "paulkinlan": {
    "homepage": "https://paul.kinlan.me/",
    "twitter": "paul_kinlan",
    "mastodon": "https://status.kinlan.me/@paul",
    "image": "image/T4FyVKpzu4WKF1kBNvXepbi08t52/0O1ZGr2P0l9oTKabyUK5.jpeg",
    "external": [
      {
        "label": "Blog",
        "url": "https://paul.kinlan.me/index.xml"
      },
      {
        "url": "https://developer.chrome.com/authors/paulkinlan/feed.xml"
      }
    ]
  },

It then fetches the feeds from the sources in external, normalizes them, and writes a JSON file containing all data needed for the updated author page layout to a specified location. The idea is to have a utility that can be easily used for an external data job in the web.dev and d.c.c repos by just passing in the path, instead of duplicating the logic in both sites.
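
To make the flow above concrete, here is a minimal TypeScript sketch of how such a utility might be structured. It is not the approved design, which is not part of this thread. The function names (normalizeItems, collectAuthorFeeds, writeAuthorFeeds), the NormalizedPost and FeedItem shapes, and the injected fetchFeed callback are all hypothetical; only the external key with its url and optional label comes from the example above. The actual RSS fetching and parsing is left to the injected callback so that the sketch does not assume any particular parser library.

```typescript
import { readFile, writeFile } from "node:fs/promises";

// One entry of the proposed `external` key (shape taken from the issue's example).
interface ExternalSource {
  url: string;
  label?: string;
}

// Only the field this sketch needs; real author entries carry more keys.
interface Author {
  external?: ExternalSource[];
}

// Hypothetical normalized shape; the real one would be defined in the design doc.
interface NormalizedPost {
  title: string;
  url: string;
  date: string; // ISO 8601
  source: string; // the label, falling back to the feed URL
}

// Hypothetical parsed feed item, as produced by whatever parser is injected.
interface FeedItem {
  title?: string;
  link?: string;
  pubDate?: string;
}

// Assumption: the caller supplies a function that fetches and parses one feed.
type FetchFeed = (url: string) => Promise<FeedItem[]>;

// Pure step: drop incomplete items, map to the normalized shape, newest first.
export function normalizeItems(items: FeedItem[], source: ExternalSource): NormalizedPost[] {
  return items
    .filter((item) => item.title && item.link && item.pubDate && !Number.isNaN(Date.parse(item.pubDate)))
    .map((item) => ({
      title: item.title!,
      url: item.link!,
      date: new Date(item.pubDate!).toISOString(),
      source: source.label ?? source.url,
    }))
    .sort((a, b) => b.date.localeCompare(a.date));
}

// Loops over all authors and collects posts from every `external` source.
export async function collectAuthorFeeds(
  authors: Record<string, Author>,
  fetchFeed: FetchFeed,
): Promise<Record<string, NormalizedPost[]>> {
  const result: Record<string, NormalizedPost[]> = {};
  for (const [id, author] of Object.entries(authors)) {
    if (!author.external?.length) continue;
    const posts: NormalizedPost[] = [];
    for (const source of author.external) {
      try {
        posts.push(...normalizeItems(await fetchFeed(source.url), source));
      } catch (error) {
        // One broken feed should not fail the whole job.
        console.warn(`Skipping ${source.url}:`, error);
      }
    }
    result[id] = posts.sort((a, b) => b.date.localeCompare(a.date));
  }
  return result;
}

// Single entry point: read authorsData.json, fetch the feeds, write the JSON output.
export async function writeAuthorFeeds(
  authorsDataPath: string,
  outputPath: string,
  fetchFeed: FetchFeed,
): Promise<void> {
  const authors = JSON.parse(await readFile(authorsDataPath, "utf8")) as Record<string, Author>;
  const feeds = await collectAuthorFeeds(authors, fetchFeed);
  await writeFile(outputPath, JSON.stringify(feeds, null, 2));
}
```

Under these assumptions, an external data job in either repo would call writeAuthorFeeds with its own authorsData.json path, an output path, and a feed fetcher. Keeping normalization as a pure function separate from file and network access makes it straightforward to test against sample feeds from arbitrary sources.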

Before implementation starts, please write a plan for the implementation in a design doc.

mamieorine commented 1 year ago

The implementation plan has been approved by Matthias, and development following the plan is in progress.

mamieorine commented 1 year ago

We need more examples of RSS sources to implement the method for handling feeds from arbitrary sources. Matthias sent an email to the team asking for links to their personal web presence, such as their YouTube channel or their website. This ticket is therefore paused until we have more examples of authors' blogs.