mediacloud / rss-fetcher

Intelligently fetch lists of URLs from a large collection of RSS Feeds as part of the Media Cloud Directory.
https://search.mediacloud.org/directory
Apache License 2.0

plan out API for front-end #7

Closed rahulbot closed 7 months ago

rahulbot commented 2 years ago

In our future model, there is some information the front-end "Collections management" web tool will need to communicate to and from the rss-fetcher. So far, these are the two endpoints we've discussed:

API Endpoints for the Front-End to Call

The front-end has a few things it needs from the RSS Fetcher for the Collections management web tool. These could probably be implemented with FastAPI within the current architecture, which has been serving us well on other projects.

Fetch feeds now

Sometimes our researchers just want to force a source to update to the latest. This happens for various reasons. So the Media Source webpage will have a "fetch feeds now" button that should trigger a call to this endpoint, perhaps with a list of the RSS feeds for the source (because the RSS Fetcher doesn't have a robust concept of media).

URL: POST /api/feeds/fetch_now

Params: required list of feed_ids sent over as an array

Return: if not too long of a timeout, send back the synchronous results of fetching each feed as an array of fetch_events
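As a rough illustration of the request and response shape described above, here is a minimal stdlib-only sketch of the handler logic. The `FetchEvent` fields, the event names, the `fetch_one` callable, and the decision to reject an empty list are all assumptions made for the example; the issue does not define them, and a real version would presumably sit behind a FastAPI route.

```python
from dataclasses import dataclass, asdict
from typing import Callable, Dict, List


@dataclass
class FetchEvent:
    # Hypothetical shape: the issue does not specify fetch_event fields.
    feed_id: int
    event: str
    note: str = ""


def fetch_now(feed_ids: List[int], fetch_one: Callable[[int], str]) -> List[Dict]:
    """Fetch each feed synchronously and return one fetch_event per feed."""
    # Assumption: an empty list is treated the same as a missing required param.
    if not feed_ids:
        raise ValueError("feed_ids is required and must not be empty")
    events = []
    for feed_id in feed_ids:
        try:
            events.append(FetchEvent(feed_id, "fetch_succeeded", fetch_one(feed_id)))
        except Exception as exc:
            # Record the failure instead of aborting the rest of the batch.
            events.append(FetchEvent(feed_id, "fetch_failed", str(exc)))
    return [asdict(e) for e in events]
```

Passing the fetcher in as a callable keeps the sketch independent of how the RSS Fetcher actually retrieves a feed, and lets a caller enforce the timeout mentioned above in whatever way suits the deployment.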

Feed History

Our researchers often want to interrogate how a media source's feeds have been performing lately. This will probably manifest as a button on the Media Source webpage that says something like "see feed history". Clicking it would call this endpoint and display the results.

URL: GET /api/feeds/[feed_id]/history?days=30

Params: the feed_id in the URL, and an optional history length in days encoded as a URL param

Return: the `system_enabled` boolean indicating if the RSS Fetcher is still trying to fetch this feed, and also an array of `fetch_events` that fall within the date window specified on the URL request
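The response described above could be assembled roughly as follows. This is a sketch under assumptions: the dictionary keys `id`, `feed_id`, and `created_at`, the default of 30 days when the param is omitted, and the injectable `now` argument are illustrative choices, not part of the proposal.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def feed_history(feed: Dict, fetch_events: List[Dict], days: int = 30,
                 now: Optional[datetime] = None) -> Dict:
    """Build the history payload: system_enabled plus events inside the window."""
    # Assumption: timestamps are timezone-aware UTC datetimes.
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    recent = [
        e for e in fetch_events
        if e["feed_id"] == feed["id"] and e["created_at"] >= cutoff
    ]
    return {"system_enabled": feed["system_enabled"], "fetch_events": recent}
```

In practice the date filter would more likely be a WHERE clause on the fetch events table than a list comprehension; the in-memory version is only meant to make the window semantics explicit.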

Synchronizing Feeds

A related note is how to update the big list of feeds from the front end. We think this could happen each night via a cron job in the RSS Fetcher. That script would connect to the front-end DB to pull any feeds created after the last run, and any with a modified date after the last run. That would yield a list of insert and update operations to run on the RSS Fetcher DB.
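The nightly job sketched above boils down to two queries against the front-end DB. The example below assumes a hypothetical `feeds` table with `id`, `url`, `created_at`, and `modified_at` columns storing ISO-8601 strings, and uses `sqlite3` only so the sketch is self-contained; the real front-end database and schema are not specified here.

```python
import sqlite3
from typing import List, Tuple


def plan_sync(frontend: sqlite3.Connection,
              last_run: str) -> Tuple[List[tuple], List[tuple]]:
    """Return (inserts, updates) for feeds created or modified since last_run."""
    inserts = frontend.execute(
        "SELECT id, url FROM feeds WHERE created_at > ? ORDER BY id",
        (last_run,),
    ).fetchall()
    # Feeds created after last_run are already in the insert list,
    # so only older feeds with a newer modified date count as updates.
    updates = frontend.execute(
        "SELECT id, url FROM feeds "
        "WHERE modified_at > ? AND created_at <= ? ORDER BY id",
        (last_run, last_run),
    ).fetchall()
    return inserts, updates
```

The two lists can then be applied to the RSS Fetcher DB as insert and update operations, with the job recording its own start time as the next `last_run`.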

philbudne commented 7 months ago

@rahulbot @Evan-Leon are there any outstanding issues/needs for the rss-fetcher API, or can this be closed?

Looks like the last change to rss-fetcher API was in February, deployed in rss-fetcher v0.12.15

rahulbot commented 7 months ago

Nothing needed from my perspective.

Evan-Leon commented 7 months ago

Good on my end as well, thanks Phil!