sneakers-the-rat / paper-feeds

A FastAPI web server for creating RSS feeds for scholarly journals with the magic of adversarial interoperability
GNU General Public License v3.0

check for retracted papers #25

Open smierz opened 6 months ago

smierz commented 6 months ago

Looking through the OpenAlex docs, I saw that works have an "is_retracted" flag. This could be a scheduled job (maybe once a month) that goes through the papers in the DB and checks whether the flag has been set to true since they were fetched.

(low prio though)
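A minimal sketch of what such a check could look like, assuming papers are stored with a DOI and a previously fetched retraction flag. The function names (`normalize_doi`, `build_retraction_urls`, `newly_retracted`) and the shape of the stored data are hypothetical and not taken from paper-feeds; the batch size of 50 is an assumption that should be checked against the OpenAlex docs. The sketch only builds query URLs and compares results, so the actual HTTP call is left to whatever client the project already uses.

```python
from typing import Dict, Iterable, Iterator, List
from urllib.parse import urlencode

OPENALEX_WORKS_URL = "https://api.openalex.org/works"


def normalize_doi(doi: str) -> str:
    """Lowercase a DOI and strip the https://doi.org/ prefix OpenAlex returns."""
    doi = doi.strip().lower()
    prefix = "https://doi.org/"
    return doi[len(prefix):] if doi.startswith(prefix) else doi


def build_retraction_urls(dois: List[str], batch_size: int = 50) -> Iterator[str]:
    """Yield one works-query URL per batch of DOIs.

    Values in a filter are OR-ed with "|"; only the fields needed for
    the retraction check are selected. batch_size is an assumed limit.
    """
    for start in range(0, len(dois), batch_size):
        batch = dois[start:start + batch_size]
        params = {
            "filter": "doi:" + "|".join(batch),
            "select": "id,doi,is_retracted",
            "per-page": str(batch_size),
        }
        yield f"{OPENALEX_WORKS_URL}?{urlencode(params)}"


def newly_retracted(stored: Dict[str, bool], results: Iterable[dict]) -> List[str]:
    """Return DOIs that are retracted now but were not when stored.

    `stored` maps normalized DOI -> is_retracted flag at fetch time
    (hypothetical shape); `results` is the "results" list of a response.
    """
    flagged = []
    for work in results:
        doi = work.get("doi")
        if not doi or not work.get("is_retracted"):
            continue
        key = normalize_doi(doi)
        if key in stored and not stored[key]:
            flagged.append(key)
    return flagged
```

A monthly job could then iterate over `build_retraction_urls(...)`, fetch each URL, and pass the `results` list of each response to `newly_retracted` to decide which rows to update.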

sneakers-the-rat commented 6 months ago

That sounds simple enough to add as an extra field and make part of the backfill task :)

smierz commented 5 months ago

I'm imagining 2 scenarios here:

sneakers-the-rat commented 5 months ago

we'll be doing scheduled tasks (daily for all at first, but we can make finer scales per job if need be), so we can do both. Since the db of papers will probably grow pretty quickly, and the proportion of retracted papers will be low, we probably want to schedule that check more rarely. we can also schedule tasks on demand, so what we might want to do is schedule that kind of task when a feed is fetched, with some kind of debounce, so that we are refreshing the papers we are actually presenting on feeds and making sure their state is correct.
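To make the "refresh on fetch, with a debounce" idea concrete, here is a rough sketch, assuming an in-memory, per-feed timestamp is good enough. `RefreshDebouncer` and `should_refresh` are hypothetical names, not part of paper-feeds, and this is the leading-edge variant (closer to what is sometimes called a throttle): the first fetch triggers a refresh, and later fetches inside the interval are ignored.

```python
import time
from typing import Callable, Dict, Hashable, Optional


class RefreshDebouncer:
    """Allow at most one refresh per feed within `min_interval` seconds."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock  # injectable so the behaviour can be tested
        self._last: Dict[Hashable, float] = {}

    def should_refresh(self, feed_id: Hashable) -> bool:
        """Return True and record the time if a refresh is due for this feed."""
        now = self._clock()
        last: Optional[float] = self._last.get(feed_id)
        if last is not None and now - last < self.min_interval:
            return False
        self._last[feed_id] = now
        return True
```

In a feed endpoint, the handler could call `should_refresh(feed_id)` and, only when it returns True, queue the retraction check for that feed's papers (for example through FastAPI's `BackgroundTasks.add_task`). State kept in a dict is lost on restart and is not shared between workers, so a real deployment would probably keep the timestamp in the db instead.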

edit: I'm taking a brief detour to make some more general models for activitypub so we can make feeds better on activitypub and do some of the fun social things with interactive feed generation, but will return to this since, well, i'm doing it for this project:

models: https://github.com/p2p-ld/linkml-activitypub

db: https://github.com/p2p-ld/pydantigraph

api: https://github.com/p2p-ld/fastapi-activitypub