internetarchive / fatcat-scholar

search interface for scholarly works
https://scholar.archive.org

RSS feeds for search queries #37

Open bnewbold opened 3 years ago

bnewbold commented 3 years ago

This feature would allow creation of RSS feed endpoints for any search query. The feed would allow users to "subscribe" to new search hits.

Some implementation thoughts:

sckott commented 3 years ago

I'd be happy to test if implemented

bnewbold commented 3 years ago

Re-indexing has finally caught up, and the "papers from the past week" type of query should work, so I'm starting to think about this. And I'm excited!

This library seems like a great, super-simple way to implement an RSS feed in FastAPI, though maybe Atom is preferred? https://pypi.org/project/fastapi-rss/

Presumably there would be a small Jinja2 template that renders a summary into HTML using the existing macros, and that HTML would be injected into the feed items.

There would probably be two new endpoints: a form to help craft a query, with "feed-specific" query parameters, and the RSS endpoint itself (XML).

I think the default parameters should be:

Then the generation page would have a form for other filters, plus a free-form query box (same as the regular search).

The endpoint would run the query with the existing search routine, transform the results, and return them as the feed.
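To make the "take the results, transform, and return as the feed" step concrete, here is a minimal sketch using only the Python standard library. It is not the project's code: the function name `hits_to_rss`, the hit fields (`title`, `url`, `summary_html`, `indexed`), and the example.org URLs are all hypothetical, and the `summary_html` value stands in for whatever the Jinja2 template would render.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def hits_to_rss(query, hits, feed_link):
    """Render search hits as an RSS 2.0 document and return it as a string.

    `hits` is a list of dicts with hypothetical keys: title, url,
    summary_html, indexed (a timezone-aware datetime).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link, and description on the channel.
    ET.SubElement(channel, "title").text = f"Search results: {query}"
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = f"New search hits for: {query}"

    for hit in hits:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = hit["title"]
        ET.SubElement(item, "link").text = hit["url"]
        ET.SubElement(item, "guid").text = hit["url"]
        # HTML summary; ElementTree escapes it when serializing.
        ET.SubElement(item, "description").text = hit["summary_html"]
        # RSS 2.0 dates use the RFC 822 style that format_datetime emits.
        ET.SubElement(item, "pubDate").text = format_datetime(hit["indexed"])

    return ET.tostring(rss, encoding="unicode")


# Hypothetical example data, for illustration only.
example_hits = [
    {
        "title": "An example paper",
        "url": "https://example.org/work/1",
        "summary_html": "<p>Example <b>summary</b></p>",
        "indexed": datetime(2021, 3, 1, tzinfo=timezone.utc),
    }
]
feed_xml = hits_to_rss("example query", example_hits, "https://example.org/feed")
```

A library such as fastapi-rss would replace the hand-built XML here; the sketch only shows the shape of the transformation.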

sckott commented 3 years ago

I don't have a preference between RSS and atom.

The plan sounds great to me.

bnewbold commented 2 years ago

I just pushed a minimal version of this. On search result pages, there is an "RSS Feed" link under the search box, which goes straight to an RSS 2.0 file with the search parameters.
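For anyone who wants to build a feed link by hand rather than clicking through, a rough sketch of how the URL could be assembled from search parameters follows. The `/feed/rss` path and the `q` parameter are taken from the feed URL quoted later in this thread; the helper name `feed_url` is hypothetical, and whether the endpoint accepts parameters other than `q` is an assumption.

```python
from urllib.parse import urlencode

# Path as it appears in the feed URL quoted later in this thread.
FEED_BASE = "https://scholar.archive.org/feed/rss"


def feed_url(params):
    """Build a feed URL from a dict of search parameters (hypothetical helper)."""
    # Skip empty values so the URL only carries parameters that were set.
    cleaned = {key: value for key, value in params.items() if value}
    return f"{FEED_BASE}?{urlencode(cleaned)}"


url = feed_url({"q": "journal:Catena"})
print(url)
```

`urlencode` percent-encodes the colon, so the query comes out as `q=journal%3ACatena`.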

I've been testing it with my feed reader over the past week and it seems to be working. Any feedback welcome!

sckott commented 2 years ago

Awesome! Thanks for getting this working. My original use case went away when I changed jobs, but this will be useful for tracking different paper topics in Feedbin.

lstrtstr commented 1 year ago

> Any feedback welcome!

Many RSS feeds of journal searches seem to have no entries. Is that on purpose or a bug?

For example, if I search for journal:Catena, I get results as recent as 2022 (https://scholar.archive.org/search?q=journal%3ACatena&sort_order=time_desc). The corresponding RSS feed https://scholar.archive.org/feed/rss?q=journal%3ACatena has no entries, though.
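A quick way to confirm that a feed really is empty, rather than a reader failing to display it, is to count the `<item>` elements in the returned XML. The sketch below assumes a standard RSS 2.0 layout (`rss` > `channel` > `item`); the function name `count_rss_items` and the sample documents are made up for illustration and are not actual output from the site.

```python
import xml.etree.ElementTree as ET


def count_rss_items(xml_text):
    """Return the number of <item> elements in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    return len(root.findall("./channel/item"))


# Made-up sample documents, not real responses from the feed endpoint.
empty_feed = (
    '<rss version="2.0"><channel>'
    "<title>t</title><link>https://example.org</link><description>d</description>"
    "</channel></rss>"
)
one_item_feed = (
    '<rss version="2.0"><channel>'
    "<title>t</title><link>https://example.org</link><description>d</description>"
    "<item><title>A paper</title><link>https://example.org/1</link></item>"
    "</channel></rss>"
)

print(count_rss_items(empty_feed))
print(count_rss_items(one_item_feed))
```

To check a live feed, the response body fetched with `urllib.request.urlopen` can be passed to `count_rss_items` in place of the sample strings.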