mediacloud / rss-fetcher

Intelligently fetch lists of URLs from a large collection of RSS Feeds as part of the Media Cloud Directory.
https://search.mediacloud.org/directory
Apache License 2.0

Conditional HTTP fetching #19

Closed philbudne closed 1 year ago

philbudne commented 1 year ago

For initial review/comments/questions.

Implement conditional HTTP fetching for issue #16 using ETag/If-None-Match or Last-Modified/If-Modified-Since headers. If the server determines the feed has not changed, it returns HTTP status code 304 "Not Modified" without a response body.
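For reviewers less familiar with the header exchange, here is a rough sketch of the idea. It is not the code in this PR: `Validators`, `conditional_headers`, and `next_validators` are hypothetical names made up for illustration, and the sketch assumes the validators from the previous fetch are stored per feed.

```python
from typing import Dict, NamedTuple, Optional


class Validators(NamedTuple):
    """Cache validators saved from the last successful fetch (hypothetical type)."""
    etag: Optional[str] = None
    last_modified: Optional[str] = None


def conditional_headers(saved: Validators) -> Dict[str, str]:
    """Build request headers that make the next GET conditional."""
    headers: Dict[str, str] = {}
    if saved.etag:
        # Echo the ETag back verbatim, including its quotes.
        headers["If-None-Match"] = saved.etag
    if saved.last_modified:
        # Echo the Last-Modified string back verbatim rather than reformatting it.
        headers["If-Modified-Since"] = saved.last_modified
    return headers


def next_validators(status_code: int,
                    response_headers: Dict[str, str],
                    saved: Validators) -> Validators:
    """Decide which validators to store after a response.

    Assumes response_headers uses canonical header capitalization; a plain
    dict is case sensitive, unlike the header mappings most HTTP clients return.
    """
    if status_code == 304:
        # Not Modified: no body to parse. Keep the saved validators unless
        # the server sent replacements along with the 304.
        return Validators(
            response_headers.get("ETag", saved.etag),
            response_headers.get("Last-Modified", saved.last_modified),
        )
    # Any other response: store only what this response supplied.
    return Validators(
        response_headers.get("ETag"),
        response_headers.get("Last-Modified"),
    )
```

With an HTTP client such as `requests`, the dict from `conditional_headers` would be passed as the `headers` argument of the GET, and a 304 status would mean the feed can be skipped without parsing.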

Implementing the basic functionality was easy; integrating 304 response handling is more subtle. I've tried to leave comments about HTTP header semantics to explain my choices.

Note: the alembic changes to allow database creation are in a single commit that I can drop if they are not wanted.