Closed philbudne closed 9 months ago
Maybe system_source_name or mc_source_name, plus a comment, as it's carrying the source of that record within the larger mc system?
@thepsalmist any review comments?
LGTM!
NOTE! Adds RSSEntry fields to preserve feed/source id data in CSV files.

PLEASE comment on the question at https://github.com/philbudne/story-indexer/blob/hist-update/indexer/story.py#L119 (file_name field)

Fetcher will also accept feed_url, feed_id, source_id from RSS file <source> tag.

Added --rss-file option to fetcher, to test parsing new RSS files not yet in production.

Set RSSEntry.fetch_date to the date in the name of the CSV file.

Cleaned up version picking (not needed for 2023) to use date from CSV file column instead of fetch_date.

hist-queuer: let hist-fetcher validate the downloads_id
hist-fetcher: quarantine stories with invalid downloads_id
parser: drop messages with no HTML (count only, no quarantine)
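
For reviewers, here is a rough sketch of what "set fetch_date to the date in the name of the CSV file" could look like. It is not the code in this PR: the helper name `date_from_csv_name` and the assumed file name shape (a `YYYY-MM-DD` or `YYYY_MM_DD` run somewhere in the base name) are hypothetical, chosen only for illustration.

```python
import re
from datetime import date
from pathlib import Path
from typing import Optional

# Assumption: the CSV base name contains a date like 2023-01-15 or 2023_01_15.
# The real naming scheme used by the historical CSV files may differ.
_DATE_RE = re.compile(r"(\d{4})[-_](\d{2})[-_](\d{2})")


def date_from_csv_name(path: str) -> Optional[str]:
    """Return the ISO date (YYYY-MM-DD) found in a CSV file name, or None."""
    match = _DATE_RE.search(Path(path).name)
    if match is None:
        return None
    year, month, day = (int(group) for group in match.groups())
    try:
        # date() rejects impossible values such as month 13
        return date(year, month, day).isoformat()
    except ValueError:
        return None
```

Returning None rather than raising leaves it to the caller to decide whether a file with no usable date should be skipped or quarantined.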