Open zaneselvans opened 6 months ago
I noticed this recently while working on the FERC archivers. The main RSS feed contains only the most recent filings, while older filings can only be found in month-specific feeds. This leads to collisions when a recent filing appears in both a month-specific feed and the main feed. Both copies should be the exact same filing, so it shouldn't really be a problem, but I think it would be best to fix this and raise an error if we see unexpected duplicates.
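A rough sketch of what "deduplicate, but raise on unexpected duplicates" could look like. This is not the archiver's actual code: `Filing`, `merge_feeds`, and the `sha256` field are hypothetical names, and it assumes each feed entry can be reduced to a filename plus a content hash.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Filing:
    """Hypothetical stand-in for one feed entry."""

    filename: str
    sha256: str  # assumed content hash; any stable identifier would do


def merge_feeds(*feeds):
    """Merge filings from several feeds, keyed by filename.

    Identical duplicates (same filename and hash) are dropped silently.
    A filename that shows up with different content raises ValueError.
    """
    merged = {}
    for feed in feeds:
        for filing in feed:
            seen = merged.get(filing.filename)
            if seen is None:
                merged[filing.filename] = filing
            elif seen != filing:
                raise ValueError(
                    f"Conflicting duplicates for {filing.filename!r}: "
                    f"{seen.sha256} != {filing.sha256}"
                )
    return merged
```

With this shape, a filing that appears in both the main feed and a month-specific feed is kept once, and only a genuine mismatch stops the run.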
I tried running the FERC Form 1 archiver locally and saw a number of warnings about duplicate filenames in zipfiles. E.g.
Do we expect there to be filename collisions?
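For checking an archive that has already been written, something like the following might help. It is a minimal sketch using only the standard library `zipfile` module; `duplicate_names` is a hypothetical helper, not something that exists in the archiver.

```python
import zipfile
from collections import Counter


def duplicate_names(zip_source):
    """Return the sorted member names that occur more than once in a zip.

    `zip_source` can be a path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as zf:
        # namelist() has one entry per archive member, repeats included.
        counts = Counter(zf.namelist())
    return sorted(name for name, n in counts.items() if n > 1)
```

An empty list would mean the zipfile has no filename collisions.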