neuml / paperetl

📄 ⚙️ ETL processes for medical and scientific papers
Apache License 2.0

Detect month changes in CORD-19 entry date process #33

Closed: davidmezzetti closed this issue 3 years ago

davidmezzetti commented 3 years ago

Currently, the entry date download process assumes there is a metadata.csv file for every day. Since the data source changed to biweekly updates, there may be no metadata.csv file for the 1st of the month. Add logic to detect month changes and use the earliest available metadata.csv file in each month instead.
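
A rough sketch of the month-change logic described above, not the actual paperetl implementation. It assumes the release dates for which a metadata.csv exists are already known as a list; the function name `earliest_per_month` and the sample dates are hypothetical and only for illustration.

```python
from datetime import date


def earliest_per_month(dates):
    """Return the earliest available date for each (year, month), in order."""
    earliest = {}
    for d in dates:
        key = (d.year, d.month)
        # A new key marks a month change; otherwise keep the earlier date.
        if key not in earliest or d < earliest[key]:
            earliest[key] = d
    return [earliest[key] for key in sorted(earliest)]


# Hypothetical release dates (illustrative only, not real CORD-19 releases).
available = [
    date(2020, 9, 28),
    date(2020, 10, 5),
    date(2020, 10, 19),
    date(2020, 11, 2),
]

monthly = earliest_per_month(available)
for d in monthly:
    print(d.isoformat())
```

With these sample dates, October has no file for the 1st, so the sketch falls back to the first release that does exist in that month (October 5).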