adsabs / ADSDocMatchPipeline

Pipeline to match publisher document with preprint counterpart and vice versa
MIT License

oracle_util needs to be aware of new filename rules #11

Closed by seasidesparrow 4 months ago

seasidesparrow commented 1 year ago

Currently, when uploading xlsx files to the oracle db, oracle_util looks for filenames of the form '.compare' for daily eprint matching and '.pubcompare' for weekly publication matching. The new filenames will instead contain 'compare_eprint' and 'compare_pub', respectively. Because these names are derived from the config variables DOCMATCHPIPELINE_EPRINT_COMBINED_FILENAME and DOCMATCHPIPELINE_PUB_COMBINED_FILENAME, L356 and L358 should probably match against the contents of these variables rather than a fixed string.
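A rough sketch of what config-driven matching could look like. This is not the actual oracle_util code: the function name `match_source`, the example config values, and the sample filenames are all hypothetical, and the sketch assumes the config values are filenames whose stem (the part before the extension) shows up in the xlsx name.

```python
import os

# Hypothetical config values for illustration only; the real values
# come from the pipeline's own config.
config = {
    "DOCMATCHPIPELINE_EPRINT_COMBINED_FILENAME": "compare_eprint.csv",
    "DOCMATCHPIPELINE_PUB_COMBINED_FILENAME": "compare_pub.csv",
}


def match_source(filename, config):
    """Return 'eprint' or 'pub' if the configured stem appears in the
    file's basename, otherwise None."""
    base = os.path.basename(filename)
    # Strip the extension from each configured name so that e.g.
    # 'compare_eprint.csv' yields the stem 'compare_eprint'.
    stems = {
        "eprint": os.path.splitext(config["DOCMATCHPIPELINE_EPRINT_COMBINED_FILENAME"])[0],
        "pub": os.path.splitext(config["DOCMATCHPIPELINE_PUB_COMBINED_FILENAME"])[0],
    }
    for source, stem in stems.items():
        # Skip empty stems so a blank config value never matches everything.
        if stem and stem in base:
            return source
    return None


# Hypothetical filenames, not taken from the pipeline.
print(match_source("/data/2024-01-01.compare_eprint.xlsx", config))
print(match_source("/data/2024-01-01.compare_pub.xlsx", config))
```

With this approach, renaming the combined files only requires a config change. One caveat: if one configured stem were ever a substring of the other, the check order would matter and a stricter comparison would be needed.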

seasidesparrow commented 4 months ago

Curated spreadsheet handling has been deprecated as of June 2024.