Closed: cutright closed this issue 3 years ago
Might move to this library: https://github.com/scrapinghub/dateparser
I think the solution is to have IQDM-PDF write file creation dates into its CSV output, then have IQDMA fall back to that date when date parsing fails (or maybe try a day-first mode before falling back). dateparser is definitely slower than python-dateutil.
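A minimal sketch of that fallback order, not the actual IQDMA code. It uses only the standard library's datetime.strptime so it runs without extra dependencies. The function name parse_report_date, the file_created argument, and the two format lists are hypothetical; real reports may use other layouts.

```python
from datetime import datetime
from typing import Iterable, Optional

# Hypothetical format lists; extend them to match the real report layouts.
MONTH_FIRST_FORMATS = ("%m/%d/%Y", "%m/%d/%Y %H:%M")
DAY_FIRST_FORMATS = ("%d/%m/%Y", "%d/%m/%Y %H:%M")


def _try_formats(text: str, formats: Iterable[str]) -> Optional[datetime]:
    """Return the first successful strptime result, or None if all fail."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None


def parse_report_date(text: Optional[str], file_created: datetime) -> datetime:
    """Parse month-first, then day-first, then fall back to file_created."""
    text = (text or "").strip()
    if text:
        parsed = _try_formats(text, MONTH_FIRST_FORMATS)
        if parsed is None:
            parsed = _try_formats(text, DAY_FIRST_FORMATS)
        if parsed is not None:
            return parsed
    # Empty or unparseable date: use the file creation timestamp instead.
    return file_created
```

Trying month-first before day-first keeps the current behavior for unambiguous US-style dates. A string such as 3/4/2019 is still ambiguous and will be read as March 4.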
Flipping day and month does not cause the error above:
>>> from dateutil.parser import parse as date_parser
>>> date_parser('3/30/2020')
datetime.datetime(2020, 3, 30, 0, 0)
>>> date_parser('30/3/2020')
datetime.datetime(2020, 3, 30, 0, 0)
If not fixed in d7d6bcbbb472fffb59a522155a108cb6cdc9711d, appears to be fixed in e0f632f271a13102e437c182a9ac7ff86c2e72b5
Reproduced the issue with SNCPatient2020 results from IQDM-PDF. It occurs when the date_col is empty for a row of data. Since there is a file creation timestamp now, the code should default to it when no date data was parsed.
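As a rough illustration of defaulting to the file creation timestamp, the sketch below fills an empty date column from a second column before any parsing happens. The function name fill_missing_dates and the column names in the usage example are assumptions, not the names used by IQDM-PDF or IQDMA.

```python
from typing import Dict, List


def fill_missing_dates(
    rows: List[Dict[str, str]], date_col: str, created_col: str
) -> List[Dict[str, str]]:
    """Return copies of rows with empty date_col values taken from created_col."""
    filled = []
    for row in rows:
        row = dict(row)  # copy so the caller's data is not modified
        if not (row.get(date_col) or "").strip():
            row[date_col] = row.get(created_col, "")
        filled.append(row)
    return filled


# Hypothetical column names for illustration only.
example_rows = [
    {"Plan Date": "3/30/2020", "File Created": "4/1/2020"},
    {"Plan Date": "", "File Created": "4/2/2020"},
]
example_filled = fill_missing_dates(example_rows, "Plan Date", "File Created")
```

Rows that already have a date are left unchanged, so the file creation timestamp is only used as a last resort.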
A user is reporting the following error. Some SNC Patient (pre-2020) reports are swapping month and day.
Should update widen_data to handle flipped dates, but ideally we can detect the swapping from the report itself in IQDM-PDF.
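One possible heuristic for detecting the swap, offered as a sketch and not as what IQDM-PDF does: look at all date strings from a single report and check whether any of them is only valid in one ordering. The function name infer_day_first and the assumed N/N/YYYY layout are hypothetical.

```python
import re
from typing import Iterable, Optional

# Assumes dates look like 3/30/2020 or 30/3/2020 at the start of the string.
_DATE_RE = re.compile(r"^\s*(\d{1,2})/(\d{1,2})/(\d{4})")


def infer_day_first(date_strings: Iterable[Optional[str]]) -> Optional[bool]:
    """Return True for day-first, False for month-first, None if undecided."""
    saw_day_first = False
    saw_month_first = False
    for text in date_strings:
        match = _DATE_RE.match(text or "")
        if not match:
            continue
        first, second = int(match.group(1)), int(match.group(2))
        if first > 12 and second <= 12:
            saw_day_first = True  # first field cannot be a month
        elif second > 12 and first <= 12:
            saw_month_first = True  # second field cannot be a month
    if saw_day_first == saw_month_first:
        # Either no decisive date was seen or the evidence conflicts.
        return None
    return saw_day_first
```

When the result is None, every date in the report has both fields at or below 12 (or the report mixes orderings), so the caller would still need another signal such as the file creation timestamp.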