Closed: krivard closed this issue 3 years ago
The file in question: 20200828_state_deaths_7dav_cumulative_prop.csv.zip
I can take a look at this -- can you point me at the code where this error occurs?
here's where the correct handling example is logged: https://github.com/cmu-delphi/delphi-epidata/blob/801f2729ea80b42891aa282c858c489a47049082/src/acquisition/covidcast/csv_importer.py#L302
it's unexpected that a file would be archived as failed without any corresponding log message describing the failure reason.
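One way to guarantee a log message accompanies every failed archive is to route all failure paths through a wrapper that logs the reason first. This is only a hypothetical sketch (the function name, signature, and message format are assumptions, not the actual csv_importer.py code):

```python
import logging

logger = logging.getLogger("csv_importer_sketch")

def archive_as_failed(path, reason):
    """Hypothetical wrapper: always log a reason before archiving a failed CSV,
    so no file can land in the failed archive without a corresponding log line."""
    message = f"archiving as failed: {path} (reason: {reason})"
    logger.warning(message)
    # ... the real code would then move `path` into the failed-archive directory ...
    return message  # returned here only to make the sketch easy to test
```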
Thanks David -- that looks like enough to start debugging the issue. To understand more broadly, where does this code get called from? In one of the automation steps?
yep, that's right. here's how it happens:
- task#33 runs step#63 every hour
- step#63 causes the steps in flow#7 to run sequentially
- flow#7 includes step#60, which runs the python file csv_to_database.py
- csv_to_database imports and calls CsvImporter from csv_importer.py

Still debugging, adding some findings:
```python
cc_rows = CovidcastRow.fromCsvRows(csv_rows, source, signal, time_type, geo_type, time_value, issue, lag, is_wip)
```

returns a list of rows from the file:

```
ak 4.7801364393158705 None None
al 40.11095411888978 None None
ar 23.64538679687046 None None
az 66.50701137105816 None None
...
```
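For context on why a value like this can get through CSV parsing at all: Python's `float()` accepts the strings `inf` and `nan` without raising, so an infinite cell parses cleanly and only causes trouble downstream. A minimal demonstration (standalone, not the actual importer code):

```python
import math

# float() accepts "inf"/"nan" strings, so a CSV cell holding "inf"
# parses without error; the problem only surfaces later (e.g. at insert time).
value = float("inf")
print(value)                        # inf
print(math.isfinite(value))         # False
print(float("40.11095411888978"))   # a normal cell parses to a finite float
```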
The `deaths_7dav_cumulative_prop False` line is the `print(signal, is_wip)` from line 80.

I suspect that the root cause of the problem may be an infinite value in the data: the file contains the row `pr inf None None`. My best guess right now is that this causes the mysql insert to fail, producing a return value of 0 rows. When 0 is returned, the code falls through to `archive_as_failed`, which would match the error messages in this issue.
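If that hypothesis holds, one fix direction is a pre-insert check that separates non-finite values so they can be logged instead of silently failing the whole insert. A sketch under the assumption that each row reduces to a (geo, value) pair (the helper name and row shape are hypothetical):

```python
import math

def split_finite_rows(rows):
    """Hypothetical pre-insert check: separate rows whose value is not a
    finite float (inf, -inf, nan) so they can be logged explicitly rather
    than causing the mysql insert to fail with no explanation."""
    good, bad = [], []
    for geo, value in rows:
        if value is not None and math.isfinite(value):
            good.append((geo, value))
        else:
            bad.append((geo, value))
    return good, bad

rows = [("ak", 4.7801364393158705), ("pr", float("inf")), ("al", 40.11095411888978)]
good, bad = split_finite_rows(rows)
# bad now holds the ("pr", inf) row matching the suspected root cause
```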
Fixed in PR.
There is some failure case that's not being adequately logged:
Correct handling of a failed CSV produces a log message (the csv_importer.py line linked above); this file was archived as failed without one.