simonw closed this issue 4 years ago
Here are the two CSV files that should have been imported: https://biglocal.datasettes.com/biglocal/files?project=UHJvamVjdDpiMGVmMjIyYS0zNzE4LTRhZTgtYWJjNC1lNzA3M2M0MDFmZGQ%3D&_facet=ext&ext=csv
project | project_label | ext | createdAt | name | updatedAt | uri | uriType | size | etag
---|---|---|---|---|---|---|---|---|---
UHJvamVjdDpiMGVmMjIyYS0zNzE4LTRhZTgtYWJjNC1lNzA3M2M0MDFmZGQ= | COVID_CDC_SVI | csv | 2020-03-18T23:40:26.279000+00:00 | SVI2018_US.csv | 2020-03-18T23:40:26.279000+00:00 | download | 44704458 | "34dc9bc76fc77f227af4d4ffe4e69c15"
UHJvamVjdDpiMGVmMjIyYS0zNzE4LTRhZTgtYWJjNC1lNzA3M2M0MDFmZGQ= | COVID_CDC_SVI | csv | 2020-03-18T23:40:02.307000+00:00 | SVI2018_US_COUNTY.csv | 2020-03-18T23:40:02.307000+00:00 | download | 1885419 | "01ce905c04ddf3b7bff299f0dbc05543"
That fixed it. https://biglocal.datasettes.com/COVID_CDC_SVI
... and it's gone again! https://biglocal.datasettes.com/COVID_CDC_SVI
https://github.com/simonw/big-local-datasette/runs/590623330?check_suite_focus=true
```
total 31M
drwxr-xr-x 2 runner docker 4.0K Apr 15 23:53 .
drwxr-xr-x 7 runner docker 4.0K Apr 15 23:53 ..
-rw-r--r-- 1 runner docker  80K Apr 15 23:53 COVID_AHA_Hospital_beds.db
-rw-r--r-- 1 runner docker    0 Apr 15 23:53 COVID_CDC_SVI.db
-rw-r--r-- 1 runner docker 512K Apr 15 23:53 COVID_COVID19Tracker.db
-rw-r--r-- 1 runner docker  13M Apr 15 23:53 COVID_HospitalBeds_CountyDemographics_NursingHomes.db
-rw-r--r-- 1 runner docker 3.9M Apr 15 23:53 COVID_National_Health_Security_Preparedness_Index.db
-rw-r--r-- 1 runner docker  14M Apr 15 23:53 COVID_USAFacts_county_cases.db
-rw-r--r-- 1 runner docker    0 Apr 15 23:53 COVID_twitter_data.db
-rw-r--r-- 1 runner docker 156K Apr 15 23:53 biglocal.db
-rw-r--r-- 1 runner docker 2.2K Apr 15 23:53 databases.json
-rw-r--r-- 1 runner docker  21K Apr 15 23:53 metadata.json
```
I'll try re-running each step from the Action on my laptop to see if I can replicate what's happening.
Part of the problem is here:
https://github.com/simonw/big-local-datasette/blob/c9a8908a8b214950d17d4dac30d8697b8019e8ce/populate_tables.py#L39-L41
Once a 0 byte file is on disk, it will be skipped in the future because the hash in the local copy of databases.json stays the same.
I'm still not sure how we got the 0 byte files in the first place though!
I'm going to say "always download if the local DB file is missing or 0 bytes".
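Something like this guard should cover it. This is just a sketch of the rule, not the actual `populate_tables.py` code; `needs_download` and the etag bookkeeping names are hypothetical:

```python
from pathlib import Path


def needs_download(db_path, remote_etag, local_etags):
    """Decide whether a project's database file must be re-fetched.

    A matching etag alone is not enough: an interrupted download can leave
    a 0-byte file on disk while the recorded etag still matches, so the
    file would be skipped forever. (Hypothetical names, not the actual
    populate_tables.py API.)
    """
    path = Path(db_path)
    # Always re-download if the local file is missing or 0 bytes.
    if not path.exists() or path.stat().st_size == 0:
        return True
    # Otherwise only re-download when the remote etag has changed.
    return local_etags.get(path.name) != remote_etag
```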
https://biglocal.datasettes.com/COVID_CDC_SVI is still empty.
I found the tables! For some reason they ended up in the incorrect database:
https://biglocal.datasettes.com/COVID_twitter_data
Plus the debug output says: https://github.com/simonw/big-local-datasette/runs/594103343?check_suite_focus=true
```
Fetching SVI2018_US into DB COVID_AHA_Hospital_beds
SVI2018_US 44704458
Fetching SVI2018_US_COUNTY into DB COVID_AHA_Hospital_beds
SVI2018_US_COUNTY 1885419
```
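I haven't confirmed the cause yet, but one common way to get this symptom is opening a single database connection outside the loop and reusing it for every project, so every project's tables land in whichever file was opened first. A minimal sketch of the corrected shape, using stdlib `sqlite3` and hypothetical names rather than the actual `populate_tables.py` code:

```python
import sqlite3


def import_projects(projects, dest_dir="."):
    """Sketch: one database file per project (hypothetical names).

    If the connection were created once, before the loop, every project's
    tables would end up in the first file opened -- the symptom above,
    where SVI2018_US landed in COVID_AHA_Hospital_beds. Opening the
    connection *inside* the loop sends each table to its own file.
    """
    for project in projects:
        # Fresh connection per project, so tables go to the right file.
        conn = sqlite3.connect("{}/{}.db".format(dest_dir, project["name"]))
        for table, rows in project["files"].items():
            conn.execute('CREATE TABLE IF NOT EXISTS "{}" (value TEXT)'.format(table))
            conn.executemany(
                'INSERT INTO "{}" VALUES (?)'.format(table),
                [(r,) for r in rows],
            )
        conn.commit()
        conn.close()
```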
https://biglocal.datasettes.com/COVID_CDC_SVI has one table now, but it should have two.
Likewise https://biglocal.datasettes.com/COVID_twitter_data has one table, but it should have five. Maybe they are too big? https://biglocal.datasettes.com/biglocal/files?_facet=project&project=UHJvamVjdDo4NTBjOWJmYy03YzAyLTRkNDgtYjYzMS04OThhODFmZjQxNDQ%3D&_facet=ext&ext=csv#facet-project
Fixed!
There are two CSV files in that project, but this is happening: https://biglocal.datasettes.com/