KlaasH closed this 1 year ago
I just tested your script, but there is still an issue:
+ echo 'DONE: Importing neighborhood_boundary'
+ '[' nonus == USA ']'
+ update_status IMPORTING 'Downloading census blocks'
+ echo 'Updating job status: IMPORTING' 'Downloading census blocks'
+ '[' -n '' ']'
+ '[' nonus == USA ']'
+ NB_BLOCK_FILENAME=population
+ S3_PATH=s3:///data/population.zip
+ '[' -f /data/population.zip ']'
+ echo 'Using local census blocks file'
+ BLOCK_DOWNLOAD=/data/population.zip
+ unzip /data/population.zip -d /tmp/tmp.gBCTxzkk8v/import_neighborhood
+ '[' nonus '!=' USA ']'
+ cd /tmp/tmp.gBCTxzkk8v/import_neighborhood
+ rm -f population.zip
+ for x in *
+ mv -u population.cpg population.cpg
mv: 'population.cpg' and 'population.cpg' are the same file
2022-08-23 20:59:46.788 UTC [14] LOG: received fast shutdown request
2022-08-23 20:59:46.789 UTC [14] LOG: aborting any active transactions
2022-08-23 20:59:46.796 UTC [14] LOG: background worker "logical replication launcher" (PID 254) exited with exit code 1
2022-08-23 20:59:46.797 UTC [247] LOG: shutting down
2022-08-23 20:59:46.867 UTC [14] LOG: database system is shut down
You may want to use `mv -f` instead then.
Once again my test case wasn't as good as I thought it was--I was using `population.zip`, but still not a version with the files inside also named "population". When I try to `mv -f` a file to itself locally, I still get the "same file" error, but it looks like `-n` (no-clobber) will do the job. I pushed a fixup commit, and this time I definitely had a `population.zip` for testing that failed the old way but worked the new way, so hopefully this does the trick.
It worked like a charm 👍
+ update_status IMPORTING 'Downloading census blocks'
+ echo 'Updating job status: IMPORTING' 'Downloading census blocks'
+ '[' -n '' ']'
+ '[' nonus == USA ']'
+ NB_BLOCK_FILENAME=population
+ S3_PATH=s3:///data/population.zip
+ '[' -f /data/population.zip ']'
Using local census blocks file
+ echo 'Using local census blocks file'
+ BLOCK_DOWNLOAD=/data/population.zip
+ unzip /data/population.zip -d /tmp/tmp.TM85eALmZj/import_neighborhood
Archive: /data/population.zip
extracting: /tmp/tmp.TM85eALmZj/import_neighborhood/population.shx
extracting: /tmp/tmp.TM85eALmZj/import_neighborhood/population.cpg
extracting: /tmp/tmp.TM85eALmZj/import_neighborhood/population.shp
extracting: /tmp/tmp.TM85eALmZj/import_neighborhood/population.dbf
extracting: /tmp/tmp.TM85eALmZj/import_neighborhood/population.prj
+ '[' nonus '!=' USA ']'
+ cd /tmp/tmp.TM85eALmZj/import_neighborhood
+ rm -f population.zip
+ for x in *
+ mv -n population.cpg population.cpg
+ for x in *
+ mv -n population.dbf population.dbf
+ for x in *
+ mv -n population.prj population.prj
+ for x in *
+ mv -n population.shp population.shp
+ for x in *
+ mv -n population.shx population.shx
+ cd -
/opt/pfb/analysis/scripts
+ update_status IMPORTING 'Loading census blocks'
Overview
When downloading a population file from S3, the file gets copied to the temporary directory and needs to be cleaned up, but when using the default local file (`/data/population.zip`) it doesn't, so the `rm` command to remove it after extraction fails. Also, the command to rename the unzipped files to the default filename is necessary if they're not already named that (e.g. if the file is called `population.zip` but the files inside are called something else), but it crashes if they are. This changes both commands so they'll still do their thing but will be more tolerant of the files being missing or already named correctly.
Notes
This is an alternative to PR #902. I believe it addresses the issue raised there but preserves the cleanup that's needed when the population file is downloaded rather than loaded from the default local path.
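For illustration, here is a rough sketch of what the two tolerated steps look like. The variable name `NB_BLOCK_FILENAME` and the file layout are taken from the traces above, but the stand-in filenames and this exact loop are assumptions, not the literal `import_neighborhood.sh` source:

```shell
# Sketch of the tolerant cleanup + rename step after unzipping
# (assumed names; not the literal import_neighborhood.sh code).
set -e
NB_BLOCK_FILENAME=population
IMPORT_DIR="$(mktemp -d)"
cd "$IMPORT_DIR"
touch blocks.shp blocks.dbf blocks.prj   # stand-ins for extracted shapefile parts

# -f: succeed even when no zip was copied here (default local file case)
rm -f "$NB_BLOCK_FILENAME.zip"

for x in *; do
    # Rename each piece to the expected base name, keeping its extension.
    # -n (no-clobber) silently skips files already named correctly
    # instead of crashing with "same file" the way -u did.
    mv -n "$x" "$NB_BLOCK_FILENAME.${x##*.}"
done

ls   # files are now population.dbf, population.prj, population.shp
```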
Testing Instructions
- Run an analysis with no `PFB_POP_URL` value provided, so that it gets the population file from `/data/population.zip`. It should get past the place in `import_neighborhood.sh` that was crashing.
- Run an analysis with a `PFB_POP_URL` provided. It should load the file from that location and should also get past that spot in `import_neighborhood.sh`.