Closed · mradamcox closed this 1 week ago
I made use of a (very lightly modified) version of the `upload_to_s3` function from `utils.py` in order to follow the pattern from the `census.py` upload. The main potential problem I see with this is that the S3 key for the hosted JCOIN data depends on the destination the JCOIN data is being sent to, so most direct additions of a purely hardcoded key variable would break the other usage of `upload_to_s3`.
I'm not sure that we do want a hardcoded name, but if we do, there are a few possible approaches I see (although they all feel sufficiently far afield of the direct task that I'd want feedback before implementing any of them):

1. Modify `upload_to_s3` to handle the cases where it is passed a single path versus a list of paths separately. *This is doable but likely to result in a hard-to-read function.*
2. Add a new single-path upload function; the existing `upload_to_s3` could be renamed `upload_files_to_s3` to differentiate the two functions. *This likely results in the most readable code, so I think I lean towards it.*
3. Put the upload logic in `clients/jcoin.py` directly. *This is plausible but would feel contradictory to the current organization of the Flask CLI.*

Ok, I think we should handle this by getting more opinionated about where the exports go and how they are named. For now:
- If no `-d/--destination` is specified (it is an optional argument, even though I didn't treat it as such...), the data package will go within `current_app.config['CACHE_DIR'] / 'data-packages'`, and it will be named `oeps-data-package-v2_YYYY-MM-DD`.
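The default path rule above could be sketched like this. This is a hypothetical helper (`default_destination` is not a name from the repo), and it takes the cache directory as an argument rather than reading `current_app.config['CACHE_DIR']`, which is what the real Flask code would do:

```python
from datetime import date
from pathlib import Path

def default_destination(cache_dir: str) -> Path:
    """Hypothetical sketch: build the default export path described above.

    The real implementation would read current_app.config['CACHE_DIR']
    inside the Flask app context instead of taking an argument.
    """
    name = f"oeps-data-package-v2_{date.today().isoformat()}"
    return Path(cache_dir) / "data-packages" / name
```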
- Append `_no_foreign_keys` to this name when foreign keys are excluded.
- Default behavior should be to overwrite an existing export with the same name (i.e. from the same day). Or, with a little more work, default behavior would check whether the output directory exists and prompt for user confirmation of the overwrite, while an `--overwrite` flag would skip that prompt. I don't see that as crucial, but it may be nice to add. The prompt could go in `utils` and be reused, etc.
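The reusable prompt could look something like this. A minimal sketch, assuming the helper lives in `utils.py`; the name `confirm_overwrite` and its signature are illustrative, not existing code:

```python
from pathlib import Path

def confirm_overwrite(destination: Path, overwrite: bool = False) -> bool:
    """Hypothetical utils helper: return True if it is safe to write.

    If the destination exists and is non-empty, prompt the user unless
    overwrite=True (i.e. the --overwrite flag was passed, skipping the prompt).
    """
    if overwrite or not destination.exists() or not any(destination.iterdir()):
        return True
    answer = input(f"{destination} already exists. Overwrite? [y/N] ")
    return answer.strip().lower() in ("y", "yes")
```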
For `upload_to_s3`, let's modify it to accept either one path or a list of paths, and use `isinstance(paths, list)` to check and turn a single path into a list.
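That normalization step could be sketched as below. The `isinstance` check is the one proposed above; the surrounding `upload_to_s3` signature is illustrative (the real one in `utils.py` may differ), and the boto3-style `client` is injected here just to keep the sketch self-contained:

```python
from pathlib import Path

def normalize_paths(paths):
    """The isinstance check proposed above: accept one path or a list of
    paths, and always hand back a list of Path objects."""
    if not isinstance(paths, list):
        paths = [paths]
    return [Path(p) for p in paths]

def upload_to_s3(paths, bucket, client, prefix="oeps"):
    """Illustrative only; the real signature in utils.py may differ.
    `client` is assumed to be a boto3 S3 client, injected for testability."""
    for path in normalize_paths(paths):
        client.upload_file(str(path), bucket, f"{prefix}/{path.name}")
```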
Overall, do you think this approach would work?
Added the default destination bit and created an `--overwrite` flag! Current behavior is that if there is overwrite risk (i.e. the destination exists and is non-empty), the user is prompted.
Using the existing `jcoin create-data-package` command, add a new flag that uploads the zipped output (created if the `--zip` argument is used) directly to S3. Follow the pattern used for S3 uploads in the `clients/census.py` operations. Use the same S3 bucket, and place the upload within the `/oeps/` prefix.
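The flag wiring described above might look like the following click sketch. The flag name `--upload` is an assumption (the issue doesn't name it), and the command body is a stub; the real command builds the data package and would call the shared S3 upload helper:

```python
import click

@click.command("create-data-package")
@click.option("--zip", "zip_", is_flag=True, help="Zip the output directory.")
@click.option("--upload", is_flag=True,
              help="Hypothetical flag name: upload the zipped output to S3.")
def create_data_package(zip_, upload):
    # Sketch of the flag wiring only; uploading requires a zip to exist.
    if upload and not zip_:
        raise click.UsageError("--upload requires --zip")
    click.echo(f"zip={zip_} upload={upload}")
```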