IMAP-Science-Operations-Center / sds-data-manager

MIT License
0 stars 9 forks source link

ENH: Enable SPICE file uploads to the SDC #264

Closed greglucas closed 1 month ago

greglucas commented 2 months ago

Change Summary

Overview

This takes in any SPICE suffix in the filename, tests to see if there is already a file with that name (returns 409 if so), and then adds that file to the flat namespace imap/spice/filename. There is a bit of additional refactoring of the upload code to re-use the pre-signed url and responses in the SPICE section.

Tested by creating a .ck file suffix and uploading that, which put the file in s3 in the place expected. Ran it again and got a 409 conflict response.

touch /tmp/test-spice-file.ck
IMAP_DATA_DIR=/private/tmp imap-data-access upload /tmp/test-spice-file.ck

This brought up the fact that we require SPICE files to also be within the IMAP_DATA_DIR (from within imap-data-access), do we want to enforce that for uploads too? Add an IMAP_SPICE_DIR, or just allow the wild west for file uploads? This can be fixed in follow-up PRs, but this one got me to think about it.

subagonsouth commented 2 months ago

Tested by creating a .ck file suffix and uploading that, which put the file in s3 in the place expected. Ran it again and got a 409 conflict response.

Is this the only test coverage for this code? I poked around in the repo a bit and found a test_download_api.py file but no apparent coverage for the upload api.

greglucas commented 2 months ago

Tested by creating a .ck file suffix and uploading that, which put the file in s3 in the place expected. Ran it again and got a 409 conflict response.

Is this the only test coverage for this code? I poked around in the repo a bit and found a test_download_api.py file but no apparent coverage for the upload api.

Our test coverage of infrastructure is definitely lacking. Thank you for bringing this up. I will update this because I agree with you that we should do better when we can and I don't think it will be too hard to add here.

greglucas commented 1 month ago

@subagonsouth thanks for bringing up the lack of testing. I took some time and added that here now. Unfortunately, that blew things up a bit in scope as our tests were somewhat stepping on each other. So I ended up refactoring a bunch of minor things that were needed to get the upload tests working due to monkeypatching.

@tech3371 and @laspsandoval I dismissed your reviews because there are significant changes since you last reviewed.

The most controversial change is that I have removed the database check for existing file from upload_api. Now we are checking the s3 bucket if there is an object in the data bucket or not. Who should be the source of truth, the data bucket or the database table? I find this to be easier (avoids a database connection/query), but it is a change from what we were doing.