Open tech3371 opened 10 months ago
A note for the first indexer.py issue: currently, uploading an L1A file through the API upload function does not result in that file appearing in query results. Once this issue is resolved, you should be able to upload an L1A file and see it in the query.
List of things we need to figure out and fix:
Infrastructure
indexer.py updates: #249
[x] Update the S3 event rule to listen to the whole imap/ folder. Right now the rule only allows L0 data through, which caused some issues: if an L0 data file contains multiple APIDs, our processing code can produce more than one file, and the batch starter lambda won't know about the extra files.
[x] Decouple the StatusTracking and FileCatalog tables. The status table will track instrument, level, upstream_file(?), and batch job information. The file catalog table will only store information about files that were processed and uploaded to the S3 bucket.
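As a rough illustration of the event rule item above, a rule that matches every object under imap/ might look like the sketch below. It assumes the rule is an EventBridge rule on S3 "Object Created" events, which this issue does not confirm. The bucket name, the helper names, and the example keys are hypothetical.

```python
import json


def build_s3_event_pattern(bucket_name: str, key_prefix: str = "imap/") -> dict:
    """Build an EventBridge-style pattern for objects created under a key prefix.

    Assumption: the bucket publishes "Object Created" events to EventBridge.
    """
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {
            "bucket": {"name": [bucket_name]},
            # Prefix matching lets every data level through, not only L0.
            "object": {"key": [{"prefix": key_prefix}]},
        },
    }


def key_matches(pattern: dict, key: str) -> bool:
    """Local stand-in for the prefix filter, handy in unit tests."""
    prefixes = [rule["prefix"] for rule in pattern["detail"]["object"]["key"]]
    return any(key.startswith(prefix) for prefix in prefixes)


# Hypothetical bucket name, for illustration only.
pattern = build_s3_event_pattern("example-sds-bucket")
print(json.dumps(pattern, indent=2))
```

With a prefix filter like this, an L1A file written under imap/ would trigger the rule in the same way an L0 file does.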
batch_starter.py updates: #250
[x] Update --dependency <data> to send the result from the query API, and maybe filter out file_path from the query result.
[x] Update to use the PreProcessingDependency table instead of a .json file.
[x] Update to not construct file_path_to_create, and instead change --file_path to input_file_path (discuss it).
[x] Update the event input and rule to match the changes above.
[ ] Figure out how to track the version and coordinate with the imap_processing repo.
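A minimal sketch of the --dependency item above, assuming the query API returns a list of records that each carry a file_path field, and reading "filter out file_path" as "keep only the file paths". Both are assumptions rather than decisions recorded in this issue. The function name and the example records are hypothetical.

```python
import json


def build_dependency_args(query_results: list[dict]) -> list[str]:
    """Turn query API records into a --dependency argument for the Batch job.

    Assumption: each record is a dict with a "file_path" key; records
    without one are skipped.
    """
    paths = [record["file_path"] for record in query_results if "file_path" in record]
    # Serialize as JSON so the list survives as a single CLI argument.
    return ["--dependency", json.dumps(paths)]


# Hypothetical query results, for illustration only.
example_results = [
    {"file_path": "imap/example/l0/example_file_a.pkts", "instrument": "example"},
    {"file_path": "imap/example/l0/example_file_b.pkts", "instrument": "example"},
    {"instrument": "example"},  # no file_path, so it is skipped
]
args = build_dependency_args(example_results)
print(args)
```

Passing the paths as one JSON string keeps the command list flat, and cli.py on the imap_processing side could decode it with json.loads.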
path_helper.py updates: #251
[x] Improve the init functions of ScienceFilepathManager.py. Maxine has already started this work, moving this function to the sds-data-access repo or package.
[x] Verify that the upload API lambda is able to use sds-data-access's filepath validator.
imap_processing: #252
[x] Based on what we decide to pass as the command from batch_starter.py to the Batch job, we need to update cli.py. Maxine has done this work in https://github.com/IMAP-Science-Operations-Center/imap_processing/pull/341
[x] Maybe write generic functions to upload and download processed files, and a function to read CDF files? Maxine has done this work in https://github.com/IMAP-Science-Operations-Center/imap_processing/pull/341
[ ] Improve folder structures to simplify imports
[ ] Discuss which version we are tracking in this repo
Create a new issue for the versioning task