AllenNeuralDynamics / aind-smartspim-stitch

Stitching and fusion pipeline in the cloud
MIT License

Metadata JSONs don't carry over #30

Closed miketaormina closed 1 year ago

miketaormina commented 1 year ago

I think we are missing the JSON files defined by aind-data-schema in the output data asset folders generated by this repo. Files such as subject.json, acquisition.json, etc. do not carry over to the new folder on AWS.

Finding these files could look something like this (assuming the code runs inside a TeraStitcher method and the raw versions are in the self.input_data directory):

from aind_data_schema.base import AindCoreModel

# Default filenames (e.g. subject.json, acquisition.json) of the direct AindCoreModel subclasses
files_to_find = [cls.default_filename() for cls in AindCoreModel.__subclasses__()]
# Whatever is produced by this repo that you don't want conflicts on
files_to_ignore = ["processing.json"]
files_to_find = [self.input_data.joinpath(f) for f in files_to_find if f not in files_to_ignore]

# Keep only the metadata files that actually exist in the raw input directory
found_files = [f for f in files_to_find if f.exists()]

Then copy the files to the output directory with shutil or whatever you need to use for S3 writing.
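
A minimal sketch of that copy step for a local filesystem, assuming the output is a plain directory. The helper name copy_metadata_files and its parameters are hypothetical and not part of this repo. The list of filenames is passed in explicitly so the sketch does not depend on aind-data-schema. In the pipeline it would come from the AindCoreModel lookup above.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def copy_metadata_files(
    input_dir: Path,
    output_dir: Path,
    filenames: Iterable[str],
    ignore: Iterable[str] = ("processing.json",),
) -> List[Path]:
    """Copy the metadata JSONs that exist in input_dir into output_dir.

    Hypothetical helper: the name and signature are illustrative only.
    """
    ignored = set(ignore)
    output_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for name in filenames:
        if name in ignored:
            continue  # produced by this pipeline, so do not overwrite it
        source = input_dir / name
        if source.is_file():
            # copy2 also preserves file metadata such as modification time
            copied.append(Path(shutil.copy2(source, output_dir / name)))
    return copied
```

Files that are missing from the input directory are skipped silently, which matches the found_files filter above. Writing directly to S3 would need a different mechanism than shutil and is not covered here.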

Since there is no way for me to do any testing on the capsule, I won't make a branch or try to implement this myself.

camilolaiton commented 1 year ago

Hello @miketaormina, thank you very much for this proposal. I certainly agree with what you propose here. I'll take a look at the data conventions and talk to David to make sure we are copying the correct metadata, and to check whether I have to make any modifications to them, as was done for data_description.json. I will come back to you soon!

camilolaiton commented 1 year ago

Closing this issue since this was fixed for both branches.