Add `mito_file` and `ref_fasta_index` to scpca checkpoint script

allyhawkins commented 1 year ago

While updating scpca-nf to support running samples from multiple organisms at once, we added new entries to the `scpca-meta.json` file that the workflow produces. Because we write and read that file in different processes, and want to be able to skip mapping when re-processing projects that have already been mapped, the file needs to be up to date with the current workflow.

We should update the `update_scpca_checkpoints.py` script to add those two new entries (`mito_file` and `ref_fasta_index`) and ensure that the rest of the contents of the metadata file stay the same. Then we will need to run that script to update the existing `scpca-meta.json` files.
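As a rough illustration of that update step (not the actual contents of `update_scpca_checkpoints.py`), here is a minimal sketch. It assumes the metadata is a flat JSON object and that the two new values are already known for the sample. The function name `add_ref_fields` and the example values are hypothetical.

```python
import copy

# The two entries named in this issue.
NEW_KEYS = ("mito_file", "ref_fasta_index")


def add_ref_fields(meta, ref_values):
    """Return a copy of `meta` with the two new keys added.

    `meta` is the parsed content of one scpca-meta.json file (assumed to be
    a dict); `ref_values` is a hypothetical dict holding the values to use
    for `mito_file` and `ref_fasta_index`.
    """
    updated = copy.deepcopy(meta)
    for key in NEW_KEYS:
        # setdefault leaves a value alone if the key is already present
        updated.setdefault(key, ref_values[key])

    # Sanity check: every pre-existing entry must be unchanged,
    # and nothing other than the two new keys may have been added.
    for key, value in meta.items():
        if updated[key] != value:
            raise ValueError(f"entry {key!r} changed during update")
    extra = set(updated) - set(meta) - set(NEW_KEYS)
    if extra:
        raise ValueError(f"unexpected new entries: {sorted(extra)}")
    return updated


if __name__ == "__main__":
    # Placeholder values for illustration only.
    old = {"sample_id": "sample01", "technology": "10Xv3"}
    refs = {"mito_file": "ref/mito.txt", "ref_fasta_index": "ref/genome.fa.fai"}
    print(add_ref_fields(old, refs))
```

Returning a copy rather than modifying in place makes it easy to compare old and new contents before anything is written back.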

jashapiro commented 1 year ago

Just regarding priority: this is not something we will need to do until we are reanalyzing/updating previous samples, right?

I was actually thinking this should include not just two entries, but all of the values that we put in the meta object from the sample_refs json file, for any future use.

allyhawkins commented 1 year ago

That's correct, it's only needed if we have to re-run any previous samples. We already have all of the other entries accounted for in the script you wrote previously; the only two that are not there are the ones I mentioned.

sjspielman commented 1 year ago

At DSTM 2023-07-05, we discussed possibly writing a smaller script that just adds these fields to the JSON files and saves them back to S3.
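A smaller script along those lines might look roughly like the sketch below. It makes several assumptions: the `scpca-meta.json` files have first been copied to a local directory (for example with `aws s3 sync`), the same values apply to every file under that directory, and the transfer back to S3 is handled separately. The function name `update_meta_files` and the values passed to it are hypothetical.

```python
import json
from pathlib import Path


def update_meta_files(root, ref_values, filename="scpca-meta.json"):
    """Add any missing keys from `ref_values` to each metadata file under `root`.

    Returns the list of paths that were rewritten. Files that already
    contain all of the keys are left untouched.
    """
    changed = []
    for path in sorted(Path(root).rglob(filename)):
        meta = json.loads(path.read_text())
        # Only fill in keys that are absent; never overwrite existing entries.
        missing = {k: v for k, v in ref_values.items() if k not in meta}
        if not missing:
            continue
        meta.update(missing)
        path.write_text(json.dumps(meta, indent=2) + "\n")
        changed.append(path)
    return changed


if __name__ == "__main__":
    # Placeholder directory and values for illustration only.
    refs = {"mito_file": "ref/mito.txt", "ref_fasta_index": "ref/genome.fa.fai"}
    for p in update_meta_files("checkpoints", refs):
        print(f"updated {p}")
```

Because files that already have both fields are skipped, the script can be re-run safely, and the returned list shows which files need to be copied back.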

AlexsLemonade / alsf-scpca