Closed mike-w-wilson closed 3 years ago
The pipeline writes out files at every step, but not all of them are needed. We need to decide what to do with these: move them to nearline, cold, or archive storage, or delete them?
Eagle:
- Phased reference per chr
- Phased sample per chr

RFMix:
- msp file per chr
- fb file per chr (large and not used)

Tractor:
- VCF per chr (unzipped -- zip if keeping)
- Hap per chr (unzipped -- zip if keeping)
- Dos per chr (unzipped -- zip if keeping)

VCF generation:
- VCF with annotated call stats
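For the Tractor outputs flagged "zip if keeping", a minimal sketch of compressing them in place might look like the following. It assumes the files have been copied to a local directory; the suffixes `.vcf` and `.txt` and the function names are hypothetical placeholders, not the pipeline's actual naming. Plain gzip is used here for simplicity; VCFs that need tabix indexing would normally be compressed with bgzip instead.

```python
import gzip
import shutil
from pathlib import Path
from typing import List, Tuple


def gzip_file(src: Path, remove_original: bool = False) -> Path:
    """Compress src to '<src>.gz' and return the path of the new file."""
    dst = src.with_name(src.name + ".gz")
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)  # stream so large files are not loaded into memory
    if remove_original:
        src.unlink()
    return dst


def gzip_outputs(
    directory: Path,
    suffixes: Tuple[str, ...] = (".vcf", ".txt"),  # hypothetical suffixes, adjust as needed
    remove_original: bool = False,
) -> List[Path]:
    """Compress every uncompressed file in directory whose name ends with one of suffixes."""
    compressed = []
    for path in sorted(directory.iterdir()):
        if path.is_file() and path.name.endswith(suffixes):
            compressed.append(gzip_file(path, remove_original))
    return compressed
```

Originals are kept by default, so the compressed copies can be checked before anything is deleted.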
We decided to keep the MSP files, and I also kept some testing files. All other data was deleted with the exception of the final VCF, which now resides in the requester pays bucket.