alexdobin / STAR

RNA-seq aligner
MIT License

STAR Diploid Genome Generate Feature Request #2157

Open migbro opened 1 week ago

migbro commented 1 week ago

Hi! It seems that when creating a reference using genomeTransformVCF, the first step is to build the standard STAR reference without taking the input VCF into consideration and put it in a subdirectory called OriginalGenome; only then is the personalized genome created. I don't believe the contents of OriginalGenome change from person to person, so I would like to request the ability for the user to provide an existing OriginalGenome at run time, to save about an hour of compute. That really adds up in cloud computing costs when running across many samples.

Conversely, as a hack, would it work to save only the root directory of the generated personal reference and then, when a user wants to align, drop in the standard reference while maintaining the same directory structure? This would also save on storage space and cost, since otherwise every personal genome (PG) carries an extra 25GB of repeated reference. Does this make sense?
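To make the hack concrete, here is a minimal sketch of the directory layout I have in mind, assuming the only thing that matters is that a directory named OriginalGenome appears inside the personal genome directory. The function name and the paths are hypothetical, and whether STAR accepts a reference assembled this way (for example, through a symlink) is exactly the untested part of the question.

```python
import os
from pathlib import Path


def attach_shared_original_genome(personal_genome_dir, shared_original_dir):
    """Symlink one shared standard reference into a personal genome directory.

    Assumption: the personal genome only needs a subdirectory named
    "OriginalGenome" holding the standard STAR reference. This has not been
    verified against STAR itself.
    """
    personal = Path(personal_genome_dir)
    shared = Path(shared_original_dir).resolve()
    if not shared.is_dir():
        raise FileNotFoundError(f"shared reference not found: {shared}")

    personal.mkdir(parents=True, exist_ok=True)
    link = personal / "OriginalGenome"
    # Refuse to overwrite an existing directory or (possibly broken) symlink.
    if link.is_symlink() or link.exists():
        raise FileExistsError(f"{link} already exists; not overwriting")

    os.symlink(shared, link, target_is_directory=True)
    return link
```

With something like this, each stored personal genome would keep only its root-level files, and the shared standard reference would be linked in (or copied in, if symlinks turn out not to work) just before alignment.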