Hi! It seems that when creating a reference using genomeTransformVCF, the first step is to create the standard STAR reference without taking the input VCF into consideration and put it in a subdirectory called OriginalGenome, and only then create the personalized genome. I don't believe the contents of OriginalGenome would change from person to person, so I would like to request the ability for the user to provide an existing OriginalGenome at run time, which would save about an hour of compute. That really adds up in cloud computing costs when running across many samples. Conversely, as a hack, would it work to save only the root directory of the generated personal reference, and then, when a user wants to align, drop in the standard reference while maintaining the same directory structure? This would also save on storage space and cost, since otherwise every personalized genome (PG) carries an extra 25GB of repeated reference. Does this make sense?
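To make the hack concrete, here is a rough sketch of what I have in mind. It only manipulates directories and does not call STAR at all. The helper names are made up for illustration, and the big assumption (which is really my question) is that STAR would accept an OriginalGenome that is a symlink to a shared standard reference rather than the copy it generated itself. I have not verified that.

```python
import os
import shutil
from pathlib import Path

# Subdirectory name as described above; everything else here is hypothetical.
ORIGINAL_SUBDIR = "OriginalGenome"


def strip_original_genome(personal_dir):
    """Remove the OriginalGenome copy from a personalized genome directory,
    leaving only the root-level files to be stored."""
    target = Path(personal_dir) / ORIGINAL_SUBDIR
    if target.is_symlink():
        target.unlink()
    elif target.is_dir():
        shutil.rmtree(target)


def attach_original_genome(personal_dir, shared_original_dir):
    """Before aligning, link one shared standard reference back in as
    OriginalGenome so the directory structure looks the same as before.
    Assumption: STAR tolerates a symlink here (untested)."""
    shared = Path(shared_original_dir).resolve()
    if not shared.is_dir():
        raise FileNotFoundError(shared)
    strip_original_genome(personal_dir)
    link = Path(personal_dir) / ORIGINAL_SUBDIR
    os.symlink(shared, link, target_is_directory=True)
    return link
```

The idea would be to call strip_original_genome once after generating each personalized genome, store the result, and call attach_original_genome on the compute node right before alignment. Copying instead of symlinking would also work if links turn out to be a problem, at the cost of the extra disk space on that node.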