dnanexus-rnd / GLnexus

Scalable gVCF merging and joint variant calling for population sequencing projects
Apache License 2.0

Resuming Merging of GVCFs #256

Closed: meghatron21 closed this issue 3 years ago

meghatron21 commented 3 years ago

Hi.

I am trying to merge ~800 WGS GVCFs. This takes a long time with GLnexus (14 days). Unfortunately, the job failed after running that long. However, the GLnexusDB is still present, and I was wondering whether there is a way to resume from the database so that progress is not lost.

Thanks, Meghana

mlin commented 3 years ago

I'll refer you to this thread for context if that's alright! https://github.com/dnanexus-rnd/GLnexus/issues/233

From the specifics you provided, I'd strongly consider first using tabix to split up the GVCFs by chromosome, then merging them in 23/24 smaller glnexus_cli jobs (which could even run in parallel, resources permitting).
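
As a rough, untested sketch of that per-chromosome workflow, the Python below only builds the shell commands rather than running them. Several things here are assumptions, not details from this thread: the contig names (`chr1`..`chr22`, `chrX`, `chrY`), the `gatk` config preset, the `split/<chrom>/` directory layout, the `*.g.vcf.gz` file naming, and the helper names `split_cmd` and `merge_cmd`. It also assumes each input GVCF is bgzipped and already indexed with tabix.

```python
from pathlib import Path
import shlex

# Assumption: GRCh38-style contig names. For GRCh37-style references,
# use "1".."22", "X", "Y" instead.
CHROMS = [f"chr{i}" for i in range(1, 23)] + ["chrX", "chrY"]


def split_cmd(gvcf, chrom, out_dir="split"):
    """Build a shell pipeline that extracts one chromosome from a GVCF.

    The output directory out_dir/<chrom>/ must exist before running it.
    """
    out = Path(out_dir) / chrom / Path(gvcf).name
    # tabix -h keeps the VCF header; bgzip -c recompresses to stdout.
    return (
        f"tabix -h {shlex.quote(str(gvcf))} {chrom} "
        f"| bgzip -c > {shlex.quote(str(out))}"
    )


def merge_cmd(chrom, out_dir="split", config="gatk"):
    """Build one glnexus_cli command for a single chromosome.

    Each job gets its own database directory (--dir) so that jobs
    running in parallel do not collide. The config preset is an
    assumption; pick the one matching your variant caller.
    """
    db_dir = f"GLnexus.DB.{chrom}"
    # The glob is left unquoted on purpose so the shell expands it.
    inputs = f"{out_dir}/{chrom}/*.g.vcf.gz"
    return f"glnexus_cli --config {config} --dir {db_dir} {inputs} > {chrom}.bcf"


if __name__ == "__main__":
    # Hypothetical input list; replace with your ~800 GVCF paths.
    gvcfs = ["sample1.g.vcf.gz", "sample2.g.vcf.gz"]
    for chrom in CHROMS:
        print(f"mkdir -p split/{chrom}")
        for gvcf in gvcfs:
            print(split_cmd(gvcf, chrom))
        print(merge_cmd(chrom))
```

The printed commands can be reviewed and then handed to a shell or a job scheduler. With one job per chromosome, a failure should only cost that chromosome's run rather than the whole merge.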