dnanexus-rnd / GLnexus

Scalable gVCF merging and joint variant calling for population sequencing projects
Apache License 2.0

Resume GLnexus for an existing GLnexus.DB database #152

Open raonyguimaraes opened 5 years ago

raonyguimaraes commented 5 years ago

Is there a way to reuse the created GLnexus.DB database after an unsuccessful run?

My BED file was empty, so the run didn't call any variants, although it did create the database for calling them. Is there a way to rerun glnexus using the same database? Right now, if I try to run it again, it complains about the existing folder instead of using the database it has already created.

Is there a way to bypass this check and use the existing database?

Thanks for all the help!

mlin commented 5 years ago

This capability isn't currently exposed through the open-source driver program but it is possible to go hacking in there. The DNAnexus-native version of course is better suited to projects so large that the load step is a major burden. :wink:
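Since the open-source driver will not reopen an existing database, one way to avoid wasting a load step is to check the inputs before launching. The following is a minimal sketch, not part of GLnexus: the function names `count_bed_intervals` and `preflight` are hypothetical, and the `GLnexus.DB` default is simply the directory name mentioned in this thread. It assumes a plain-text, whitespace-delimited BED file.

```python
import os


def count_bed_intervals(bed_path):
    """Count lines that look like BED records (at least chrom, start, end)."""
    n = 0
    with open(bed_path) as fh:
        for line in fh:
            line = line.strip()
            # Skip blank lines, comments, and UCSC-style header lines.
            if not line or line.startswith(("#", "track", "browser")):
                continue
            if len(line.split()) >= 3:
                n += 1
    return n


def preflight(bed_path, db_dir="GLnexus.DB"):
    """Return a list of problems that would make a run fail or be wasted."""
    problems = []
    if not os.path.isfile(bed_path):
        problems.append(f"BED file not found: {bed_path}")
    elif count_bed_intervals(bed_path) == 0:
        problems.append(f"BED file has no intervals: {bed_path}")
    # Hypothetical check mirroring the complaint about the existing folder.
    if os.path.exists(db_dir):
        problems.append(f"database directory already exists: {db_dir}")
    return problems
```

An empty list from `preflight("regions.bed")` means both checks passed; otherwise each string names one thing to fix before starting the run.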

xshah commented 4 years ago

Is there a way for the DNAnexus-native version to reuse a pre-created GLnexus.DB? This would be worth it 📦

mlin commented 4 years ago

That's the idea, yeah. Although, because it can also shard the cohort across many compute nodes, it reduces the turnaround time enough that only the very largest projects seem to find it worthwhile to keep the intermediate databases around for incremental updates.

hurleyLi commented 3 years ago

Hi @mlin, just wondering whether incremental update using an existing GLnexus.DB has been implemented in the command-line version, glnexus_cli? If so, could you please show us an example command? Thanks

hurleyLi commented 3 years ago

Hi @mlin, I'd like to follow up on this issue about incremental merging from an existing GLnexus.DB. Is this something you and your team plan to incorporate into the command-line version, glnexus_cli? Thanks!