MelbourneGenomics / cpipe

The open source version of the Melbourne Genomics Health Alliance Exome Sequencing Pipeline

What replaced annovar? #214

Open biocyberman opened 6 years ago

biocyberman commented 6 years ago

https://github.com/MelbourneGenomics/cpipe/blob/dev/docs/outputs.md This document says annovar is deprecated. That caught me by surprise, because annovar still appears in many places around Cpipe: the publication, the feature list, the user guide, the roadmap, etc.

The in-house database based on annovar is the most important selling point of Cpipe for me. It would therefore be great if the Cpipe team could explain why annovar was deprecated, and what can be done to meet the need for tracking variants across batches.

If it is a licensing issue, that is not a problem for non-commercial users.

Thanks

ssadedin commented 6 years ago

I didn't make the decision directly, but I think a few different factors contributed: partly licensing (which isn't a problem for some users, but definitely is for others), but also performance (it was taking a serious amount of time to run as we added more annotations), a general feeling that the ecosystem around VEP was a bit stronger, and better VCF output compatibility (at the time).

Tracking variants across batches is also unfortunately dropped with Cpipe 2.3. It's a useful feature but our users are tracking variants in dedicated downstream variant database tools, so it was redundant for us. The code to build the variant database is still in Cpipe and it could be enabled as an optional feature without too much effort.

biocyberman commented 6 years ago

@ssadedin Thanks for the information.

> a general feeling that the ecosystem around VEP was a bit stronger, and better VCF output compatibility

Agreed, considering who stands behind the tool. However, the two tools are not interchangeable in terms of features (i.e. the database feature).

> Tracking variants across batches is also unfortunately dropped with Cpipe 2.3. It's a useful feature but our users are tracking variants in dedicated downstream variant database tools, so it was redundant for us.

Do you and your colleagues use LOVD or something else?

> The code to build the variant database is still in Cpipe and it could be enabled as an optional feature without too much effort.

I can see it when I check out v2.3 on the master branch. I am running v2.5.1 from the dev branch, so I guess some git cherry-picking and merging would be needed if we want annovar back. But as you said, and from what I understand about how annovar and the variant list are maintained (copying the database at every run for reproducibility), scalability and maintainability are an issue. I hope to find a better way to do this.
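
For anyone who lands here with the same need, one possible shape for cross-batch tracking outside the annotation tool is a single shared SQLite database keyed on the variant itself, so nothing has to be copied per run. This is only a rough sketch: the table names, column names, and the functions `open_db`, `record_variant`, and `batches_seen` are all hypothetical, and none of it reflects Cpipe's actual variant database code or schema.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) the tracking database. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS variant (
            id    INTEGER PRIMARY KEY,
            chrom TEXT    NOT NULL,
            pos   INTEGER NOT NULL,
            ref   TEXT    NOT NULL,
            alt   TEXT    NOT NULL,
            UNIQUE (chrom, pos, ref, alt)
        );
        CREATE TABLE IF NOT EXISTS observation (
            variant_id INTEGER NOT NULL REFERENCES variant(id),
            batch      TEXT    NOT NULL,
            sample     TEXT    NOT NULL,
            UNIQUE (variant_id, batch, sample)
        );
        """
    )
    return conn


def record_variant(conn, batch, sample, chrom, pos, ref, alt):
    """Store one observation of a variant; repeated calls are no-ops."""
    key = (chrom, pos, ref, alt)
    # Insert the variant only if this (chrom, pos, ref, alt) is new.
    conn.execute(
        "INSERT OR IGNORE INTO variant (chrom, pos, ref, alt) VALUES (?, ?, ?, ?)",
        key,
    )
    (variant_id,) = conn.execute(
        "SELECT id FROM variant WHERE chrom = ? AND pos = ? AND ref = ? AND alt = ?",
        key,
    ).fetchone()
    # Link the variant to the batch and sample it was seen in.
    conn.execute(
        "INSERT OR IGNORE INTO observation (variant_id, batch, sample) VALUES (?, ?, ?)",
        (variant_id, batch, sample),
    )
    conn.commit()


def batches_seen(conn, chrom, pos, ref, alt):
    """Return the sorted list of batches in which the variant was observed."""
    rows = conn.execute(
        """
        SELECT DISTINCT o.batch
        FROM observation AS o
        JOIN variant AS v ON v.id = o.variant_id
        WHERE v.chrom = ? AND v.pos = ? AND v.ref = ? AND v.alt = ?
        ORDER BY o.batch
        """,
        (chrom, pos, ref, alt),
    ).fetchall()
    return [row[0] for row in rows]
```

Calling `record_variant` once per variant per sample after each run, and `batches_seen` when reviewing a variant, would give the "have we seen this before?" answer without depending on annovar. Whether SQLite scales far enough depends on how many batches and samples are involved.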