exomiser / Exomiser

A Tool to Annotate and Prioritize Exome Variants
https://exomiser.readthedocs.io
GNU Affero General Public License v3.0

Allow trio input as separate VCF files / add tutorial on creating multi-sample VCF. #66

Open damiansm opened 9 years ago

damiansm commented 9 years ago

Should we allow users to upload patient and parent exomes as three separate VCF files rather than as a single multi-sample VCF? Or is it easy enough for people to use vcf_merge?

pnrobinson commented 9 years ago

That would make it easier for many users. We should definitely also consider working out a really nice online tutorial. There are a number of nice templating systems; I think Manuel has used several of them?

visze commented 9 years ago

With single VCFs we will not have a combined calling. This will reduce the performance!

Of course, uploading them separately will make it easier for a lot of users. But it will also reduce the incentive to do a combined calling.

My favorite option would be to provide a tutorial on how to generate multi-sample VCF files from single ones and how to do a combined calling, and to discuss the benefits of the latter.

damiansm commented 9 years ago

As ever I am ignorant of the steps before Exomiser.

Max - are you talking about the reduction in performance of variant calling when you use single VCFs, because you can't take advantage of the fact that a variant is seen in both a parent and the child to strengthen the case for it being real?

Sounds like a simple tutorial on how to use vcf_merge or other tools is a better option than implementing trio input as separate VCFs?

visze commented 9 years ago

@damiansm no. I mean that calling variants three times, once per sample, is different from calling variants on the complete trio at the same time. If separate variant calls are made, we do not know whether a site is reference or simply not covered in the parents. This increases the false positive rate.

Merged files will always contain calls that were made on each sample on its own, so merging is not the best way.
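
To make the "ref or not covered" problem concrete, here is a minimal sketch in plain Python. It does not use vcf_merge or any real VCF library; the sample names, sites, and the `naive_merge` helper are all made up for illustration. It only assumes that a single-sample VCF lists the sites where that sample has a variant, and that a merge fills every other sample with the missing genotype `./.`.

```python
# Hypothetical per-sample calls: each single-sample VCF lists only the sites
# where that sample differs from the reference.
MISSING = "./."


def naive_merge(calls_by_sample):
    """Merge per-sample {site: genotype} maps into one multi-sample table.

    A site absent from a sample's file is filled with the missing genotype,
    because the file alone cannot say whether the sample was homozygous
    reference or simply not covered there.
    """
    samples = list(calls_by_sample)
    sites = sorted({site for calls in calls_by_sample.values() for site in calls})
    return {
        site: {s: calls_by_sample[s].get(site, MISSING) for s in samples}
        for site in sites
    }


# Made-up trio: sites are (chrom, pos, ref, alt) tuples.
calls = {
    "proband": {("1", 12345, "A", "G"): "0/1", ("2", 500, "C", "T"): "0/1"},
    "mother": {("2", 500, "C", "T"): "0/1"},
    "father": {},
}

merged = naive_merge(calls)
for site, genotypes in merged.items():
    print(site, genotypes)
```

In this toy data the variant on chromosome 1 looks like a de novo candidate, but both parents are `./.` rather than `0/0`, so the merged table cannot rule out that a parent carries it at a site with no coverage.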

But in reality, clinicians only have one VCF per person and do not have the ability to do a combined calling. These clinicians will be stuck, and I like your idea to help them. Otherwise they will only upload the index case and look up the mutations in the parents... much work...

pnrobinson commented 9 years ago

Using separately called VCF files would be a reasonable option if they are gVCFs. It would be better than nothing if they are plain VCFs. We should maybe put up a tutorial on how to do things "right" but also offer a quick solution with a warning...?
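
A rough sketch of why gVCFs help, again in plain Python with made-up data: a gVCF also records blocks of positions confidently called as homozygous reference, so a site without a variant call can be classified as `0/0` or as not covered. The `parent_status` helper and the intervals are hypothetical, and the sketch ignores chromosomes and genotype quality for brevity.

```python
def parent_status(position, variant_calls, ref_blocks):
    """Classify one parent at one position.

    variant_calls: {position: genotype} for the parent's variant sites.
    ref_blocks: list of (start, end) inclusive intervals confidently called
    as homozygous reference, as a gVCF reference block would record.
    """
    if position in variant_calls:
        return variant_calls[position]
    if any(start <= position <= end for start, end in ref_blocks):
        return "0/0"
    return "./."  # no call and no reference block: not covered


# Made-up father: one heterozygous call and two reference blocks around it.
father_variants = {700: "0/1"}
father_ref_blocks = [(100, 699), (701, 900)]

for pos in (500, 700, 950):
    print(pos, parent_status(pos, father_variants, father_ref_blocks))
```

With a plain VCF only `father_variants` would be available, so positions 500 and 950 would be indistinguishable.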

williakd17 commented 5 years ago

Can Exomiser handle a 3 generation family (daughter, mom, and grandma)? My current understanding of how a trio would be run is creating a ped file, merging the vcfs together, and only using the proband's HPO IDs. Is that correct?
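
For reference, a sketch of what the ped file for such a family could look like, written with a small Python helper. It uses the standard six-column PED layout (family, individual, father, mother, sex, phenotype). The family ID, the sample IDs, and the affected statuses are assumptions for illustration only: the daughter is taken to be the affected proband and the status of mom and grandma is left as unknown. Whether Exomiser accepts this pedigree is the open question here, not something the sketch answers.

```python
def ped_line(family, individual, father, mother, sex, affected):
    """Return one tab-separated PED row.

    Sex is coded 1 = male, 2 = female; phenotype 1 = unaffected, 2 = affected;
    0 marks an unknown parent, sex, or phenotype.
    """
    return "\t".join([family, individual, father, mother, str(sex), str(affected)])


# Hypothetical IDs; they would need to match the sample names in the VCF header.
rows = [
    ped_line("FAM1", "grandma", "0", "0", 2, 0),
    ped_line("FAM1", "mom", "0", "grandma", 2, 0),
    ped_line("FAM1", "daughter", "0", "mom", 2, 2),  # assumed affected proband
]
ped_text = "\n".join(rows) + "\n"
print(ped_text, end="")
```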

I guess I do not quite fully understand what issues would arise from merging the vcfs together. You would still be retaining each sample's variant calls and associated data (http://samtools.github.io/hts-specs/VCFv4.1.pdf), so I'm not quite sure why this would affect performance outside of a convenience standpoint.

williakd17 commented 4 years ago

Has there been any update on this? Allowing vcfs to remain unmerged would be beneficial.