Clinical-Genomics / genmod

Annotate models of genetic inheritance patterns in variant files (vcf files)
http://moonso.github.io/genmod/
MIT License

Annotating with CADD Scores #43

Closed lukewbonham closed 8 years ago

lukewbonham commented 8 years ago

Hi,

I'm having trouble using the option to annotate CADD scores with my VCF files. I tried annotating the example data with my copy of the 1000G.tsv.gz file and was successful, but couldn't get the same result with my own files. I used the command

`genmod annotate sample_111.vcf --cadd_file 1000G.tsv.gz --out 111_annotated.vcf`

and found "." in the INFO field for all of my SNPs. The VCF header lists CADD as one of the variables in the INFO section, though, so I'm not sure what is going wrong. I checked whether the VCF versions were the same as in the genmod examples, and both are v4.1. I am using genmod v3.3.2.

Do you have any ideas on why this might be happening? I can send example files directly if needed.

Thanks!

moonso commented 8 years ago

Hello,

is your file 1000G.tsv.gz a huge CADD file? I think you should send some example files so I can try it out.

Måns

lukewbonham commented 8 years ago

Hi,

Thanks for your help! Yes, the 1000G.tsv.gz is a huge CADD file.

What is your preferred way to receive the example files?

Thanks!

moonso commented 8 years ago

Maybe a VCF with the original header and a couple of lines that should be annotated, plus some lines from your CADD file that overlap with those variants.

It is a bit strange, since we use this tool daily and it works as it should, so it is hard to guess what is not working in your case. Does your CADD file have an index file with the same name?
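
The index-file question above can be checked with a few lines of Python. This is only a sketch: the helper name `has_tabix_index` is hypothetical, and it assumes the index follows the usual tabix convention of the data file name plus a `.tbi` suffix.

```python
import os


def has_tabix_index(cadd_path):
    """Return True if a tabix index (<cadd_path>.tbi) exists next to the file.

    Assumption: the index uses the conventional '.tbi' suffix appended to
    the full name of the compressed file, e.g. 1000G.tsv.gz.tbi.
    """
    return os.path.isfile(cadd_path + ".tbi")
```

For example, `has_tabix_index("1000G.tsv.gz")` returns True only when `1000G.tsv.gz.tbi` sits in the same directory.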

lukewbonham commented 8 years ago

The CADD file is 1000G.tsv.gz and the index file is named 1000G.tsv.gz.tbi. The names haven’t been altered since they were downloaded.

Do you want examples shared on dropbox or another method?

Thanks

moonso commented 8 years ago

Sure, Dropbox is fine. I think I am mans.magnusson or something.

lukewbonham commented 8 years ago

Thanks. I couldn’t find the alias you provided in Dropbox, so I shared a link to the files with the email address associated with your GitHub account.

I checked whether the variants we are interested in are in the 1000G.tsv.gz file and it looks like a good portion are, so hopefully we can figure out what is going on. Let me know if there is anything I can do to help!

moonso commented 8 years ago

Hello Luke,

there was a problem with genmod annotate when the chromosome name is chr1 instead of 1, etc. This should be fixed now in version 3.3.3, so please update and try again.
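
The kind of mismatch described here (one file naming a chromosome chr1 while another names it 1) can be illustrated with a small sketch. This is not genmod's actual fix; the function names `strip_chr_prefix` and `same_chromosome` are hypothetical, and the sketch assumes the only difference between the two naming schemes is a leading `chr`.

```python
def strip_chr_prefix(chrom):
    """Return the chromosome name without a leading 'chr' ('chr1' -> '1')."""
    if chrom.lower().startswith("chr"):
        return chrom[3:]
    return chrom


def same_chromosome(vcf_chrom, cadd_chrom):
    """Compare two chromosome names while ignoring a 'chr' prefix.

    Assumption: other naming differences (for example chrM versus MT)
    are not handled by this sketch.
    """
    return strip_chr_prefix(vcf_chrom) == strip_chr_prefix(cadd_chrom)
```

With this, `same_chromosome("chr2", "2")` is True, while a plain string comparison of the two names would fail.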

Måns

lukewbonham commented 8 years ago

Hi Måns,

I upgraded to 3.3.3 and the results look as expected for the entries on chr1! However, the entries are blank for the variants we have on all the other chromosomes. Could this be the same problem? Thanks for your help!

Luke

moonso commented 8 years ago

Ok so I ran a test on the file you sent with my 1000G.tsv file and the result looks fine, it is in dropbox. I used the command: `genmod annotate ~/Dropbox/example_vcfs_cadd/C08-AGTGGTCA.vcf -c data/annotation/1000G.v1.1.tsv.gz -o ~/Dropbox/example_vcfs_cadd/C08-AGTGGTCA_annotated_mans.vcf`

I'm not sure why it does not work for you; could it be something with your CADD file?

lukewbonham commented 8 years ago

It could be the CADD file I am using. I am using version 1.3 and, based on your command, it looks like you might have used version 1.1? I’ll download version 1.1 and see if that works.

From: Måns Magnusson [mailto:notifications@github.com] Sent: Tuesday, October 20, 2015 10:55 AM To: moonso/genmod Cc: Bonham, Luke Subject: Re: [genmod] Annotating with CADD Scores (#43)


lukewbonham commented 8 years ago

Just ran the test and have good news -- I get the same output as you when using CADD version 1.1. The previously mentioned problem, where chromosomes 2-Y are unannotated, occurs when using version 1.3. I’ll put the example in dropbox.
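
To see at a glance which chromosomes end up unannotated in an output VCF, a per-chromosome tally along these lines may help. It is a rough sketch rather than part of genmod: the function name `annotation_counts` is hypothetical, and it assumes the annotation appears as a `CADD` key in the INFO column (column 8) of a tab-separated VCF.

```python
from collections import Counter


def annotation_counts(vcf_lines, info_key="CADD"):
    """Return {chromosome: (annotated, total)} for the variant lines given.

    Assumption: a variant counts as annotated when `info_key` appears as a
    key in its semicolon-separated INFO column.
    """
    annotated, total = Counter(), Counter()
    for line in vcf_lines:
        # Skip header lines and empty lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        chrom, info = fields[0], fields[7]
        total[chrom] += 1
        keys = {entry.split("=", 1)[0] for entry in info.split(";")}
        if info_key in keys:
            annotated[chrom] += 1
    return {chrom: (annotated[chrom], total[chrom]) for chrom in total}
```

Passing an open file handle for the annotated VCF to `annotation_counts` gives one `(annotated, total)` pair per chromosome, so a chromosome with `(0, N)` stands out immediately.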

Thanks again for all of your help!


moonso commented 8 years ago

:+1: