Closed mtejura closed 1 year ago
Can you paste some lines of your VCF file so I can troubleshoot? (The header and then first 5 variants will be sufficient.)
This is how my vcf file looks.
Thank you so much for helping troubleshoot!
19 38433834 . G C Uncertain significance . .
19 38433853 . C A Uncertain significance . .
19 38433867 . T G Likely pathogenic . .
19 38440748 . G A Uncertain significance . .
19 38440775 . A C Uncertain significance . .
##fileformat=VCFv4.1
I just tried adding that header, but the annotator still does the same thing.
^ oh and i should mention my file is tab-delimited.
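For anyone checking the same two things discussed above (the `##fileformat` header and tab delimiting), here is a minimal Python sketch. The function name `check_vcf_lines` and the specific checks are illustrative assumptions, not part of SIFT4G, and passing them does not guarantee the annotator will accept the file.

```python
def check_vcf_lines(lines):
    """Return a list of human-readable problems found in VCF text lines.

    Hypothetical helper: it only checks for a ##fileformat header on the
    first non-blank line and for tab-separated data lines with an integer POS.
    """
    problems = []
    # Drop blank lines so the header check looks at the first real line.
    lines = [ln.rstrip("\n") for ln in lines if ln.strip()]
    if not lines or not lines[0].startswith("##fileformat=VCF"):
        problems.append("first line is not a ##fileformat=VCF... header")
    for number, line in enumerate(lines, start=1):
        if line.startswith("#"):
            continue  # header or comment line
        fields = line.split("\t")
        if len(fields) < 5:
            # CHROM, POS, ID, REF, ALT are the minimum needed to name a variant.
            problems.append(
                f"line {number}: expected at least 5 tab-separated fields, "
                f"found {len(fields)}"
            )
        elif not fields[1].isdigit():
            problems.append(f"line {number}: POS is not an integer: {fields[1]!r}")
    return problems


if __name__ == "__main__":
    sample = ["##fileformat=VCFv4.1\n", "19\t38433834\t.\tG\tC\t.\t.\t.\n"]
    print(check_vcf_lines(sample) or "no problems found")
```

A space-delimited line would be reported as having a single field, which is the usual symptom of a file that only looks tab-delimited.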
Is this human, and if so, which build? Can you send me your vcf file, and I will take a look.
Yes, it's human and the build is GRCh38! The github comment won't support the vcf file, where can I send it to you? Thank you!
Can you put it in dropbox and send me a link?
I was able to annotate your file just fine (I added the header line ##fileformat=VCFv4.1).
My command was:
java -jar SIFT4G_Annotator.jar -c -i ClinVar_RYR1_tab_4.vcf -d Databases/ -r res -t
Databases/ contained chr19 from GRCh38.83:
19.gz
19.regions
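One way to check whether the database directory covers the chromosomes in a VCF is sketched below. It assumes, based only on the listing above, that each chromosome has files named `<chrom>.gz` and `<chrom>.regions`; the function name `missing_database_files` is hypothetical, and a real database may need other files too.

```python
from pathlib import Path


def missing_database_files(vcf_lines, database_dir, suffixes=(".gz", ".regions")):
    """List expected per-chromosome database files that are not present.

    Assumption: file names follow the <chrom>.gz / <chrom>.regions pattern
    seen in the directory listing in this thread.
    """
    chroms = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom = line.split("\t", 1)[0]
        if chrom not in chroms:
            chroms.append(chrom)
    base = Path(database_dir)
    return [
        chrom + suffix
        for chrom in chroms
        for suffix in suffixes
        if not (base / (chrom + suffix)).is_file()
    ]


if __name__ == "__main__":
    lines = ["##fileformat=VCFv4.1\n", "19\t38433834\t.\tG\tC\t.\t.\t.\n"]
    print(missing_database_files(lines, "Databases/"))
```

The check also catches a chromosome-naming mismatch, for example a VCF that says `chr19` against a directory that holds `19.gz`.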
I ran the same file (with the header) using the command line and the GUI, but all I get is "NA" for the SIFT score and significance. Could it be possible that these variants just aren't represented? When you run the file, do you get annotations for all the variants (including the score)? Also, when I click on the hyperlink for your GRCh38.83, it says "Not Found". I've been using GRCh38.78. Is there a difference?
The variants are represented -- all of them had predictions.
It sounds like your database isn't loaded. Make sure to download the GRCh38 database and extract the .zip file so it contains
Here are the full weblinks:
https://sift.bii.a-star.edu.sg/sift4g/public//Homo_sapiens/GRCh38.78/
https://sift.bii.a-star.edu.sg/sift4g/public//Homo_sapiens/GRCh38.83.chr/
I might have got it working. However, there are very few annotations for the RYR1 gene. I've attached my output file in this dropbox link. Is this what your output looks like as well? https://www.dropbox.com/home/SIFT%20annotator%20vcf%20file
Hi,
I'm unable to access your dropbox link. When I look at my results in more detail, I see that the missense substitutions are labeled, but there are no SIFT predictions. If you're just studying the RYR1 protein or a few proteins, please try submitting the protein sequence to
The link above uses the original SIFT algorithm (not SIFT 4G).
Thanks, Pauline
Hi,
Maybe this link will work.
I'm actually trying to annotate a few thousand proteins, so I would have to use the standalone.
I'm getting the same output as you, so maybe it's unique to the RYR1 protein. What if you annotate the other proteins, is it the same result or do you get prediction scores?
I tried passing through a 'whole genome' file of variants (basically a tab separated vcf file from the clinvar database) and none of these variants have a SIFT prediction either. It looks pretty much like the RYR1 protein. I feel like the format of the file is correct because when I run the commands, I can see how many variants were annotated and how many were not, but again, the end result is no prediction.
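To put a number on "no prediction", a short Python sketch like the following can count how many rows of a tab-delimited results table have `NA` in a given column. The column name `SIFT_SCORE` used in the example is an assumption about the output header, not something confirmed in this thread; substitute whatever header your results file actually has.

```python
import csv
import io


def count_missing(table_text, column, missing="NA"):
    """Return (rows with `missing` in `column`, total rows) for a TSV table."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    if column not in (reader.fieldnames or []):
        raise ValueError(f"column {column!r} not found in header")
    total = 0
    missing_count = 0
    for row in reader:
        total += 1
        if (row.get(column) or "").strip() == missing:
            missing_count += 1
    return missing_count, total


if __name__ == "__main__":
    # Made-up rows for illustration only; "SIFT_SCORE" is a hypothetical header.
    example = "CHROM\tPOS\tSIFT_SCORE\n19\t1\tNA\n19\t2\t0.01\n"
    print(count_missing(example, "SIFT_SCORE"))
```

If every row comes back as missing across many genes, that points at the database or file setup. If only some genes are affected, it points at missing predictions for those proteins.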
I checked the GRCh38 database manually and there are SIFT predictions. RYR1 does not have predictions. Assuming these are missense variants, some of your proteins should have predictions. The stats file shows me that ~95% of proteins have predictions. There's nothing I can do at this point, sorry.
Okay, thank you for helping, I appreciate it. I will try and figure out a solution on my end. Should I come across one, I will post it here.
Hi,
I'm trying to use the SIFT4G annotator for my vcf file (in the format requested). Once I feed it through the annotator and try to parse the results file, I only see "NA" in all the SIFT annotations. Any thoughts or suggestions would help!
Thanks!