ronaldosfj closed this issue 5 years ago
Hi Ronaldo,
You need to supply a VCF file with the WASP option, to tell STAR where the variants are. The VCF file needs to have a 10th column with the genotype recorded as 0/1, 1/0, or 1/1 (or with | instead of /). Personal VCF files are supposed to have this column. If you are using a general VCF file from a database, you can add the 0/1 genotype for all SNPs.
Cheers Alex
Hi, Thanks for answering me!
How could I add the genotype field to my general VCF file? Do you have any suggestion?
Cheers, Ronaldo
Hi Ronaldo,
you would need to add (or replace) the 10th column in the VCF file with 0/1, for every SNP
Cheers Alex
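For anyone landing here later, a minimal Python sketch of the suggestion above: it rewrites a general VCF so that every record carries a 9th FORMAT column of `GT` and a 10th column of `0/1`. The function name `set_genotype_het`, the sample name `SAMPLE`, and the file names in the usage note are made up for illustration. It assumes an uncompressed, tab-separated VCF with at least the 8 fixed columns, and it has not been tested against STAR itself.

```python
def set_genotype_het(lines, genotype="0/1", sample_name="SAMPLE"):
    """Yield VCF lines with columns 9-10 set to FORMAT=GT and a fixed genotype.

    Assumption: input is uncompressed, tab-separated, with >= 8 columns per record.
    Any existing FORMAT/sample columns are dropped and replaced.
    """
    gt_header = '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">'
    has_gt_header = False
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        if line.startswith("##"):
            if line.startswith("##FORMAT=<ID=GT,"):
                has_gt_header = True
            yield line
        elif line.startswith("#CHROM"):
            if not has_gt_header:
                yield gt_header  # declare GT so the file stays self-describing
            fields = line.split("\t")
            yield "\t".join(fields[:8] + ["FORMAT", sample_name])
        else:
            fields = line.split("\t")
            # Keep the 8 fixed columns, then force FORMAT=GT and the genotype.
            yield "\t".join(fields[:8] + ["GT", genotype])


def rewrite_vcf(in_path, out_path, genotype="0/1"):
    """Read in_path, write the genotype-annotated VCF to out_path."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in set_genotype_het(src, genotype=genotype):
            dst.write(line + "\n")
```

Usage would look like `rewrite_vcf("general.vcf", "general.het.vcf")`, with the output file then passed to `--varVCFfile`. A gzipped VCF would need to be decompressed first (or opened with `gzip.open(..., "rt")`).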
Great! Thanks again!
Hi @alexdobin, I came across this question and it's very relevant to my current situation. I would like to use STAR with --waspOutputMode to map about 400 samples, but I do not have genotypes for all 400, probably just for ~380. Looking at the personal VCFs of the genotyped samples, it looks like the variant calling was done for all samples at once, and all VCFs contain the same variants regardless of their GT state (0/0, 0/1, 1/1). So my question is the following: if I go forward with a generalized VCF where the 10th column is 0/1 and map all 400 samples with it, would it make a big difference?
Will I lose reads at sites that are homozygous in some samples if all variants in the --varVCFfile are 0/1?
Thanks in advance, Marliette
Hi @matosmr
It should work fine, as GT=0/0 (reference genotype) should not affect the mapping. But I would recommend removing GT=0/0 from each of the personal VCFs.
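A small Python sketch of the filtering recommended above, dropping homozygous-reference records from a personal VCF. The function name `drop_hom_ref` is hypothetical. It assumes an uncompressed, tab-separated, single-sample VCF in which GT is the first subfield of the 10th column (as the VCF specification requires when GT is present). Records with a missing genotype such as `./.` are kept, since only 0/0 was mentioned.

```python
def drop_hom_ref(lines):
    """Yield VCF lines, skipping records whose 10th-column genotype is 0/0 or 0|0.

    Assumption: single-sample VCF, GT is the first ':'-separated subfield of column 10.
    Header lines and records without a 10th column pass through unchanged.
    """
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        if line.startswith("#"):
            yield line
            continue
        fields = line.split("\t")
        if len(fields) < 10:
            yield line  # no genotype column to inspect
            continue
        gt = fields[9].split(":")[0]
        alleles = gt.replace("|", "/").split("/")
        if all(allele == "0" for allele in alleles):
            continue  # homozygous reference: drop the record
        yield line
```

Applied per sample, for example `kept = list(drop_hom_ref(open("sample1.vcf")))`, this leaves only the 0/1, 1/0, and 1/1 sites (plus any missing genotypes) in each personal VCF.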
Hi @alexdobin ! Thank you so much for the new tool (STAR + WASP). I tried using the tool and read that I need to add genotype information. However, I don't have this information because I only have RNA-Seq data. So, should I first align with STAR and call variants, and then use STAR again with the SNP file containing genotype information along with WASP?
Hi Alex,
Thank you very much for the STAR-WASP implementation!
I am trying to perform an allele-specific expression analysis using RNA-Seq data. Therefore, in order to control allelic biases in my data, I am using the wasp parameter.
Here are my questions: 1) Is it possible to use --waspOutputMode without --varVCFfile? I am asking because every time I run STAR with the --waspOutputMode parameter, it shows an error message and suggests including the --varVCFfile parameter.
2) If --varVCFfile is always required together with --waspOutputMode, can I use a VCF file from a database (such as dbSNP) for it?
In fact, I already tried to use dbSNP's VCF, but STAR did not recognize it. Then I used the VCF file from the sample I am trying to align, and it worked fine! If I understand correctly, I need to perform an initial alignment and variant calling without this parameter, and after that perform the alignment again. Is that right?
Best regards, Ronaldo