biozzq closed this issue 5 years ago
Dear Zhuqing, The genotype of a variant is determined by the number of reads that support the reference or alternative allele. In your case there seems to be only one read, which the screenshot also shows. Thus, the report from Sniffles is expected. Thanks for reaching out. Cheers, Fritz
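For readers following along, here is a minimal sketch of read-count-based genotyping as described above. It is not Sniffles' actual code: the function names and the two allele-frequency cutoffs (0.3 and 0.8) are assumptions chosen for illustration, and DR/DV are taken to mean the number of reads supporting the reference and the variant, as in the GT:DR:DV field of the reported record.

```python
def allele_frequency(dr, dv):
    """Fraction of reads supporting the variant; None if there are no reads."""
    total = dr + dv
    if total == 0:
        return None
    return dv / total


def call_genotype(dr, dv, min_het_af=0.3, min_hom_af=0.8):
    """Assign a diploid genotype from read counts.

    min_het_af and min_hom_af are illustrative thresholds (assumptions),
    not values confirmed anywhere in this thread.
    """
    af = allele_frequency(dr, dv)
    if af is None:
        return "./."          # no coverage, genotype unknown
    if af >= min_hom_af:
        return "1/1"          # nearly all reads carry the variant
    if af >= min_het_af:
        return "0/1"          # a substantial fraction carries the variant
    return "0/0"              # variant reads are a small minority


# One variant read and no reference read, as in the reported record (DR=0, DV=1)
print(call_genotype(0, 1))
# One reference read and one variant read, as Zhuqing describes
print(call_genotype(1, 1))
```

Under these assumed thresholds, a single variant read with no counted reference read gives 1/1, while one read for each allele gives 0/1, so the outcome depends entirely on whether the reference read is counted.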
Dear Fritz,
Thank you. However, I think the right genotype here should be 0/1. There are two reads here: one supports the reference allele and one supports the deletion.
Best,
Zhuqing
Thanks. I did not see the reference allele read. Still, this is not enough information. These methods are implemented to work on a number of reads that include sequencing errors. We had a user report in before with simulated data, where the problem was that there were no sequencing errors on the reads.
So anyway, what I would suggest is to simulate at least 5 reads with sequencing errors included. You can do that, for example, with SURVIVOR or a different method. Please let me know if this was the problem. I am happy to help. Thanks, Fritz
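As a rough illustration of the "different method" route, the sketch below generates a handful of noisy copies of a template sequence. It is a toy error model written for this thread, not SURVIVOR's simulator: the function names, the 10% error rate, and the equal mix of substitutions, insertions, and deletions are all assumptions, and the resulting reads would still need to be aligned before being given to Sniffles.

```python
import random

BASES = "ACGT"


def add_errors(seq, error_rate=0.1, seed=0):
    """Return a copy of seq with random substitutions, insertions, and deletions.

    error_rate is the per-base probability of an error (an assumed value).
    """
    rng = random.Random(seed)
    out = []
    for base in seq:
        if rng.random() >= error_rate:
            out.append(base)              # base copied without error
            continue
        kind = rng.choice(("sub", "ins", "del"))
        if kind == "sub":
            # replace with a different base
            out.append(rng.choice([b for b in BASES if b != base]))
        elif kind == "ins":
            # keep the base and add a random extra one after it
            out.append(base)
            out.append(rng.choice(BASES))
        # kind == "del": drop the base entirely
    return "".join(out)


def simulate_reads(template, n_reads=5, error_rate=0.1, seed=0):
    """Produce n_reads noisy copies of template, each with its own seed."""
    return [add_errors(template, error_rate, seed + i) for i in range(n_reads)]


reads = simulate_reads("ACGT" * 25, n_reads=5)
for read in reads:
    print(len(read), read[:40])
```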
Dear @fritzsedlazeck
Thank you. I want to detect the structural variations between different assemblies, so the assembly data will not involve many sequencing errors. Just out of interest, why does Sniffles expect sequencing errors on the reads? As we know, the best sequencing data should not involve many sequencing errors, so it would be reasonable to take all the reads into consideration during genotyping.
Sincerely,
Zhuqing
It runs a calibration in the beginning and that can get confused. It assumes noisy data since it was designed for PacBio and ONT reads. Here it also tries to interpret regions of reads that show an abnormally high error rate. These regions can indicate potential SVs that weren't mapped out nicely by the aligner used.
I will put it on my list to see if I can improve things on the assembly side as well. Thanks Fritz
Dear Fritz, what does "REF_strand" mean in the above Sniffles output?
This is currently only a test for myself that I kept. I need to further refine the code before I would recommend using it. Thanks, Fritz
Dear Fritz,
I am working on a haploid cell. Is the SV genotyping relevant in my case?
Ignore the genotype. That's calibrated for diploid. Otherwise it should be good. Best, Fritz
Sounds good, thanks! Can I still try to interpret it as meaning that only a portion of the reads contain the SV (when 0/1), or all reads (when 1/1), or does it just not mean anything?
Dear all,
I simulated a small data set to learn how to use Sniffles. This data contains a heterozygous deletion. However, after running the following command, the genotype for this variant is 1/1. In my mind, it should be 0/1.
Sniffles version: 1.0.10
sniffles -m query.bam --min_seq_size 1 -s 1 -v query.vcf --skip_parameter_estimation --genotype
1 535 0 N <DEL> . PASS PRECISE;SVMETHOD=Snifflesv1.0.10;CHR2=1;END=1639;STD_quant_start=0.000000;STD_quant_stop=0.000000;Kurtosis_quant_start=-nan;Kurtosis_quant_stop=-nan;SVTYPE=DEL;SUPTYPE=SR;SVLEN=-1104;STRANDS=+-;RE=1;REF_strand=0,0;AF=1 GT:DR:DV 1/1:0:1
Best wishes, Zhuqing
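To make the record in the original post easier to read, here is a small sketch that splits it into its VCF columns and pulls out the genotype and read counts. The function name parse_sv_record is hypothetical, the parsing is plain string handling rather than a VCF library, and the reading of DR and DV as reference-supporting and variant-supporting read counts is an assumption based on how the thread discusses them.

```python
# The record from the original post, split on whitespace into the 10 VCF columns.
RECORD = (
    "1 535 0 N <DEL> . PASS "
    "PRECISE;SVMETHOD=Snifflesv1.0.10;CHR2=1;END=1639;"
    "STD_quant_start=0.000000;STD_quant_stop=0.000000;"
    "Kurtosis_quant_start=-nan;Kurtosis_quant_stop=-nan;"
    "SVTYPE=DEL;SUPTYPE=SR;SVLEN=-1104;STRANDS=+-;RE=1;REF_strand=0,0;AF=1 "
    "GT:DR:DV 1/1:0:1"
)


def parse_sv_record(line):
    """Parse one single-sample VCF data line into a dictionary."""
    chrom, pos, var_id, ref, alt, qual, flt, info, fmt, sample = line.split()[:10]

    # INFO is a ';'-separated list of KEY=VALUE pairs or bare flags (e.g. PRECISE)
    info_dict = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        info_dict[key] = value if value else True

    # FORMAT names the ':'-separated fields of the sample column
    sample_dict = dict(zip(fmt.split(":"), sample.split(":")))

    return {
        "chrom": chrom,
        "pos": int(pos),
        "alt": alt,
        "filter": flt,
        "info": info_dict,
        "sample": sample_dict,
    }


rec = parse_sv_record(RECORD)
print(rec["info"]["SVTYPE"], rec["pos"], rec["info"]["END"], rec["info"]["SVLEN"])
print("GT:", rec["sample"]["GT"], "DR:", rec["sample"]["DR"], "DV:", rec["sample"]["DV"])
```

Read this way, the record reports a deletion from position 535 to 1639 supported by one variant read (DV=1) and no counted reference read (DR=0), which is consistent with the 1/1 call discussed in the replies.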