fritzsedlazeck / Spectre

Copy number caller for long read data including SNV utilization
MIT License

Question about QUAL, GQ, FILTER and SVSUPPORT #20

Closed JakeHagen closed 5 months ago

JakeHagen commented 5 months ago

Hello

The results I get from Spectre contain many CNVs, but all of them have QUAL=".", GQ=0, FILTER=".", and SVSUPPORT=FALSE. Are these just low-quality calls? Would a high-quality call set all of these fields?

philippesanio commented 5 months ago

Hi @JakeHagen

My apologies, GQ should not be 0. I will push an update in that regard asap.

For CNVs we are not using QUAL and FILTER, thus they are set to their default state ".". However, when using Spectre in the LOH mode (if an SNV file is provided) we will make use of them.

The INFO flag SVSUPPORT is only used with SNFJ files, which are based on the SNF files from Sniffles. The SNFJ file is used to search for additional evidence at the breakpoints in the Sniffles output. Thus, it is more of an additional filter flag than a quality filter per se. Please note that this feature is experimental.
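As a rough illustration of the fields discussed above, the following Python sketch splits one VCF data line into its columns and shows how the "." placeholders in QUAL and FILTER can be told apart from real values. The record itself (ID, coordinates, INFO keys, FORMAT layout) is invented for this example and is an assumption, not actual Spectre output; only the standard tab-separated VCF column order is relied on.

```python
# Minimal VCF record parser using only the standard library.
# EXAMPLE_LINE is a made-up record for illustration, not real Spectre output.
EXAMPLE_LINE = (
    "chr1\t100000\texample.DEL.1\tN\t<DEL>\t.\t.\t"
    "END=200000;SVTYPE=DEL;SVLEN=100000\tGT:HO:GQ\t0/1:0:60"
)


def parse_vcf_record(line):
    """Split one VCF data line into a dictionary of its fields."""
    cols = line.rstrip("\n").split("\t")
    chrom, pos, rec_id, ref, alt, qual, flt, info, fmt = cols[:9]
    sample = cols[9]

    # INFO entries are either KEY=VALUE pairs or bare flags.
    info_dict = {}
    for entry in info.split(";"):
        key, sep, value = entry.partition("=")
        info_dict[key] = value if sep else True

    # FORMAT keys and the sample values are both colon-separated.
    sample_dict = dict(zip(fmt.split(":"), sample.split(":")))

    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": rec_id,
        # "." is the VCF placeholder for a missing value.
        "qual": None if qual == "." else float(qual),
        "filter": None if flt == "." else flt,
        "info": info_dict,
        "sample": sample_dict,
    }


record = parse_vcf_record(EXAMPLE_LINE)
print("QUAL set:", record["qual"] is not None)
print("FILTER set:", record["filter"] is not None)
print("GQ:", record["sample"].get("GQ"))
print("SVSUPPORT present:", "SVSUPPORT" in record["info"])
```

With a record like this one, an unset QUAL or FILTER only means the field is not used for that call type; it does not by itself mark the call as low quality.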

I hope this helped. Cheers, Philippe

philippesanio commented 5 months ago

Hi @JakeHagen

I have updated the code so that GQ is now reported properly. Additionally, the SVSUPPORT INFO flag is now only visible when an SNFJ file is used.

Cheers Philippe

JakeHagen commented 5 months ago

Awesome, thank you. I will give the newest commit a try and report back

JakeHagen commented 5 months ago

Thanks @philippesanio, I now get GQ values. Also thanks for the clarification on the other fields

selmapichot commented 4 months ago

Hi, I also have a question about GQ and HO. In the VCF I got, all GQ = 60 and all HO = 0 (even though CN ranges from 0 to 6). Is there a preset filter or threshold behind these values? Many thanks.

S.

philippesanio commented 4 months ago

Hi @selmapichot

The GQ is derived in two steps: we first compute a Z-score, which is then used as input for a Phred-like quality score (Q-score).

With the Z-score, we just want to determine whether the called variant is an actual variant based on what Spectre observes. Since Z-scores are not commonly used in VCFs, we use the resulting value as input for the Q-score. A GQ of 60 indicates that Spectre is 99.9999% sure that its call is an actual variant. Although a higher score would be possible, we cap the GQ at 60, like many other tools, since at that point the chance of an error is already 1 in 1,000,000. https://en.wikipedia.org/wiki/Phred_quality_score
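To make the relationship concrete, here is a minimal Python sketch of one way a Z-score could be turned into a capped Phred-like score. It is not Spectre's actual implementation: the two-sided normal tail probability used as the error estimate, the rounding, and the function names `z_to_phred` and `phred_to_error` are all assumptions made for illustration. Only the Phred definition Q = -10 * log10(P_error) and the cap at 60 come from the explanation above.

```python
import math

GQ_CAP = 60  # cap mentioned in the thread; Q=60 means 1 error in 1,000,000


def z_to_phred(z, cap=GQ_CAP):
    """Convert a Z-score to a capped Phred-like quality (illustrative only).

    Assumption: the error probability is the two-sided tail probability of a
    standard normal distribution. Spectre may derive it differently.
    """
    p_error = math.erfc(abs(z) / math.sqrt(2.0))
    if p_error <= 0.0:
        # The probability underflowed to zero, so the quality exceeds the cap.
        return cap
    q = -10.0 * math.log10(p_error)
    return min(cap, int(round(q)))


def phred_to_error(q):
    """Error probability implied by a Phred-like quality value."""
    return 10.0 ** (-q / 10.0)


for z in (0.0, 1.96, 3.0, 10.0):
    print(f"Z={z:5.2f} -> GQ={z_to_phred(z)}")

print("Error probability at GQ=60:", phred_to_error(60))
```

Under these assumptions, any sufficiently strong coverage signal saturates at the cap, which would be consistent with every call in a VCF reporting GQ = 60.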

HO is only used in combination with the LOH mode of the CNVCaller. The LOH mode is activated when an SNV VCF file is provided. Please note that this feature is still experimental and is subject to change in a later version of Spectre. For CNVs, HO is set to 0, as those calls are based only on the coverage signal.

I hope this helps. If you have any further questions, feel free to ping me any time or open a new issue.

Cheers, Philippe

selmapichot commented 4 months ago

Many thanks, Philippe, for your reply. Great tool :)

selmapichot commented 4 months ago

Hi Philippe, Is there a way to get the minor allele copy number with Spectre ?

Many thanks, S