HuntsmanCancerInstitute / USeq

180+ Java applications for analyzing next generation sequencing data from ChIPSeq, RNASeq, BisSeq, DNASeq, variant annotation/filtering, alignment/VCF QC, capture array design, IGV/DAS2/IGB/UCSC file manipulation, etc. Both GUI and cmd line interfaces.
http://bioserver.hci.utah.edu/USeq/Documentation/

VCFComparator ROC input #3

Closed patidarr closed 6 years ago

patidarr commented 6 years ago

Hi,

I remember that in the past there was output for a range of QUAL thresholds, but I ran the latest version of the tool and I only get "none" in the QUALThreshold column.

Could you let me know if I am doing something wrong here? Also, I wanted to make a ROC for varying coverage. Is that possible?

Thanks, Rajesh

[5 Dec 2017 15:12] USeq_8.9.5 Arguments: -a /data/MoCha/patidarr/Validation/ref/expected_XX_100-25.vcf -b /data/MoCha/patidarr/Validation/ref/XX-100-25.bed -c /data/MoCha/processedDATA/Control/20170910/HAPMAP-CONTROL-POOL-XX-1_CBBMDANXX/calls/HAPMAP-CONTROL-POOL-XX-1_CBBMDANXX.HC_DNASeq.raw.vcf -d /data/MoCha/Reference/SS_v5_Qualimap_PADDED_NoChr.bed -s -p XX-100-25_HAPMAP-CONTROL-POOL-XX-1_CBBMDANXX

VCF Comparator Settings:

expected_XX_100-25.vcf  Key vcf file
XX-100-25.bed   Key interrogated regions file
HAPMAP-CONTROL-POOL-XX-1_CBBMDANXX.HC_DNASeq.raw.vcf    Test vcf file
SS_v5_Qualimap_PADDED_NoChr.bed Test interrogated regions file
XX-100-25_HAPMAP-CONTROL-POOL-XX-1_CBBMDANXX Save directory for parsed datasets
true    Require matching alternate bases
false   Require matching genotypes
**false   Use record VQSLOD score as ranking statistic**
false   Exclude non PASS or . records
true    Compare SNPs, not non-SNP variants

Parsing and filtering variant data for common interrogated regions...
Comparing calls...

Done! 57 seconds
DavidAustinNix commented 6 years ago

Hmm? Yes, set -v and you should be good to go. Fire the app without any args to see the menu of options.

Regarding a ROC for variants with different coverage: yes, split your VCF by read depth (i.e. bin it), then run VCFComparator on each of the bins.
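The binning step could be sketched roughly as follows in Python. This is only an illustration, not part of USeq: it assumes the read depth is stored as `DP=` in the INFO column (as in typical GATK output), and the bin edges and the names `BINS`, `info_depth`, `bin_label`, and `split_vcf_by_depth` are hypothetical.

```python
import re

# Hypothetical depth bins (inclusive lower bound, exclusive upper bound).
BINS = [(0, 20), (20, 50), (50, 100), (100, float("inf"))]

# Assumption: read depth is recorded as DP=<int> in the INFO column.
DP_PATTERN = re.compile(r"(?:^|;)DP=(\d+)(?:;|$)")


def info_depth(info):
    """Return the INFO DP value as an int, or None if DP is absent."""
    match = DP_PATTERN.search(info)
    return int(match.group(1)) if match else None


def bin_label(depth):
    """Return the label of the bin containing depth, or None if no bin fits."""
    for low, high in BINS:
        if low <= depth < high:
            return f"dp{low}-{high}"
    return None


def split_vcf_by_depth(lines):
    """Group VCF lines by depth bin; each bin gets a copy of the header.

    Records without a DP value are skipped.
    """
    header = []
    binned = {bin_label(low): [] for low, _ in BINS}
    for line in lines:
        if line.startswith("#"):
            header.append(line)
            continue
        fields = line.rstrip("\n").split("\t")
        depth = info_depth(fields[7]) if len(fields) > 7 else None
        if depth is not None:
            binned[bin_label(depth)].append(line)
    return {label: header + records for label, records in binned.items()}
```

Each returned list can be written to its own file and passed to VCFComparator as the test VCF (the `-c` argument in the log above), keeping the other arguments the same.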

-cheers, David

DavidAustinNix commented 6 years ago

You could also substitute the QUAL score with the read depth. Whatever is in the QUAL value column is used to rank the variants and perform the set analysis.
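A minimal Python sketch of that substitution, for illustration only: it assumes the read depth is available as `DP=` in the INFO column, and the function name `qual_from_depth` is hypothetical rather than anything provided by USeq.

```python
import re

# Assumption: read depth is recorded as DP=<int> in the INFO column.
DP_PATTERN = re.compile(r"(?:^|;)DP=(\d+)(?:;|$)")


def qual_from_depth(line):
    """Return the VCF line with QUAL (column 6) replaced by the INFO DP value.

    Header lines, malformed lines, and records lacking DP are returned unchanged.
    """
    if line.startswith("#"):
        return line
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return line
    match = DP_PATTERN.search(fields[7])
    if match is None:
        return line
    fields[5] = match.group(1)  # QUAL is the sixth tab-separated column
    return "\t".join(fields) + "\n"
```

Applying this to every line of the test VCF and running VCFComparator on the result would make the depth the ranking value, so the threshold column would then reflect read depth rather than the original QUAL.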