ACEnglish / truvari

Structural variant toolkit for VCFs
MIT License

No FP or TP calls #203

Closed: jeplb closed this issue 6 months ago

jeplb commented 6 months ago

Version : v4.2.2

Describe the bug : I'm trying to benchmark Nanopore reads against the NIST v0.6 Tier 1 truth set, but I get the following output with no TP or FP calls:

2024-03-28 09:58:39,498 [INFO] Including 34830 bed regions
2024-03-28 09:58:45,504 [INFO] Zipped 20041 variants Counter({'base': 20041})
2024-03-28 09:58:45,505 [INFO] 9655 chunks of 20041 variants Counter({'__filtered': 10400, 'base': 9641})
2024-03-28 09:58:47,681 [WARNING] No TP or FP calls in comp!
2024-03-28 09:58:47,709 [INFO] Stats: {
    "TP-base": 0,
    "TP-comp": 0,
    "FP": 0,
    "FN": 9641,
    "precision": null,
    "recall": null,
    "f1": null,
    "base cnt": 9641,
    "comp cnt": 0,
    "TP-comp_TP-gt": 0,
    "TP-comp_FP-gt": 0,
    "TP-base_TP-gt": 0,
    "TP-base_FP-gt": 0,
    "gt_concordance": 0,
    "gt_matrix": {}
}

To Reproduce : I ran the following command, both with and without the --passonly flag, and got the same results:

truvari bench -b 00_ref/HG002_SVs_Tier1_v0.6.vcf.gz -c 06_sv/CD_3032_Cache.hg19.sv.vcf.gz -o 07_truvari/HG002_3032.truvari -f 00_ref/hg19.fa --includebed 00_ref/HG002_SVs_Tier1_v0.6.bed --passonly

I was able to run a test successfully with HG002_GRCh38_CMRG_SV_v1.00.vcf.gz, so I'm not sure where the issue could be.

ACEnglish commented 6 months ago

From the information provided, my only guess is that 06_sv/CD_3032_Cache.hg19.sv.vcf.gz is empty. The line 9655 chunks of 20041 variants Counter({'__filtered': 10400, 'base': 9641}) says that 20041 variants were analyzed, and the line Zipped 20041 variants Counter({'base': 20041}) says that all of those variants came from the baseline VCF. If there were comparison variants, the lines would say something like Counter({'comp': 12345....

Have you checked the comparison vcf?
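
One quick way to check is to count the records in the comparison VCF per contig. The snippet below is only a rough sketch using the Python standard library, not anything truvari does internally; the function name count_records_by_contig is made up for illustration.

```python
import gzip
from collections import Counter


def count_records_by_contig(vcf_path):
    """Count non-header VCF records per contig (the CHROM column).

    Hypothetical helper for illustration; it reads plain or gzip/bgzip
    compressed VCFs line by line and does not validate the records.
    """
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    counts = Counter()
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            # Skip meta-information lines, the #CHROM header, and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            contig = line.split("\t", 1)[0]
            counts[contig] += 1
    return counts
```

Calling count_records_by_contig("06_sv/CD_3032_Cache.hg19.sv.vcf.gz") and summing the values gives the total number of records; an empty Counter would mean the file has no variants at all.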

jeplb commented 6 months ago

It's definitely not empty. I figured it out: I was using the wrong reference, namely hg19 from UCSC. I switched to the reference from the NIST website and everything worked.
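
For anyone hitting the same symptom (a non-empty comparison VCF but "comp cnt": 0), comparing contig names between the two VCFs may help. This is a hedged sketch: the thread does not say exactly why the UCSC reference failed, and a "chr1" versus "1" naming difference is only one common cause that I am assuming here. The function name contig_naming_report is hypothetical.

```python
def contig_naming_report(base_contigs, comp_contigs):
    """Compare two collections of contig names.

    Hypothetical helper for illustration. It flags the case where the two
    sets share no names as given, but do share names once a leading "chr"
    prefix is ignored (an assumed, not confirmed, cause of zero matches).
    """
    base, comp = set(base_contigs), set(comp_contigs)

    def strip_chr(name):
        return name[3:] if name.startswith("chr") else name

    shared = base & comp
    shared_ignoring_prefix = {strip_chr(c) for c in base} & {strip_chr(c) for c in comp}
    return {
        "shared": sorted(shared),
        "only_base": sorted(base - comp),
        "only_comp": sorted(comp - base),
        "prefix_mismatch": not shared and bool(shared_ignoring_prefix),
    }


# Example with made-up contig names:
report = contig_naming_report(["1", "2", "X"], ["chr1", "chr2", "chrX"])
print(report["prefix_mismatch"])
```

If "shared" comes back empty, the two files were most likely produced against references with different contig naming, so no comparison variant can land in the same region as a baseline variant.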