mroosmalen / nanosv

SV caller for nanopore data
MIT License

NanoSV with truvari #61

Open Tintest opened 4 years ago

Tintest commented 4 years ago

Hello, I'm trying to benchmark NIST002 VCF files produced by NanoSV against the GIAB NIST002 SV truth set, but no variant is classified as a true positive.

Here are the logs:

2019-11-20 12:09:19,348 [INFO] Running /home/tintest/miniconda3/bin/truvari -b ../HG002_SVs_Tier1_v0.6.vcf.gz -c /home/tintest/bettik/SV/nanopore/Biomnis/vcf/nanosv/NIST-002_merged_ngmlr_sorted.vcf.gz -o NIST-002_merged_ngmlr_sorted -r 2000 --pctsim 0 --passonly --includebed ../HG002_SVs_Tier1_v0.6.bed --giabreport
2019-11-20 12:09:19,348 [INFO] Params:
{
    "sizemax": 50000,
    "reference": null,
    "noprog": false,
    "multimatch": false,
    "pctsize": 0.7,
    "cSample": null,
    "includebed": "../HG002_SVs_Tier1_v0.6.bed",
    "no_ref": false,
    "passonly": true,
    "pctsim": 0.0,
    "pctovl": 0.0,
    "comp": "/home/tintest/bettik/SV/nanopore/Biomnis/vcf/nanosv/NIST-002_merged_ngmlr_sorted.vcf.gz",
    "refdist": 2000,
    "base": "../HG002_SVs_Tier1_v0.6.vcf.gz",
    "giabreport": true,
    "sizefilt": 30,
    "typeignore": false,
    "gtcomp": false,
    "debug": false,
    "output": "NIST-002_merged_ngmlr_sorted",
    "bSample": null,
    "sizemin": 50
}
2019-11-20 12:09:20,646 [INFO] Including 34830 bed regions
2019-11-20 12:09:20,646 [INFO] Creating call interval tree for overlap search
2019-11-20 12:09:30,717 [INFO] 35145 call variants in total
2019-11-20 12:09:30,717 [INFO] 0 call variants within size range (30, 50000)
2019-11-20 12:09:49,403 [INFO] 20041 base variants
2019-11-20 12:09:49,423 [INFO] Matching base to calls
2019-11-20 12:10:09,183 [WARNING] No TP or FP calls in base!
2019-11-20 12:10:09,183 [INFO] Parsing FPs from calls
2019-11-20 12:10:18,714 [INFO] Stats: {
    "TP-base": 0,
    "TP-call": 0,
    "FP": 0,
    "FN": 9641,
    "precision": 0,
    "recall": 0,
    "f1": "NaN",
    "base cnt": 9641,
    "call cnt": 0,
    "base size filtered": 6309,
    "call size filtered": 0,
    "base gt filtered": 0,
    "call gt filtered": 0,
    "TP-call_TP-gt": 0,
    "TP-call_FP-gt": 0,
    "TP-base_TP-gt": 0,
    "TP-base_FP-gt": 0,
    "gt_precision": 0,
    "gt_recall": 0,
    "gt_f1": "NaN"
}
2019-11-20 12:10:18,715 [INFO] Creating GIAB report
2019-11-20 12:10:20,919 [INFO] Finished

Have you ever tried Truvari with a VCF from NanoSV?

It works flawlessly with SV callers such as Svim, Pbsv or Sniffles.

Here is the link to the VCF: https://filesender.renater.fr/?s=download&token=8905688b-e98a-859c-c841-ad7e9088a2c6

Here is the NanoSV command used to produce the VCF file from a BAM produced by ngmlr:

singularity exec -B /bettik/tintest/:/mnt /home/tintest/bettik/SV/nanopore/Chaissonetal2019/nanosv.simg NanoSV --bed /mnt/SV/nanopore/human_hg19.bed -t 4 -s samtools /mnt/SV/nanopore/Biomnis/bam/ngmlr/NIST-002_merged_ngmlr.bam -o /mnt/SV/nanopore/Biomnis/vcf/nanosv/NIST-002_merged_ngmlr.vcf

I get the same problem with NanoSV VCFs from BAM files produced by minimap2.

I guess the problem comes from the NanoSV VCF format.

Regards.

yekaizhou commented 3 years ago

I ran into the same issue: Truvari detected no SVs within the size range.

LYC-vio commented 2 years ago

You may need to remove the --passonly flag, since NanoSV does not use PASS in the VCF's FILTER field. You also need to change RT=3 to RT=. (or RT=1) in the VCF header.
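
To check whether --passonly is what empties the call set, it may help to count the values in the FILTER column of the NanoSV VCF first. The following is a minimal sketch in plain Python, not part of NanoSV or Truvari. The function name count_filters is hypothetical, and it assumes a standard tab-separated VCF, optionally gzip or bgzip compressed.

```python
import gzip
from collections import Counter


def count_filters(vcf_path):
    """Count the values found in the FILTER column (7th field) of a VCF.

    Hypothetical helper for illustration; assumes tab-separated records.
    """
    # bgzip output is a valid multi-member gzip stream, so gzip.open reads it.
    opener = gzip.open if vcf_path.endswith(".gz") else open
    counts = Counter()
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip header lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) > 6:
                counts[fields[6]] += 1
    return counts
```

If the returned counter has no PASS entry, running Truvari with --passonly would be expected to discard every call, which would be consistent with the "call cnt": 0 line in the log above.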

LYC-vio commented 2 years ago

And change ##FILTER=<ID=Gap to ##FILTER=<ID=GAP
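
The two header changes suggested above could be scripted roughly as follows. This is only a sketch under stated assumptions: patch_nanosv_header is a hypothetical name, and it assumes that "RT=3" refers to the Number=3 attribute on the ##INFO=<ID=RT,...> header line, which the comment does not spell out. Only header lines are touched; variant records pass through unchanged.

```python
import re


def patch_nanosv_header(lines):
    """Yield VCF lines with the two suggested header edits applied.

    Hypothetical helper; the RT handling is an assumption (see lead-in).
    """
    for line in lines:
        if line.startswith("##FILTER=<ID=Gap"):
            # Rename the FILTER ID in the header from Gap to GAP.
            line = line.replace("##FILTER=<ID=Gap", "##FILTER=<ID=GAP", 1)
        elif line.startswith("##INFO=<ID=RT,"):
            # Assumption: "RT=3 -> RT=." means Number=3 -> Number=. here.
            line = re.sub(r"Number=3", "Number=.", line, count=1)
        yield line


if __name__ == "__main__":
    import sys

    # Usage (uncompressed VCF): python patch.py < in.vcf > out.vcf
    for patched in patch_nanosv_header(sys.stdin):
        sys.stdout.write(patched)
```

The output would still need to be compressed and indexed again (for example with bgzip and tabix) before passing it to Truvari.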