jmonlong / sveval

Functions to compare SV call sets against a truth set.
MIT License

Crash when comparing inversions after normalization. #4

Closed by glennhickey 5 years ago

glennhickey commented 5 years ago

I get a crash when trying to compare HG00514 as called by vg with the calls from the SMRT-SV paper. Without normalization, I get the following:

rm -rf js ; toil-vg vcfeval ./js ./os-exp-norm --vcfeval_baseline s3://glennhickey/share/sveval_inv/svpop-explicit/sv-pop-explicit_HG00514.vcf.gz --call_vcf s3://glennhickey/outstore/SVPOP-jan10/call-HG00514/HG00514.vcf.gz  --realTimeLogging  --sveval   --workDir ./temp/  --retryCount 0 
type   TP    TP.baseline  FP    FN     precision  recall  F1
Total  8932  8946         6621  15153  0.5743     0.3712  0.4509
INS    7742  7719         5857  6383   0.5693     0.5474  0.5581
DEL    1186  1223         760   8707   0.6095     0.1232  0.205
INV    4     4            4     63     0.5        0.0597  0.1067
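
For readers checking the table, here is a minimal sketch of how the last three columns appear to relate to the counts. The formulas (precision from TP and FP, recall from TP.baseline and FN) are an assumption that is consistent with the numbers shown, not something taken from sveval's source. The function name `accuracy_metrics` is made up for illustration.

```python
def accuracy_metrics(tp, tp_baseline, fp, fn):
    """Return (precision, recall, F1) from sveval-style counts.

    Assumed relations (they match the table, but are not confirmed from sveval):
      precision = TP / (TP + FP)
      recall    = TP.baseline / (TP.baseline + FN)
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp_baseline / (tp_baseline + fn) if (tp_baseline + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Counts copied from the table above: (TP, TP.baseline, FP, FN).
rows = {
    "Total": (8932, 8946, 6621, 15153),
    "INS": (7742, 7719, 5857, 6383),
    "DEL": (1186, 1223, 760, 8707),
    "INV": (4, 4, 4, 63),
}

for sv_type, counts in rows.items():
    p, r, f1 = accuracy_metrics(*counts)
    print(f"{sv_type}\t{p:.4f}\t{r:.4f}\t{f1:.4f}")
```

With only 4 true-positive inversions against 63 false negatives, the INV row rests on very few events, which is why a usable inversion truth set matters here.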

But when I add --normalize --force_outstore --vcfeval_fasta ~/dev/work/references/hg38.fa.gz, I get:

ERROR:toil-rt:Docker container for command R -f sveval.R failed with code 1
ERROR:toil-rt:Dumping stderr...
ERROR:toil-rt:Error in `[[<-`(`*tmp*`, name, value = NA) : 
ERROR:toil-rt:  1 elements in value to replace 0 elements
ERROR:toil-rt:Calls: <Anonymous> -> readSVvcf -> $<- -> $<- -> [[<- -> [[<-
ERROR:toil-rt:Execution halted
ERROR:toil-rt:Dumping stdout...
ERROR:toil-rt:
ERROR:toil-rt:R version 3.5.2 (2018-12-20) -- "Eggshell Igloo"
ERROR:toil-rt:Copyright (C) 2018 The R Foundation for Statistical Computing
ERROR:toil-rt:Platform: x86_64-pc-linux-gnu (64-bit)
ERROR:toil-rt:
ERROR:toil-rt:R is free software and comes with ABSOLUTELY NO WARRANTY.
ERROR:toil-rt:You are welcome to redistribute it under certain conditions.
ERROR:toil-rt:Type 'license()' or 'licence()' for distribution details.
ERROR:toil-rt:
ERROR:toil-rt:R is a collaborative project with many contributors.
ERROR:toil-rt:Type 'contributors()' for more information and
ERROR:toil-rt:'citation()' on how to cite R or R packages in publications.
ERROR:toil-rt:
ERROR:toil-rt:Type 'demo()' for some demos, 'help()' for on-line help, or
ERROR:toil-rt:'help.start()' for an HTML browser interface to help.
ERROR:toil-rt:Type 'q()' to quit R.
ERROR:toil-rt:
ERROR:toil-rt:> sveval::svevalOl("calls-norm.vcf.gz", "truth-norm.vcf.gz", outfile="sv_accuracy.tsv", min.cov=0.5, out.bed.prefix="", min.size=20, max.ins.dist=20, min.del.rol=0.1)

jmonlong commented 5 years ago

After normalization, the truth set is empty, which causes the error in sveval. I'll add an explicit error message in sveval.
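
Until such a message exists, a wrapper could catch the empty file before the R step runs. The following is a rough sketch of that idea in Python, not sveval's or toil-vg's actual code. The helper names `count_vcf_records` and `require_nonempty_vcf` are hypothetical, and it assumes a plain or gzip/bgzip-compressed VCF on local disk.

```python
import gzip


def count_vcf_records(path):
    """Count variant records (non-header, non-blank lines) in a VCF file."""
    # bgzip output is a series of gzip members, which the gzip module can read.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return sum(
            1 for line in handle if line.strip() and not line.startswith("#")
        )


def require_nonempty_vcf(path, label):
    """Raise a readable error if the VCF has no variant records (hypothetical guard)."""
    n_records = count_vcf_records(path)
    if n_records == 0:
        raise ValueError(f"{label} VCF '{path}' contains no variant records")
    return n_records
```

Calling something like `require_nonempty_vcf("truth-norm.vcf.gz", "truth")` before the comparison would replace the opaque `[[<-` failure with a message that names the empty input.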

The normalized baseline is empty because s3://glennhickey/share/sveval_inv/svpop-explicit/sv-pop-explicit_HG00514.vcf.gz contains only haploid calls, which are filtered out before normalization.
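
One way to confirm this for a given VCF is to tally how many alleles each genotype carries. The sketch below assumes that "haploid" here means a GT field with a single allele (for example `1` rather than `0/1`). It does not reproduce the pipeline's filter, the function name `genotype_ploidy_counts` is hypothetical, and the two records are made-up examples rather than lines from the actual file.

```python
import re
from collections import Counter


def genotype_ploidy_counts(vcf_lines, sample_index=0):
    """Tally genotypes by number of alleles in GT (1 = haploid, 2 = diploid)."""
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= 9 + sample_index:
            continue  # no FORMAT/sample column for this sample
        format_keys = fields[8].split(":")
        if "GT" not in format_keys:
            continue
        sample_values = fields[9 + sample_index].split(":")
        gt = sample_values[format_keys.index("GT")]
        # Alleles are separated by "/" (unphased) or "|" (phased).
        counts[len(re.split(r"[/|]", gt))] += 1
    return counts


# Made-up records for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tHG00514",
    "chr1\t1000\tsv1\tA\t<INV>\t.\tPASS\tSVTYPE=INV;END=2000\tGT\t1",
    "chr1\t5000\tsv2\tA\t<INV>\t.\tPASS\tSVTYPE=INV;END=6000\tGT\t0/1",
]
print(dict(genotype_ploidy_counts(example)))
```

If every record falls into the single-allele bucket, a step that drops haploid calls would leave nothing behind, which would match the empty normalized baseline described above.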

glennhickey commented 5 years ago

That would do it, thanks! That's the Cell paper discovery set, which is my last hope of a "real" inversion comparison. Hopefully the combination of the newly fixed vg and not filtering out the entire truth set will finally produce some results.