Closed bricoletc closed 3 years ago
Logging here a validation evaluation of using minimap2. The evaluation uses varifier on one staph dataset from Martin (with the truth genome assembled using pilon).
I found that the 'good' minimap2 settings (-k9 -w4) give the following recall/precision:
* require mapq > 40: "Precision_edit_dist": 0.99996956, "Recall_edit_dist": 0.50726812
* require mapq > 0: "Precision_edit_dist": 0.99998191, "Recall_edit_dist": 0.83881474
* previous cortex, using stampy and mapq > 40: "Precision_edit_dist": 0.98863584, "Recall_edit_dist": 0.83029847

On this dataset, minimap2 with mapq > 0 does a little better than the previous stampy setup in both metrics.
Logging some more validation work using 14 Illumina Plasmodium falciparum samples with matched PacBio assemblies (the previous comment was run on a Staphylococcus aureus dataset):
stampy: recall: 0.4490 precision: 0.9118
minimap2: recall: 0.4443 precision: 0.9258
These are the averages of the per-sample metrics computed using varifier. This confirms the change works fine.
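As a rough illustration of how such averages could be computed, the sketch below takes one metrics dictionary per sample and returns the mean of each metric. It assumes each sample's varifier summary has already been loaded into a flat dict with the keys `Precision_edit_dist` and `Recall_edit_dist`; the real varifier JSON layout may differ. The function name and the numbers are made up for the example and are not results from this evaluation.

```python
from statistics import mean


def average_metrics(per_sample, keys=("Precision_edit_dist", "Recall_edit_dist")):
    """Return the unweighted mean of each metric across samples."""
    if not per_sample:
        raise ValueError("need at least one sample")
    return {key: mean(sample[key] for sample in per_sample) for key in keys}


# Hypothetical per-sample summaries (illustrative values only).
samples = [
    {"Precision_edit_dist": 0.90, "Recall_edit_dist": 0.40},
    {"Precision_edit_dist": 0.94, "Recall_edit_dist": 0.50},
]

averages = average_metrics(samples)
```

This is an unweighted mean, so each sample counts equally regardless of how many variants it contains.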
hurrah
@iqbal-lab could you give me access to the cortex repo, so that in future I can push code to a minimap2 branch, for example? In the meantime, this is for reviewing my changes.
Code
run_calls.pl always uses the option. Now, this always occurs, as --ref_fasta is used to place variants.
Checks
I have validated that I get the same VCF when running py_cortex_api (it wraps cortex's independent workflow) with stampy and with minimap2, on cortex's demo files (example1).
TODOs