lh3 / bfc

High-performance error correction for Illumina resequencing data
MIT License
code for arXiv evaluation table #5

Closed macmanes closed 9 years ago

macmanes commented 9 years ago

Heng,

Are you willing to share the code you used to evaluate the different error correction algorithms, e.g. Table 1 in your arXiv manuscript?

Matt

lh3 commented 9 years ago

That is the errstat.js script in this repo. You need k8 to run it. The k8 binary can be found at biobin or in bwakit.

Usage: k8 errstat.js <in.unsrt.sam.gz> [to-cmp.unsrt.sam.gz]

The script is tuned for bwa-mem, though other mappers outputting the "NM" tag should work in principle.

EDIT: for numbers in the table, to-cmp.unsrt.sam.gz is the unsorted alignment of uncorrected reads and in.unsrt.sam.gz is the alignment of corrected reads.
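
As a rough illustration of the comparison described above, the Python sketch below reads the "NM" (edit distance) tag from two unsorted SAM files and counts reads whose NM went down, went up, or stayed the same after correction. It is not a port of errstat.js and will not reproduce the numbers in the table; the function names (`read_nm`, `compare`, `open_sam`) and the better/worse/same categories are assumptions made for this example only.

```python
import gzip
import sys


def open_sam(path):
    """Open a SAM file as text, transparently handling .gz input."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def read_nm(sam_lines):
    """Map (read name, mate bits) -> NM for mapped primary alignments."""
    nm = {}
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        flag = int(fields[1])
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800).
        if flag & 0x4 or flag & 0x900:
            continue
        for tag in fields[11:]:
            if tag.startswith("NM:i:"):
                # 0x40/0x80 mark first/second in pair, so mates stay distinct.
                nm[(fields[0], flag & 0xC0)] = int(tag[5:])
                break
    return nm


def compare(corrected, uncorrected):
    """Count reads whose NM decreased, increased or stayed the same."""
    counts = {"better": 0, "worse": 0, "same": 0}
    for key, nm_corrected in corrected.items():
        nm_uncorrected = uncorrected.get(key)
        if nm_uncorrected is None:  # read not mapped in both files
            continue
        if nm_corrected < nm_uncorrected:
            counts["better"] += 1
        elif nm_corrected > nm_uncorrected:
            counts["worse"] += 1
        else:
            counts["same"] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) == 3:
    # Same argument order as the usage line: corrected first, uncorrected second.
    with open_sam(sys.argv[1]) as fc, open_sam(sys.argv[2]) as fu:
        print(compare(read_nm(fc), read_nm(fu)))
```

The sketch keys reads by name plus the first/second-in-pair flag bits, which assumes both files come from the same read set with unchanged read names.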

EDIT2: for the command lines, see tex/README.md

macmanes commented 9 years ago

Up and running. It turns out bfc is pretty good at RNA-seq error correction as well, at least in some preliminary tests.

lh3 commented 9 years ago

I have zero experience with RNA-seq assembly. It is good to know that bfc actually works there. I am now closing this issue. Thank you.