lh3 / bfc

High-performance error correction for Illumina resequencing data
MIT License

Read counter not incrementing #3

Closed aihardin closed 9 years ago

aihardin commented 9 years ago

I'm running bfc on my full set now (1.4 billion reads) and the log is giving me a sequence count that is oscillating. Is this the expected behavior?

bfc -s 2.9g -k 55 -t 31 all_reads.fastq.gz |gzip >all_reads.cor.fastq.gz

[M::bfc_count_cb @4667.7*1946.9%] processed 809616 sequences; # distinct k-mers: 2635769628
[M::bfc_count_cb] read 809443 sequences
[M::bfc_count_cb @4670.6*1947.4%] processed 809443 sequences; # distinct k-mers: 2636069642
[M::bfc_count_cb] read 809500 sequences
[M::bfc_count_cb @4673.9*1947.7%] processed 809500 sequences; # distinct k-mers: 2636370228
[M::bfc_count_cb] read 809470 sequences
[M::bfc_count_cb @4676.6*1948.2%] processed 809470 sequences; # distinct k-mers: 2636673453
[M::bfc_count_cb] read 809400 sequences
[M::bfc_count_cb @4679.5*1948.6%] processed 809400 sequences; # distinct k-mers: 2636970314

lh3 commented 9 years ago

This just means that your read file has this oscillation. Nothing wrong with BFC.

aihardin commented 9 years ago

Sorry if I'm misunderstanding, but doesn't "processed N sequences" refer to the number of reads processed so far?

aihardin commented 9 years ago

Ah, that's reads per chunk, not a running total like the k-mer count.
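
For anyone else reading the log this way, here is a minimal sketch of recovering a running read total from the per-chunk lines. It assumes the log format pasted above, with each "processed N sequences" line reporting one chunk and the distinct k-mer figure already cumulative. The names `running_totals` and `LOG` are hypothetical and not part of bfc.

```python
import re

# Sample taken from the log in this issue; the timing field after '@'
# is ignored by the parser.
LOG = """\
[M::bfc_count_cb @4667.7*1946.9%] processed 809616 sequences; # distinct k-mers: 2635769628
[M::bfc_count_cb] read 809443 sequences
[M::bfc_count_cb @4670.6*1947.4%] processed 809443 sequences; # distinct k-mers: 2636069642
[M::bfc_count_cb] read 809500 sequences
[M::bfc_count_cb @4673.9*1947.7%] processed 809500 sequences; # distinct k-mers: 2636370228
[M::bfc_count_cb] read 809470 sequences
[M::bfc_count_cb @4676.6*1948.2%] processed 809470 sequences; # distinct k-mers: 2636673453
[M::bfc_count_cb] read 809400 sequences
[M::bfc_count_cb @4679.5*1948.6%] processed 809400 sequences; # distinct k-mers: 2636970314
"""

# Matches only the "processed" lines; the "read N sequences" lines are skipped
# so that each chunk is counted once.
PROCESSED = re.compile(r"processed (\d+) sequences; # distinct k-mers: (\d+)")


def running_totals(log_text):
    """Return (chunk_reads, cumulative_reads, distinct_kmers) per chunk."""
    rows = []
    total = 0
    for match in PROCESSED.finditer(log_text):
        chunk_reads = int(match.group(1))
        distinct_kmers = int(match.group(2))
        total += chunk_reads  # per-chunk count, so sum it ourselves
        rows.append((chunk_reads, total, distinct_kmers))
    return rows


if __name__ == "__main__":
    for chunk_reads, total, kmers in running_totals(LOG):
        print(f"chunk={chunk_reads} reads_so_far={total} distinct_kmers={kmers}")
```

The total only covers the chunks present in the text passed in, so for a real run the whole stderr log would need to be captured from the start.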