mbhall88 / rasusa

Randomly subsample sequencing reads or alignments
https://doi.org/10.21105/joss.03941
MIT License

Change fastx parsing library to needletail #23

mbhall88 closed this 3 years ago

mbhall88 commented 3 years ago

Changed

The fastx parsing library is now needletail.

Benchmark

Switching the fastx parsing library to needletail makes this build faster than v0.3.0 in both hyperfine benchmarks below (10 runs each).

Uncompressed fastq

1.48 times faster

Benchmark #1: ./rasusav030 -i tb.fq -c 50 -g 4411532 -s 1
  Time (mean ± σ):      1.645 s ±  0.062 s    [User: 826.0 ms, System: 818.7 ms]
  Range (min … max):    1.582 s …  1.757 s    10 runs

Benchmark #2: ../target/release/rasusa -i tb.fq -c 50 -g 4411532 -s 1
  Time (mean ± σ):      1.109 s ±  0.043 s    [User: 314.0 ms, System: 793.7 ms]
  Range (min … max):    1.076 s …  1.202 s    10 runs

Summary
  '../target/release/rasusa -i tb.fq -c 50 -g 4411532 -s 1' ran
    1.48 ± 0.08 times faster than './rasusav030 -i tb.fq -c 50 -g 4411532 -s 1'

Compressed fastq

1.10 times faster (± 0.13, so the difference is within run-to-run noise)

Benchmark #1: ./rasusav030 -i tb.fq.gz -c 50 -g 4411532 -s 1
  Time (mean ± σ):     33.480 s ±  3.594 s    [User: 32.924 s, System: 0.550 s]
  Range (min … max):   31.258 s … 41.326 s    10 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Benchmark #2: ../target/release/rasusa -i tb.fq.gz -c 50 -g 4411532 -s 1
  Time (mean ± σ):     30.320 s ±  1.382 s    [User: 29.872 s, System: 0.440 s]
  Range (min … max):   29.316 s … 33.215 s    10 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Summary
  '../target/release/rasusa -i tb.fq.gz -c 50 -g 4411532 -s 1' ran
    1.10 ± 0.13 times faster than './rasusav030 -i tb.fq.gz -c 50 -g 4411532 -s 1'
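
How to read the Summary lines

The "times faster" figures are the ratio of the two mean run times. The sketch below recomputes them from the means and standard deviations reported above. It assumes standard first-order error propagation for a quotient of two independent means. That assumption reproduces the reported ± values, but it is not taken from hyperfine's source, and the function name `relative_speed` is only illustrative.

```python
import math


def relative_speed(mean_ref, sd_ref, mean_new, sd_new):
    """Return (ratio, uncertainty) for the reference mean divided by the new mean."""
    ratio = mean_ref / mean_new
    # Assumed: first-order propagation of uncertainty for a quotient,
    # treating the two sets of runs as independent.
    rel_err = math.sqrt((sd_ref / mean_ref) ** 2 + (sd_new / mean_new) ** 2)
    return ratio, ratio * rel_err


# Mean and standard deviation in seconds, copied from the benchmark output above.
uncompressed = relative_speed(1.645, 0.062, 1.109, 0.043)
compressed = relative_speed(33.480, 3.594, 30.320, 1.382)

for label, (ratio, err) in (("uncompressed", uncompressed), ("compressed", compressed)):
    print(f"{label}: {ratio:.2f} ± {err:.2f} times faster")
```

For the compressed input the uncertainty (0.13) is larger than the gain over 1.00 (0.10), which is why that result is best read as "no slower" rather than as a confirmed speed-up.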