GMOD / cram-js

Read CRAM v3 and v2 in node or in the browser
MIT License

Small optimization #109

Closed cmdcolin closed 2 years ago

cmdcolin commented 2 years ago

This PR makes cram parsing about 15% faster on shortread tests in the jb2profile suite, and about 5-10% faster for longreads. These numbers are for the end-to-end test, so for the entire test to be 15% faster, the speedup of the cram parsing portion itself could be greater than 15%.
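The end-to-end caveat above can be made concrete with Amdahl's law. The sketch below is only an illustration: `parsingSpeedup` is a hypothetical helper, and the share of run time spent in cram parsing (`parseFraction`) is an assumption, since the benchmarks here do not measure it.

```javascript
// Hypothetical helper, not part of cram-js or the jb2profile suite.
// endToEndSpeedup: overall ratio, e.g. 1.15 for "1.15 times faster"
// parseFraction: ASSUMED share (0..1) of the original run spent in cram parsing
function parsingSpeedup(endToEndSpeedup, parseFraction) {
  // Amdahl's law: 1/S = (1 - p) + p/s, solved here for s
  const remaining = 1 / endToEndSpeedup - (1 - parseFraction)
  if (remaining <= 0) {
    throw new RangeError('parseFraction is too small to explain this speedup')
  }
  return parseFraction / remaining
}

// Try a few assumed parse shares for a 1.15x end-to-end speedup
for (const p of [1.0, 0.5, 0.25]) {
  const s = parsingSpeedup(1.15, p).toFixed(2)
  console.log(`parse share ${p}: parsing itself is ${s}x faster`)
}
```

If parsing were, say, half of the run, a 1.15x end-to-end speedup would correspond to roughly a 1.35x speedup of the parsing portion alone.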

Results with the 20x, 200x, 400x, 600x, 800x, and 1000x shortread files:

0.02 cram 20x
./profile.sh chr22_mask:124,000-134,000 20x.shortread.cram results/20x-shortread-cram hg19mod
Benchmark 1: node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=20x.shortread.cram" "results/20x-shortread-cram_fps_8001.json" "results/20x-shortread-cram_mem_8001.json"
  Time (mean ± σ):      4.799 s ±  0.095 s    [User: 0.447 s, System: 0.154 s]
  Range (min … max):    4.681 s …  4.940 s    5 runs

Benchmark 2: node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=20x.shortread.cram" "results/20x-shortread-cram_fps_8004.json" "results/20x-shortread-cram_mem_8004.json"
  Time (mean ± σ):      4.328 s ±  0.055 s    [User: 0.434 s, System: 0.124 s]
  Range (min … max):    4.248 s …  4.391 s    5 runs

Summary
  'node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=20x.shortread.cram" "results/20x-shortread-cram_fps_8004.json" "results/20x-shortread-cram_mem_8004.json"' ran
    1.11 ± 0.03 times faster than 'node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=20x.shortread.cram" "results/20x-shortread-cram_fps_8001.json" "results/20x-shortread-cram_mem_8001.json"'

0.20 cram 200x
./profile.sh chr22_mask:124,000-134,000 200x.shortread.cram results/200x-shortread-cram hg19mod
Benchmark 1: node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=200x.shortread.cram" "results/200x-shortread-cram_fps_8001.json" "results/200x-shortread-cram_mem_8001.json"
  Time (mean ± σ):     10.395 s ±  0.295 s    [User: 0.515 s, System: 0.164 s]
  Range (min … max):   10.027 s … 10.751 s    5 runs

Benchmark 2: node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=200x.shortread.cram" "results/200x-shortread-cram_fps_8004.json" "results/200x-shortread-cram_mem_8004.json"
  Time (mean ± σ):      9.222 s ±  0.273 s    [User: 0.539 s, System: 0.149 s]
  Range (min … max):    8.839 s …  9.568 s    5 runs

Summary
  'node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=200x.shortread.cram" "results/200x-shortread-cram_fps_8004.json" "results/200x-shortread-cram_mem_8004.json"' ran
    1.13 ± 0.05 times faster than 'node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=200x.shortread.cram" "results/200x-shortread-cram_fps_8001.json" "results/200x-shortread-cram_mem_8001.json"'

0.40 cram 400x
./profile.sh chr22_mask:124,000-134,000 400x.shortread.cram results/400x-shortread-cram hg19mod
Benchmark 1: node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=400x.shortread.cram" "results/400x-shortread-cram_fps_8001.json" "results/400x-shortread-cram_mem_8001.json"
  Time (mean ± σ):     15.666 s ±  0.331 s    [User: 0.576 s, System: 0.171 s]
  Range (min … max):   15.191 s … 16.003 s    5 runs

Benchmark 2: node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=400x.shortread.cram" "results/400x-shortread-cram_fps_8004.json" "results/400x-shortread-cram_mem_8004.json"
  Time (mean ± σ):     14.011 s ±  0.214 s    [User: 0.595 s, System: 0.180 s]
  Range (min … max):   13.667 s … 14.228 s    5 runs

Summary
  'node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=400x.shortread.cram" "results/400x-shortread-cram_fps_8004.json" "results/400x-shortread-cram_mem_8004.json"' ran
    1.12 ± 0.03 times faster than 'node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=400x.shortread.cram" "results/400x-shortread-cram_fps_8001.json" "results/400x-shortread-cram_mem_8001.json"'

0.60 cram 600x
./profile.sh chr22_mask:124,000-134,000 600x.shortread.cram results/600x-shortread-cram hg19mod
Benchmark 1: node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=600x.shortread.cram" "results/600x-shortread-cram_fps_8001.json" "results/600x-shortread-cram_mem_8001.json"
  Time (mean ± σ):     22.709 s ±  0.814 s    [User: 0.619 s, System: 0.185 s]
  Range (min … max):   21.791 s … 23.768 s    5 runs

Benchmark 2: node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=600x.shortread.cram" "results/600x-shortread-cram_fps_8004.json" "results/600x-shortread-cram_mem_8004.json"
  Time (mean ± σ):     19.108 s ±  1.368 s    [User: 0.600 s, System: 0.177 s]
  Range (min … max):   18.061 s … 21.442 s    5 runs

Summary
  'node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=600x.shortread.cram" "results/600x-shortread-cram_fps_8004.json" "results/600x-shortread-cram_mem_8004.json"' ran
    1.19 ± 0.10 times faster than 'node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=600x.shortread.cram" "results/600x-shortread-cram_fps_8001.json" "results/600x-shortread-cram_mem_8001.json"'

0.80 cram 800x
./profile.sh chr22_mask:124,000-134,000 800x.shortread.cram results/800x-shortread-cram hg19mod
Benchmark 1: node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=800x.shortread.cram" "results/800x-shortread-cram_fps_8001.json" "results/800x-shortread-cram_mem_8001.json"
  Time (mean ± σ):     27.237 s ±  1.082 s    [User: 0.605 s, System: 0.207 s]
  Range (min … max):   26.263 s … 28.928 s    5 runs

Benchmark 2: node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=800x.shortread.cram" "results/800x-shortread-cram_fps_8004.json" "results/800x-shortread-cram_mem_8004.json"
  Time (mean ± σ):     24.159 s ±  1.293 s    [User: 0.613 s, System: 0.206 s]
  Range (min … max):   22.766 s … 25.636 s    5 runs

Summary
  'node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=800x.shortread.cram" "results/800x-shortread-cram_fps_8004.json" "results/800x-shortread-cram_mem_8004.json"' ran
    1.13 ± 0.08 times faster than 'node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=800x.shortread.cram" "results/800x-shortread-cram_fps_8001.json" "results/800x-shortread-cram_mem_8001.json"'

1 cram 1000x
./profile.sh chr22_mask:124,000-134,000 1000x.shortread.cram results/1000x-shortread-cram hg19mod
Benchmark 1: node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=1000x.shortread.cram" "results/1000x-shortread-cram_fps_8001.json" "results/1000x-shortread-cram_mem_8001.json"
  Time (mean ± σ):     32.612 s ±  0.891 s    [User: 0.692 s, System: 0.200 s]
  Range (min … max):   31.645 s … 33.536 s    5 runs

Benchmark 2: node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=1000x.shortread.cram" "results/1000x-shortread-cram_fps_8004.json" "results/1000x-shortread-cram_mem_8004.json"
  Time (mean ± σ):     28.240 s ±  1.316 s    [User: 0.699 s, System: 0.221 s]
  Range (min … max):   26.232 s … 29.876 s    5 runs

Summary
  'node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=1000x.shortread.cram" "results/1000x-shortread-cram_fps_8004.json" "results/1000x-shortread-cram_mem_8004.json"' ran
    1.15 ± 0.06 times faster than 'node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=1000x.shortread.cram" "results/1000x-shortread-cram_fps_8001.json" "results/1000x-shortread-cram_mem_8001.json"'

Results with longreads:

0.02 cram 20x
./profile.sh chr22_mask:124,000-134,000 20x.longread.cram results/20x-longread-cram hg19mod
Benchmark 1: node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=20x.longread.cram" "results/20x-longread-cram_fps_8001.json" "results/20x-longread-cram_mem_8001.json"
  Time (mean ± σ):      5.744 s ±  0.045 s    [User: 0.451 s, System: 0.163 s]
  Range (min … max):    5.675 s …  5.796 s    5 runs

Benchmark 2: node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=20x.longread.cram" "results/20x-longread-cram_fps_8004.json" "results/20x-longread-cram_mem_8004.json"
  Time (mean ± σ):      5.516 s ±  0.063 s    [User: 0.456 s, System: 0.184 s]
  Range (min … max):    5.404 s …  5.556 s    5 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Summary
  'node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=20x.longread.cram" "results/20x-longread-cram_fps_8004.json" "results/20x-longread-cram_mem_8004.json"' ran
    1.04 ± 0.01 times faster than 'node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=20x.longread.cram" "results/20x-longread-cram_fps_8001.json" "results/20x-longread-cram_mem_8001.json"'

0.20 cram 200x
./profile.sh chr22_mask:124,000-134,000 200x.longread.cram results/200x-longread-cram hg19mod
Benchmark 1: node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=200x.longread.cram" "results/200x-longread-cram_fps_8001.json" "results/200x-longread-cram_mem_8001.json"
  Time (mean ± σ):     16.081 s ±  0.159 s    [User: 0.597 s, System: 0.230 s]
  Range (min … max):   15.876 s … 16.256 s    5 runs

Benchmark 2: node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=200x.longread.cram" "results/200x-longread-cram_fps_8004.json" "results/200x-longread-cram_mem_8004.json"
  Time (mean ± σ):     15.331 s ±  0.416 s    [User: 0.556 s, System: 0.213 s]
  Range (min … max):   14.690 s … 15.686 s    5 runs

Summary
  'node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=200x.longread.cram" "results/200x-longread-cram_fps_8004.json" "results/200x-longread-cram_mem_8004.json"' ran
    1.05 ± 0.03 times faster than 'node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=200x.longread.cram" "results/200x-longread-cram_fps_8001.json" "results/200x-longread-cram_mem_8001.json"'

0.40 cram 400x
./profile.sh chr22_mask:124,000-134,000 400x.longread.cram results/400x-longread-cram hg19mod
Benchmark 1: node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=400x.longread.cram" "results/400x-longread-cram_fps_8001.json" "results/400x-longread-cram_mem_8001.json"
  Time (mean ± σ):     26.073 s ±  1.061 s    [User: 0.607 s, System: 0.248 s]
  Range (min … max):   25.075 s … 27.643 s    5 runs

Benchmark 2: node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=400x.longread.cram" "results/400x-longread-cram_fps_8004.json" "results/400x-longread-cram_mem_8004.json"
  Time (mean ± σ):     24.245 s ±  0.830 s    [User: 0.608 s, System: 0.217 s]
  Range (min … max):   23.003 s … 25.156 s    5 runs

Summary
  'node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=400x.longread.cram" "results/400x-longread-cram_fps_8004.json" "results/400x-longread-cram_mem_8004.json"' ran
    1.08 ± 0.06 times faster than 'node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=400x.longread.cram" "results/400x-longread-cram_fps_8001.json" "results/400x-longread-cram_mem_8001.json"'

0.60 cram 600x
./profile.sh chr22_mask:124,000-134,000 600x.longread.cram results/600x-longread-cram hg19mod
Benchmark 1: node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=600x.longread.cram" "results/600x-longread-cram_fps_8001.json" "results/600x-longread-cram_mem_8001.json"
  Time (mean ± σ):     36.082 s ±  0.906 s    [User: 0.651 s, System: 0.277 s]
  Range (min … max):   35.126 s … 37.209 s    5 runs

Benchmark 2: node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=600x.longread.cram" "results/600x-longread-cram_fps_8004.json" "results/600x-longread-cram_mem_8004.json"
  Time (mean ± σ):     33.503 s ±  0.752 s    [User: 0.673 s, System: 0.268 s]
  Range (min … max):   32.387 s … 34.415 s    5 runs

Summary
  'node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=600x.longread.cram" "results/600x-longread-cram_fps_8004.json" "results/600x-longread-cram_mem_8004.json"' ran
    1.08 ± 0.04 times faster than 'node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=600x.longread.cram" "results/600x-longread-cram_fps_8001.json" "results/600x-longread-cram_mem_8001.json"'

0.80 cram 800x
./profile.sh chr22_mask:124,000-134,000 800x.longread.cram results/800x-longread-cram hg19mod
Benchmark 1: node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=800x.longread.cram" "results/800x-longread-cram_fps_8001.json" "results/800x-longread-cram_mem_8001.json"
  Time (mean ± σ):     47.426 s ±  0.659 s    [User: 0.708 s, System: 0.325 s]
  Range (min … max):   46.572 s … 48.385 s    5 runs

Benchmark 2: node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=800x.longread.cram" "results/800x-longread-cram_fps_8004.json" "results/800x-longread-cram_mem_8004.json"
  Time (mean ± σ):     41.976 s ±  1.387 s    [User: 0.748 s, System: 0.286 s]
  Range (min … max):   40.099 s … 43.651 s    5 runs

Summary
  'node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=800x.longread.cram" "results/800x-longread-cram_fps_8004.json" "results/800x-longread-cram_mem_8004.json"' ran
    1.13 ± 0.04 times faster than 'node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=800x.longread.cram" "results/800x-longread-cram_fps_8001.json" "results/800x-longread-cram_mem_8001.json"'

1 cram 1000x
./profile.sh chr22_mask:124,000-134,000 1000x.longread.cram results/1000x-longread-cram hg19mod
Benchmark 1: node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=1000x.longread.cram" "results/1000x-longread-cram_fps_8001.json" "results/1000x-longread-cram_mem_8001.json"
  Time (mean ± σ):     54.793 s ±  1.275 s    [User: 0.764 s, System: 0.330 s]
  Range (min … max):   53.128 s … 56.579 s    5 runs

Benchmark 2: node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=1000x.longread.cram" "results/1000x-longread-cram_fps_8004.json" "results/1000x-longread-cram_mem_8004.json"
  Time (mean ± σ):     50.722 s ±  0.674 s    [User: 0.761 s, System: 0.345 s]
  Range (min … max):   49.779 s … 51.536 s    5 runs

Summary
  'node profile_jb2web.js "http://localhost:8004/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=1000x.longread.cram" "results/1000x-longread-cram_fps_8004.json" "results/1000x-longread-cram_mem_8004.json"' ran
    1.08 ± 0.03 times faster than 'node profile_jb2web.js "http://localhost:8001/?loc=chr22_mask:124,000-134,000&assembly=hg19mod&tracks=1000x.longread.cram" "results/1000x-longread-cram_fps_8001.json" "results/1000x-longread-cram_mem_8001.json"'

Overall, not a gigantic speedup, but there could maybe be more gains over time. Note: this primarily uses typed arrays instead of Buffer. The typed array usage may not work on big-endian machines (typed arrays use native endianness), but big-endian machines are exceedingly rare.
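To illustrate the endianness caveat, here is a minimal sketch, not code from this PR. The function names are hypothetical. It contrasts a typed-array read, which follows the host byte order, with a `DataView` read, which takes an explicit little-endian flag.

```javascript
// Hypothetical helpers for illustration; not taken from the cram-js source.

// Detect host byte order by inspecting how a known 32-bit value is laid out.
function isLittleEndian() {
  const probe = new Uint8Array(new Uint32Array([0x01020304]).buffer)
  return probe[0] === 0x04
}

// Reads 4 bytes as an int32 in HOST byte order via a typed array view.
function readInt32Native(bytes, offset) {
  // Copy first: an Int32Array view needs a 4-byte-aligned buffer offset
  const copy = bytes.slice(offset, offset + 4)
  return new Int32Array(copy.buffer)[0]
}

// Reads 4 bytes as a little-endian int32 regardless of host byte order.
function readInt32LE(bytes, offset) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength)
  return view.getInt32(offset, true) // true = little-endian
}

const bytes = new Uint8Array([0x78, 0x56, 0x34, 0x12])
console.log('host is little-endian:', isLittleEndian())
console.log('native read:', readInt32Native(bytes, 0).toString(16))
console.log('explicit LE read:', readInt32LE(bytes, 0).toString(16))
```

On a little-endian host both reads agree. On a big-endian host only the `DataView` read would still return the little-endian interpretation, which is the tradeoff the note above accepts.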

codecov[bot] commented 2 years ago

Codecov Report

Merging #109 (fb54095) into master (32ae792) will decrease coverage by 0.17%. The diff coverage is 85.13%.

@@            Coverage Diff             @@
##           master     #109      +/-   ##
==========================================
- Coverage   86.24%   86.07%   -0.18%     
==========================================
  Files          41       41              
  Lines        2014     2032      +18     
  Branches      411      415       +4     
==========================================
+ Hits         1737     1749      +12     
- Misses        248      252       +4     
- Partials       29       31       +2     
| Impacted Files | Coverage Δ |
| --- | --- |
| src/cramFile/file.js | 81.72% <54.54%> (-1.71%) :arrow_down: |
| src/cramFile/slice/decodeRecord.js | 92.18% <87.75%> (-0.29%) :arrow_down: |
| src/cramFile/codecs/byteArrayLength.js | 95.83% <100.00%> (ø) |
| src/cramFile/codecs/byteArrayStop.js | 86.20% <100.00%> (-0.46%) :arrow_down: |
| src/cramFile/codecs/external.js | 85.18% <100.00%> (-1.03%) :arrow_down: |
| src/cramFile/container/compressionScheme.js | 98.43% <100.00%> (+0.05%) :arrow_up: |
| src/cramFile/slice/index.js | 87.14% <100.00%> (+0.12%) :arrow_up: |
