varfish-org / mehari

VEP-like tool for sequence ontology and HGVS annotation of VCF files
MIT License

Profile mehari #430

Open xiamaz opened 6 months ago

xiamaz commented 6 months ago

Is your feature request related to a problem? Please describe. We need to understand which code paths influence mehari's performance.

Describe the solution you'd like Use flamegraph and DHAT to understand hot code paths and allocations. Add a branch with coz support. Look into criterion (https://crates.io/crates/criterion) for benchmarking.

Describe alternatives you've considered None

Additional context This is the basis for additional work.

tedil commented 5 months ago

profiling

Profiling was done with a dedicated Cargo profile (release settings plus debug info):

[profile.profiling]
inherits = "release"
debug = true

on a machine with the following specs:

CPU: quad core Intel Xeon E5-1630 v3 (-MT MCP-)
speed/min/max: 1621/1200/3800 MHz Kernel: 6.5.0-28-generic x86_64
Mem: 9386.1/64180.0 MiB (14.6%) Storage: 685.61 GiB (26.7% used) Procs: 337
Drives:
  Local Storage: total: 685.61 GiB used: 182.78 GiB (26.7%)
  ID-1: /dev/sda vendor: Samsung model: MZ7LN256HCHP-00000 size: 238.47 GiB  # hosts /, /home
  ID-2: /dev/sdb model: MZ7KH480HAHQ0D3 size: 447.13 GiB  # hosts mehari DB

flamegraph

Uses only the first 1M records of the input file (NA-12878WGS_dragen.vcf.gz).

invocation

cargo flamegraph --profile profiling --bin mehari -- annotate seqvars --path-db /mnt/data/mehari/0.21.0/db --path-input-vcf tests/data/annotate/seqvars/NA-12878WGS_dragen.first1M.vcf.gz --path-output-vcf /tmp/NA-12878WGS_dragen.first1M.annotated.vcf.gz --path-input-ped data/FAM_BE_10.sup.ped

result

flamegraph (click here for the interactive version)

dhat

Uses only the first 100k records of the input file (NA-12878WGS_dragen.vcf.gz).

invocation

cargo run --profile profiling --features dhat-heap --bin mehari -- annotate seqvars --path-db /mnt/data/mehari/0.21.0/db --path-input-vcf tests/data/annotate/seqvars/NA-12878WGS_dragen.first100k.vcf.gz --path-output-vcf /tmp/NA-12878WGS_dragen.first100k.annotated.vcf.gz --path-input-ped data/FAM_BE_10.sup.ped

result

dhat-heap.json (view in dhat-viewer)
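The `--features dhat-heap` flag in the invocation above suggests the heap profiler is gated behind a Cargo feature. A minimal sketch of how such a feature is commonly wired up with the `dhat` crate follows. It mirrors the pattern from the dhat documentation rather than mehari's actual source, and `with_heap_profiling` is a hypothetical helper introduced only for illustration.

```rust
// Sketch only: assumes a Cargo feature named `dhat-heap` that enables an
// optional dependency on the `dhat` crate. Without the feature, all
// dhat-related code is compiled out and the helper just runs the closure.

#[cfg(feature = "dhat-heap")]
#[global_allocator]
static ALLOC: dhat::Alloc = dhat::Alloc;

/// Hypothetical helper: runs `work`, recording heap allocations when the
/// `dhat-heap` feature is enabled.
pub fn with_heap_profiling<T>(work: impl FnOnce() -> T) -> T {
    // The profiler writes `dhat-heap.json` when this guard is dropped,
    // i.e. when this function returns.
    #[cfg(feature = "dhat-heap")]
    let _profiler = dhat::Profiler::new_heap();

    work()
}
```

In a real binary the guard would usually be created at the top of `main` so that it covers the whole run.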

coz

Figure out reasonable scopes before running coz.
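For orientation, coz needs progress points or latency scopes in the code to measure against. A rough sketch of what per-record instrumentation could look like with the `coz` crate's `scope!` and `progress!` macros is shown here. The feature name `coz`, the function `annotate_record`, and the choice of one progress point per record are all assumptions, not decisions made in this thread.

```rust
// Sketch only: assumes an optional dependency on the `coz` crate behind a
// hypothetical Cargo feature named `coz`. `annotate_record` is a stand-in,
// not a mehari function.

fn annotate_record(record: &str) -> String {
    // Latency scope: coz measures the time spent between entering and
    // leaving this function body.
    #[cfg(feature = "coz")]
    coz::scope!("annotate_record");

    record.to_uppercase()
}

pub fn annotate_all(records: &[&str]) -> Vec<String> {
    let mut out = Vec::with_capacity(records.len());
    for record in records {
        out.push(annotate_record(record));
        // Throughput progress point: one tick per annotated record.
        #[cfg(feature = "coz")]
        coz::progress!("record_annotated");
    }
    out
}
```

A per-record progress point matches the "seconds per 1M records" metric used for the timings in this issue, which is why it is used in the sketch.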

tedil commented 5 months ago

At the moment, my reading of this is roughly:

tedil commented 5 months ago

caching build_alignment_mapper: 103s/1M records → 71s/1M records
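The idea behind this kind of caching can be sketched as plain memoization: build the expensive object once per key and hand out shared references afterwards. The sketch below is not mehari's implementation. `Mapper`, `MapperCache`, and the assumption that a mapper depends only on a transcript identifier are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Stand-in for an object that is expensive to build per transcript.
#[derive(Debug)]
pub struct Mapper {
    pub tx_id: String,
}

/// Hypothetical cache keyed by transcript ID.
#[derive(Default)]
pub struct MapperCache {
    mappers: HashMap<String, Arc<Mapper>>,
    /// Counts how often the expensive build step actually ran.
    pub builds: usize,
}

impl MapperCache {
    /// Returns the cached mapper for `tx_id`, building it on first use.
    pub fn get_or_build(&mut self, tx_id: &str) -> Arc<Mapper> {
        if let Some(mapper) = self.mappers.get(tx_id) {
            return Arc::clone(mapper);
        }
        // Cache miss: this is where the expensive construction would happen.
        self.builds += 1;
        let mapper = Arc::new(Mapper { tx_id: tx_id.to_string() });
        self.mappers.insert(tx_id.to_string(), Arc::clone(&mapper));
        mapper
    }
}
```

Since many variants in a WGS VCF fall on the same transcripts, repeated lookups hit the cache and skip the build step, which is consistent with the drop in runtime reported above.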

tedil commented 4 months ago

Setting lto = "fat" and codegen-units = 1 shaves another ~7s off the 1M records timing. (64s/1M records)
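As a config sketch, these two settings would look roughly as follows in Cargo.toml. The thread does not say which profile they were added to. Placing them under the release profile is an assumption, in which case a profile that declares `inherits = "release"` would pick them up as well.

```toml
# Assumption: settings applied to the release profile.
[profile.release]
lto = "fat"          # whole-program link-time optimization across all crates
codegen-units = 1    # one codegen unit per crate: slower builds, more optimization
```

Both settings typically increase compile time noticeably, so they are usually reserved for release builds.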

tedil commented 4 months ago

Using noodles Async{Reader, Writer} shaves another ~10s off the 1M records timing. (54s/1M records)
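To put the timings reported in this thread side by side, the following sketch computes the share of wall-clock time saved relative to the 103 s starting point and the resulting throughput. The numbers are taken from the comments above. The function names and step labels are made up for this example.

```rust
/// Seconds per 1M records after each step, as reported in this thread.
pub const TIMINGS: [(&str, f64); 4] = [
    ("before caching", 103.0),
    ("cached build_alignment_mapper", 71.0),
    ("fat LTO, codegen-units = 1", 64.0),
    ("noodles async reader/writer", 54.0),
];

/// Percentage of wall-clock time saved going from `before` to `after`.
pub fn percent_saved(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}

/// Records per second for a run over 1M records that took `seconds`.
pub fn records_per_second(seconds: f64) -> f64 {
    1_000_000.0 / seconds
}

/// One summary line per step, always relative to the first entry.
pub fn report() -> Vec<String> {
    let start = TIMINGS[0].1;
    TIMINGS
        .iter()
        .map(|(label, secs)| {
            format!(
                "{label}: {secs} s, {:.1}% saved, {:.0} records/s",
                percent_saved(start, *secs),
                records_per_second(*secs)
            )
        })
        .collect()
}
```

Going from 103 s to 54 s saves about 47.6% of the runtime, which corresponds to a speedup of roughly 1.9x.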

holtgrewe commented 4 months ago

@tedil impressive 50% time saved!