varfish-org / hgvs-rs

A port of biocommons/hgvs to the Rust programming language
Apache License 2.0

Tune translate_cds implementation #80

Closed: holtgrewe closed this issue 1 year ago

holtgrewe commented 1 year ago

A lot of time in mehari is consumed in the translate_cds implementation.

We should properly tune and benchmark this function in hgvs-rs.

holtgrewe commented 1 year ago

cargo bench result for current phf implementation

[image: cargo bench result]

holtgrewe commented 1 year ago

Result after switching to byte arrays with a builtin match. These are two changes at once, so it is not 100% clear which one gains or loses time.

[image: benchmark result]
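For readers without the code at hand, here is a minimal sketch of what "byte arrays with builtin match" could look like. The function names `codon_to_aa1` and `translate_bytes` are hypothetical and not taken from hgvs-rs, only a handful of codons are listed, and the handling of unknown codons (`X`) and trailing partial codons (dropped) is an assumption of this sketch.

```rust
/// Hypothetical helper: map one codon, given as a byte array, to a
/// one-letter amino acid. Only a few codons of the standard genetic code
/// are listed; everything else becomes b'X' in this sketch.
fn codon_to_aa1(codon: [u8; 3]) -> u8 {
    match &codon {
        b"ATG" => b'M',
        b"TGG" => b'W',
        b"TTT" | b"TTC" => b'F',
        b"AAA" | b"AAG" => b'K',
        b"GGT" | b"GGC" | b"GGA" | b"GGG" => b'G',
        b"TAA" | b"TAG" | b"TGA" => b'*',
        _ => b'X',
    }
}

/// Hypothetical translation over raw bytes. A trailing partial codon is
/// silently dropped here (assumption of this sketch).
fn translate_bytes(seq: &[u8]) -> Vec<u8> {
    seq.chunks_exact(3)
        .map(|c| codon_to_aa1([c[0], c[1], c[2]]))
        .collect()
}
```

Calling `translate_bytes(b"ATGTTTGGATAA")` yields the bytes of `MFG*`. The compiler is free to turn the `match` into comparisons or a jump table, which avoids hashing entirely.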

holtgrewe commented 1 year ago

Note: converting the lookup to a hard-coded trie (nested match) does not help. Creating the result with a preallocated capacity helps a bit (5%).
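The capacity point can be sketched as follows. The function name `translate_with_capacity` is hypothetical and the codon lookup is reduced to a placeholder; the only thing this is meant to show is that the output length is known up front (one residue per three bases), so the `String` does not need to grow while it is filled.

```rust
/// Hypothetical sketch: preallocate the output instead of growing it.
/// The codon lookup is a placeholder that only knows ATG and the stop codons.
fn translate_with_capacity(seq: &str) -> String {
    // One amino acid per complete codon, so len / 3 is the final length.
    let mut result = String::with_capacity(seq.len() / 3);
    for codon in seq.as_bytes().chunks_exact(3) {
        let aa = match codon {
            b"ATG" => 'M',
            b"TAA" | b"TAG" | b"TGA" => '*',
            _ => 'X', // placeholder for all codons not listed above
        };
        result.push(aa);
    }
    result
}
```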

holtgrewe commented 1 year ago

Even faster with fast hash maps:

[image: benchmark result]
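The thread does not say which hash map was used. As an illustration of the general idea only, the sketch below plugs a cheap non-cryptographic hasher (FNV-1a, swapped in here as an assumption) into the standard library `HashMap` in place of the default SipHash. All names (`Fnv1a`, `FastMap`, `codon_table`, `translate_with_map`) are hypothetical, and the table lists only three codons.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// FNV-1a (64 bit): a simple hash that is cheap for very short keys.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf29ce484222325) // FNV offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// Standard HashMap, but hashing with FNV-1a instead of SipHash.
type FastMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

/// Hypothetical, deliberately incomplete codon table.
fn codon_table() -> FastMap<[u8; 3], u8> {
    let mut table: FastMap<[u8; 3], u8> = HashMap::default();
    table.insert(*b"ATG", b'M');
    table.insert(*b"TGG", b'W');
    table.insert(*b"TAA", b'*');
    table
}

/// Translate complete codons; codons missing from the table become b'X'.
fn translate_with_map(table: &FastMap<[u8; 3], u8>, seq: &[u8]) -> Vec<u8> {
    seq.chunks_exact(3)
        .map(|c| *table.get(&[c[0], c[1], c[2]]).unwrap_or(&b'X'))
        .collect()
}
```

Building the table once and passing it by reference keeps construction cost out of the hot loop.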

holtgrewe commented 1 year ago

Changing everything to bytes makes the interfaces painful.

I found an implementation that keeps the interfaces on str and String but still gives a speedup of almost 4x.

[image: benchmark result]
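The implementation itself is not shown in the thread. One way to get this shape, sketched here under that caveat, is to accept `&str` and return `String` at the boundary while indexing a 64-entry table from the raw bytes internally. The names `AA_TCAG`, `base_index`, and `translate_cds_sketch` are hypothetical; the table is the standard genetic code in T, C, A, G order, and mapping codons with other characters to `X` is an assumption of this sketch.

```rust
/// Standard genetic code, indexed by 16 * b1 + 4 * b2 + b3 with T=0, C=1, A=2, G=3.
const AA_TCAG: &[u8; 64] =
    b"FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

/// Map a nucleotide byte to its 2-bit index, or None for anything else.
fn base_index(base: u8) -> Option<usize> {
    match base {
        b'T' | b't' | b'U' | b'u' => Some(0),
        b'C' | b'c' => Some(1),
        b'A' | b'a' => Some(2),
        b'G' | b'g' => Some(3),
        _ => None,
    }
}

/// str in, String out; bytes are only used internally.
/// A trailing partial codon is dropped (assumption of this sketch).
pub fn translate_cds_sketch(seq: &str) -> String {
    let mut out = String::with_capacity(seq.len() / 3);
    for codon in seq.as_bytes().chunks_exact(3) {
        let idx = base_index(codon[0])
            .zip(base_index(codon[1]))
            .zip(base_index(codon[2]))
            .map(|((a, b), c)| a * 16 + b * 4 + c);
        out.push(match idx {
            Some(i) => AA_TCAG[i] as char,
            None => 'X', // codon contains a character other than T/U, C, A, G
        });
    }
    out
}
```

With this layout, `translate_cds_sketch("ATGGCCTGA")` returns `"MA*"`, and callers never see a byte slice.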