messense / crfs-rs

Pure Rust port of CRFsuite: a fast implementation of Conditional Random Fields (CRFs)
MIT License

Decouple inference state from model #9

Open xd009642 opened 2 years ago

xd009642 commented 2 years ago

While reading the code I noticed that the Viterbi state is held inside the context and mutated during tagging. This prevents a single tagger from serving multiple tag calls concurrently, which limits performance for users who want to tag many sequences in parallel. Their only options are to wrap the tagger in a mutex or to clone it along with all of its data (including the mutable state).

A more multi-threading-friendly design would be to split the fields that mutate into a new struct, something like ViterbiState, and change the viterbi implementation to fn viterbi(&self, state: &mut ViterbiState). The tag function in the tagger would then become fn tag(&self, xseq: &[T]): it creates a ViterbiState and passes it into the call to viterbi. This would also remove or simplify a lot of the reset code.
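A minimal sketch of the split described above, not the crate's actual code. The Tagger and ViterbiState types, their fields, and the tag_in_parallel helper are all hypothetical, and to keep the example short tag takes precomputed per-position label scores instead of a generic xseq: &[T]. It assumes at least one label and that every row of emission has one score per label.

```rust
use std::thread;

/// Hypothetical per-call scratch space: everything Viterbi mutates lives here.
struct ViterbiState {
    score: Vec<Vec<f64>>,         // score[t][j]: best path score ending in label j at position t
    backpointer: Vec<Vec<usize>>, // backpointer[t][j]: previous label on that best path
}

impl ViterbiState {
    fn new(len: usize, num_labels: usize) -> Self {
        ViterbiState {
            score: vec![vec![0.0; num_labels]; len],
            backpointer: vec![vec![0; num_labels]; len],
        }
    }
}

/// Hypothetical immutable model data; nothing here changes during tagging.
struct Tagger {
    transition: Vec<Vec<f64>>, // transition[i][j]: score of moving from label i to label j
}

impl Tagger {
    /// Takes &self, so the model is only read; all writes go to `state`.
    fn viterbi(&self, emission: &[Vec<f64>], state: &mut ViterbiState) -> Vec<usize> {
        let n = emission.len();
        if n == 0 {
            return Vec::new();
        }
        let labels = self.transition.len();
        state.score[0].copy_from_slice(&emission[0]);
        for t in 1..n {
            for j in 0..labels {
                let (mut best, mut best_score) = (0, f64::NEG_INFINITY);
                for i in 0..labels {
                    let s = state.score[t - 1][i] + self.transition[i][j];
                    if s > best_score {
                        best_score = s;
                        best = i;
                    }
                }
                state.score[t][j] = best_score + emission[t][j];
                state.backpointer[t][j] = best;
            }
        }
        // Pick the best final label, then follow the backpointers to the start.
        let mut last = 0;
        for j in 1..labels {
            if state.score[n - 1][j] > state.score[n - 1][last] {
                last = j;
            }
        }
        let mut path = vec![0; n];
        path[n - 1] = last;
        for t in (1..n).rev() {
            path[t - 1] = state.backpointer[t][path[t]];
        }
        path
    }

    /// Creates a fresh state per call, so there is nothing to reset afterwards.
    fn tag(&self, emission: &[Vec<f64>]) -> Vec<usize> {
        let mut state = ViterbiState::new(emission.len(), self.transition.len());
        self.viterbi(emission, &mut state)
    }
}

/// Tags several sequences at once through one shared &Tagger, with no mutex or clone.
fn tag_in_parallel(tagger: &Tagger, inputs: &[Vec<Vec<f64>>]) -> Vec<Vec<usize>> {
    thread::scope(|s| {
        let handles: Vec<_> = inputs
            .iter()
            .map(|seq| s.spawn(move || tagger.tag(seq)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

Because tag only needs &self, the same tagger can be borrowed by every thread. If allocating a state on each call turns out to be costly, a caller could instead keep one ViterbiState per thread and call viterbi directly.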

messense commented 2 years ago

Thanks for posting this. The code was a naive port of crfsuite, so I'm sure a lot can be optimized.