rth / vtext

Simple NLP in Rust with Python bindings
Apache License 2.0

API Set parameters with the builder pattern #57

Closed rth closed 4 years ago

rth commented 4 years ago

This refactors the API to enable setting parameters with the builder pattern, e.g.

let tokenizer = VTextTokenizerParams::default()
     .lang("en")
     .build()
     .unwrap();

which in this case is equivalent to VTextTokenizer::default().
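For readers unfamiliar with the pattern, here is a minimal hand-written sketch of how such a builder could be structured. It is not the code from this PR: `DemoTokenizerParams`, `DemoTokenizer`, the by-value setters, and the empty-`lang` validation rule are all hypothetical and only illustrate the `default()` / setter / `build()` chain.

```rust
// Hypothetical names throughout; this is not vtext's actual code.

/// Parameters of the tokenizer, set through chained methods.
#[derive(Debug, Clone, PartialEq)]
pub struct DemoTokenizerParams {
    lang: String,
}

impl Default for DemoTokenizerParams {
    fn default() -> Self {
        Self { lang: "en".to_string() }
    }
}

impl DemoTokenizerParams {
    /// Chained setter: consumes the params and returns them updated.
    pub fn lang(mut self, value: &str) -> Self {
        self.lang = value.to_string();
        self
    }

    /// Validates the parameters and constructs the tokenizer.
    /// The empty-string check is an assumed rule, for illustration only.
    pub fn build(self) -> Result<DemoTokenizer, String> {
        if self.lang.is_empty() {
            return Err("lang must not be empty".to_string());
        }
        Ok(DemoTokenizer { params: self })
    }
}

/// The built object keeps the parameters it was created with.
#[derive(Debug, Clone, PartialEq)]
pub struct DemoTokenizer {
    pub params: DemoTokenizerParams,
}

impl Default for DemoTokenizer {
    /// Equivalent to building from the default parameters.
    fn default() -> Self {
        DemoTokenizerParams::default().build().unwrap()
    }
}
```

In this sketch `build()` returns a `Result`, which is why the call chain ends in `.unwrap()`: invalid parameter combinations can be rejected at construction time rather than at first use.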

Vectorizers can similarly be initialized with,

let vect = CountVectorizerParams::default()
           .tokenizer(tokenizer.clone())
           .n_jobs(4)
           .build()
           .unwrap();

Default implementations remain available with e.g. CountVectorizer::<VTextTokenizer>::default(). The API may still change somewhat in this latter part.
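As a rough illustration of how a builder can be generic over the tokenizer type, consider the sketch below. The `Tokenize` trait, `WhitespaceTokenizer`, `DemoVectorizerParams`, `DemoVectorizer`, and the `n_jobs > 0` check are hypothetical stand-ins, not the types or rules used in vtext.

```rust
// Hypothetical stand-ins; not vtext's actual types.

/// Minimal tokenizer interface assumed for this sketch.
pub trait Tokenize {
    fn tokenize<'a>(&self, text: &'a str) -> Vec<&'a str>;
}

#[derive(Debug, Clone, Default)]
pub struct WhitespaceTokenizer;

impl Tokenize for WhitespaceTokenizer {
    fn tokenize<'a>(&self, text: &'a str) -> Vec<&'a str> {
        text.split_whitespace().collect()
    }
}

/// Vectorizer parameters, generic over the tokenizer type `T`.
#[derive(Debug, Clone)]
pub struct DemoVectorizerParams<T> {
    tokenizer: T,
    n_jobs: usize,
}

impl<T: Default> Default for DemoVectorizerParams<T> {
    fn default() -> Self {
        Self { tokenizer: T::default(), n_jobs: 1 }
    }
}

impl<T: Tokenize + Clone> DemoVectorizerParams<T> {
    pub fn tokenizer(mut self, tokenizer: T) -> Self {
        self.tokenizer = tokenizer;
        self
    }

    pub fn n_jobs(mut self, n_jobs: usize) -> Self {
        self.n_jobs = n_jobs;
        self
    }

    /// Rejecting zero workers is an assumed validation rule.
    pub fn build(self) -> Result<DemoVectorizer<T>, String> {
        if self.n_jobs == 0 {
            return Err("n_jobs must be at least 1".to_string());
        }
        Ok(DemoVectorizer { params: self })
    }
}

#[derive(Debug, Clone)]
pub struct DemoVectorizer<T> {
    params: DemoVectorizerParams<T>,
}

impl<T: Tokenize + Clone + Default> Default for DemoVectorizer<T> {
    fn default() -> Self {
        DemoVectorizerParams::<T>::default().build().unwrap()
    }
}

impl<T: Tokenize> DemoVectorizer<T> {
    /// Counts the tokens in one document using the configured tokenizer.
    pub fn count_tokens(&self, text: &str) -> usize {
        self.params.tokenizer.tokenize(text).len()
    }

    pub fn n_jobs(&self) -> usize {
        self.params.n_jobs
    }
}
```

Because the tokenizer type is a generic parameter, the default constructor needs a type annotation, which is why the call is spelled with the turbofish, as in `DemoVectorizer::<WhitespaceTokenizer>::default()`.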

Model parameters are now consistently stored as,

struct Model {
  params: ModelParams,
  // + weights etc
}
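A slightly fuller sketch of this layout, assuming a toy model whose only learned state is a vocabulary, might look as follows. `DemoModel`, `DemoModelParams`, the `lowercase` option, and the `fit` method are hypothetical and serve only to show parameters kept apart from fitted state.

```rust
use std::collections::HashMap;

// Hypothetical names; the point is only the params / learned-state split.

/// User-facing configuration, fixed once the model is built.
#[derive(Debug, Clone, PartialEq)]
pub struct DemoModelParams {
    pub lowercase: bool,
}

impl Default for DemoModelParams {
    fn default() -> Self {
        Self { lowercase: true }
    }
}

#[derive(Debug, Clone)]
pub struct DemoModel {
    params: DemoModelParams,
    // Learned state ("weights etc"): token -> column index.
    vocabulary: HashMap<String, usize>,
}

impl DemoModel {
    pub fn new(params: DemoModelParams) -> Self {
        Self { params, vocabulary: HashMap::new() }
    }

    /// Read-only access to the parameters the model was created with.
    pub fn params(&self) -> &DemoModelParams {
        &self.params
    }

    /// Learns the vocabulary; only the learned state is modified.
    pub fn fit(&mut self, docs: &[&str]) {
        for doc in docs {
            for token in doc.split_whitespace() {
                let token = if self.params.lowercase {
                    token.to_lowercase()
                } else {
                    token.to_string()
                };
                // Assign the next free index to tokens not seen before.
                let next_index = self.vocabulary.len();
                self.vocabulary.entry(token).or_insert(next_index);
            }
        }
    }

    pub fn vocabulary_len(&self) -> usize {
        self.vocabulary.len()
    }
}
```

Keeping `params` as a single field makes it straightforward to inspect or clone a model's configuration independently of whatever it has learned.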

I have not used the derive_builder crate so far, so this does add some boilerplate to the code.