Open btc opened 3 years ago
> Further exploration has led me to discover that this may be occurring simply because `Idf` isn't exposed. Would it be okay for me to submit a patch which modifies `lib.rs` to expose `Idf`?
Hi! That would be fine to submit that patch. I'm surprised that that trait isn't exposed already!
> Is it true that the `idf` implementations exposed by this crate all require an O(n) linear iteration over the documents/corpus?
Just an FYI, I'm only passively maintaining this project. I'm not sure I can answer your question, since I haven't looked at this code in a long time.
> Hi! That would be fine to submit that patch. I'm surprised that that trait isn't exposed already!
Here you go:
Hi! Thanks for publishing this software. It's quite helpful to potentially be able to use your library instead of re-implementing TFIDF myself. I am grateful for the time and attention you've given to this.
For my use-case, I am working with a large corpus of documents and trying to understand if I can use this library in a way which will have suitable performance.
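To make the performance concern concrete: with a large corpus, what matters is whether inverse document frequency is recomputed by scanning every document for each term, or computed once and then looked up. Below is a minimal, self-contained sketch of the second approach. It does not use this crate's API; `IdfTable`, `build`, and `idf` are hypothetical names chosen for illustration, and the smoothed formula `ln(1 + N / df)` is just one common IDF variant, assumed here.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical helper (not part of this crate): document frequencies
/// gathered in a single pass over the corpus.
struct IdfTable {
    doc_count: usize,
    doc_freq: HashMap<String, usize>,
}

impl IdfTable {
    /// Visits every document once; cost is proportional to the total token count.
    fn build(corpus: &[Vec<&str>]) -> Self {
        let mut doc_freq: HashMap<String, usize> = HashMap::new();
        for doc in corpus {
            // Deduplicate so a term counts at most once per document.
            let unique: HashSet<&str> = doc.iter().copied().collect();
            for term in unique {
                *doc_freq.entry(term.to_string()).or_insert(0) += 1;
            }
        }
        IdfTable {
            doc_count: corpus.len(),
            doc_freq,
        }
    }

    /// Smoothed IDF, ln(1 + N / df): a hash lookup instead of a corpus scan.
    /// Returns 0.0 for terms that never appeared in the corpus (an assumed convention).
    fn idf(&self, term: &str) -> f64 {
        match self.doc_freq.get(term) {
            Some(&df) => (1.0 + self.doc_count as f64 / df as f64).ln(),
            None => 0.0,
        }
    }
}
```

With a table like this, the O(n) pass over the corpus is paid once in `IdfTable::build`, and each later `idf` call is an average-case constant-time lookup.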
Examples in this repo have the form:
where `compute_tfidf` is either `TfIdfDefault::tfidf` or `MyTfIdfStrategy::tfidf`.
- Is it true that the `idf` implementations exposed by this crate all require an O(n) linear iteration over the documents/corpus?
- Is it possible to use the `idf` functions on their own, without going through `tfidf`?
Presented in pseudocode here, I would like to do the following:
Concretely, I have tried to use the library in the following way, but ran into an error that I don't quite understand yet:
Error 1:
Error 2:
Further exploration has led me to discover that this may be occurring simply because `Idf` isn't exposed. Would it be okay for me to submit a patch which modifies `lib.rs` to expose `Idf`?
by changing:
proposed: