greyblake / whatlang-rs

Natural language detection library for Rust. Try demo online: https://whatlang.org/
MIT License
969 stars, 109 forks

exposing raw_detect_script() #121

Open thed0ct0r opened 2 years ago

thed0ct0r commented 2 years ago

have to say i really love the work put into this very nice crate. the optimizations are very thoughtful.

i'm working on a little side/pet project that analyzes commit messages and code comments in git repositories. and for my use case i'm interested in all the scripts found in the text. raw_detect_script() is perfect for me - as it returns the whole array of scripts detected and i can calculate their ratios, etc...
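To illustrate the ratio calculation described above, here is a minimal, self-contained sketch. It does not call whatlang at all: `CoarseScript`, `classify`, and `script_ratios` are hypothetical names invented for this example, and the Unicode ranges are a hand-picked simplification rather than the crate's actual heuristics. With access to the raw per-script counters, the counting loop would presumably be replaced by whatever `raw_detect_script()` returns.

```rust
use std::collections::HashMap;

/// Hypothetical, coarse script classes (far fewer than a real detector tracks).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum CoarseScript {
    Latin,
    Greek,
    Cyrillic,
    Hebrew,
    Arabic,
    Other,
}

/// Classify a single character using a few Unicode blocks.
/// Assumption: these ranges are simplified for illustration only.
fn classify(c: char) -> Option<CoarseScript> {
    if !c.is_alphabetic() {
        return None; // digits, punctuation and whitespace are not counted
    }
    let script = match c as u32 {
        0x0041..=0x024F => CoarseScript::Latin, // Basic Latin .. Latin Extended-B
        0x0370..=0x03FF => CoarseScript::Greek,
        0x0400..=0x04FF => CoarseScript::Cyrillic,
        0x0590..=0x05FF => CoarseScript::Hebrew,
        0x0600..=0x06FF => CoarseScript::Arabic,
        _ => CoarseScript::Other,
    };
    Some(script)
}

/// Return the share of alphabetic characters that belongs to each script.
/// The map is empty when the text has no alphabetic characters.
fn script_ratios(text: &str) -> HashMap<CoarseScript, f64> {
    let mut counts: HashMap<CoarseScript, usize> = HashMap::new();
    let mut total = 0usize;
    for script in text.chars().filter_map(classify) {
        *counts.entry(script).or_insert(0) += 1;
        total += 1;
    }
    counts
        .into_iter()
        .map(|(script, n)| (script, n as f64 / total as f64))
        .collect()
}
```

For a mixed commit message such as `"fix bug исправить"`, `script_ratios` reports 6 of 15 letters as Latin (0.4) and 9 of 15 as Cyrillic (0.6).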

this can probably benefit others as well, but of course i do not pretend to speak for anyone else.

if preferable i can send this as a pull request - did not know what would be more appropriate here.

greyblake commented 2 years ago

@thed0ct0r Hi, I see what you mean. I had this in mind from day 0 of the project, but I was not sure about making this part of the API public, because the heuristics may evolve over time.

I cannot promise that I will change anything soon. I work on the project from time to time, when I want to. It's more of a therapy for me.

For now you can use the dev feature if you want access to those raw_ functions. Keep in mind that I don't give any guarantees, and they may change in a future version of the lib. The primary goal of the dev feature is to expose some internals of the lib so I can benchmark them separately.

For an example, please check https://github.com/whatlang/whatlang-accuracy-benchmark