Catalyst is a C# Natural Language Processing library built for speed. Inspired by spaCy's design, it brings pre-trained models, out-of-the-box support for training word and document embeddings, and flexible entity recognition models.
MIT License, 742 stars, 75 forks
LanguageDetector.FromStoreAsync(): Why can't we pass an Array of Language? #87
Good evening
Thank you very much for sharing your great work!
When testing LanguageDetector, Catalyst often detects languages that are known in advance not to be an option.
Therefore, it would be very useful if we could pass LanguageDetector a List<Language> that tells it which languages are possible at all.
Is your feature request related to a problem? Please describe.
For example, if one only uses English, German, and French texts, LanguageDetector often detects Norwegian.
Describe the solution you'd like
I am pretty sure that if we could tell LanguageDetector that the text can only be in one of three languages, it would pick the right language much more often.
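To make the idea concrete, here is a minimal sketch of what restricting the candidates could look like. It deliberately does not use any Catalyst API: `PickBest` is a hypothetical helper, the language names are plain strings rather than Catalyst's `Language` values, and the scores are made up for illustration. It also assumes that per-language scores are available from the detector, which I have not verified.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RestrictedLanguagePick
{
    // Hypothetical helper, not part of Catalyst: returns the highest-scoring
    // language among the allowed candidates, or null if none of them was scored.
    public static string PickBest(
        IReadOnlyDictionary<string, double> scores,
        IEnumerable<string> allowed)
    {
        var allowedSet = new HashSet<string>(allowed, StringComparer.OrdinalIgnoreCase);
        return scores
            .Where(kv => allowedSet.Contains(kv.Key))   // drop languages that are not an option
            .OrderByDescending(kv => kv.Value)          // best remaining score first
            .Select(kv => kv.Key)
            .FirstOrDefault();
    }

    public static void Main()
    {
        // Made-up scores for illustration only; not real detector output.
        var scores = new Dictionary<string, double>
        {
            ["Norwegian"] = 0.41,
            ["German"] = 0.38,
            ["English"] = 0.15,
            ["French"] = 0.06,
        };
        var allowed = new[] { "English", "German", "French" };

        // Norwegian has the top raw score but is filtered out.
        Console.WriteLine(PickBest(scores, allowed)); // prints "German"
    }
}
```

With an unrestricted pick, the example scores would yield Norwegian; limiting the candidates to the three expected languages yields German instead, which is the behaviour I am asking for.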
Describe alternatives you've considered
I tried downloading just the NuGet language models for English, German and French, but LanguageDetector nevertheless detected Norwegian. Very strange. It looks like LanguageDetector downloads language models automatically (great!), but this feature is not mentioned anywhere in the code comments.
Thanks a lot, kind regards, Thomas