-
Some C++ programmers use alternative tokens instead of symbol tokens. We should have a way to ensure that the tokens can be checked.
```yaml
LogicalOperatorSpelling: Symbol|Words|AsIs|Custom
Lo…
-
Hello,
I'm testing Hunspell's (1.7.0) capabilities for the Hungarian language with the LibreOffice dictionary and aff files, and I'm amazed at how well it handles difficult compound words, complex suffix…
-
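Some compound words look pleasing and intuitive, for example accountNumber; others look clumsy and long-winded. What are the subtleties behind this? 🤔
References: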
* https://grammar.yourdictionary.com/parts-of-speech/nouns/compound-noun.html
* https://www.mit.edu/course/21/21.guide/stacked.htm…
twy30 updated
3 years ago
-
It would be helpful to provide phrase hints (context words) at inference time to boost the probability of certain domain-specific phrases in the transcription.
E.g. when passing an audio to python…
-
I'm using Tesseract with the legacy dictionary (no LSTM dictionary) for Japanese recognition.
I added custom words to the user-words dictionary, but it seems to work only when a word is matched in full.
Japanese langu…
-
@danielnaber, looking at our artificially generated compounds with unwarranted spaces reveals quite a few funny compound splittings. Here's a first series of wrongly split words:
```
auszug; Ehen …
-
This issue was created automatically with bugzilla2github
# Bugzilla Bug 2686
Date: 2020-09-30T13:24:59+02:00
From: Thomas Omma <>
To: Linda Wiechetek <>
CC: linda.wiechetek, sjur.n.mo…
-
The dictionary contains phrases like "fed up", but since the code checks whether words are in the dictionary on a word-by-word basis, these phrases never hit:
```
> from vaderSentiment.vaderSentiment impo…
-
I have a workable beginning of a Toki Pona sentence generator (a constrained Markov chain).
All it should take to finish is to generate enough text to pass the 50,000-word threshold.
EDIT: Toki Pona i…
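For readers unfamiliar with the approach, here is a minimal sketch of what a constrained Markov chain generator that runs until a word threshold might look like. It is not the code from this issue: the toy corpus, the function names (`build_chain`, `generate_sentence`, `generate_text`) and the constraint itself (a sentence must end properly within `max_words` words) are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical toy corpus; the issue does not show its training data.
CORPUS = [
    "mi moku e kili",
    "sina lukin e tomo",
    "jan li moku e telo",
    "mi lukin e jan",
    "sina pona",
]

START, END = "<s>", "</s>"


def build_chain(sentences):
    """Map each word to the list of words observed right after it."""
    chain = defaultdict(list)
    for sentence in sentences:
        words = [START] + sentence.split() + [END]
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def generate_sentence(chain, rng, max_words=12):
    """Random walk from START; retry until the walk reaches END
    within max_words words (the assumed constraint)."""
    while True:
        words, state = [], START
        while len(words) <= max_words:
            state = rng.choice(chain[state])
            if state == END:
                break
            words.append(state)
        if state == END and words:
            return words


def generate_text(chain, target_words, seed=0):
    """Keep generating sentences until the word count reaches the target."""
    rng = random.Random(seed)
    sentences, total = [], 0
    while total < target_words:
        words = generate_sentence(chain, rng)
        sentences.append(" ".join(words))
        total += len(words)
    return sentences, total


if __name__ == "__main__":
    chain = build_chain(CORPUS)
    sentences, total = generate_text(chain, target_words=50_000)
    print(total, "words in", len(sentences), "sentences")
    print(sentences[0])
```

Rejecting and retrying a walk is the simplest way to enforce a constraint; a real generator would presumably use a much larger corpus and stricter grammatical checks.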
-
In issue #120, we described the possibility of showing the document in alternative views with different highlights. Now, with the four units of learning almost decided upon, we may actually need this …