tesseract-ocr/langdata_lstm
Data used for LSTM model training
Apache License 2.0
114 stars, 151 forks
issues
NO fas.unicharset and fas.xheights file for Persian Language
#60
AinazRafiei
opened
1 month ago
6
Rename frk -> deu_latf (ISO 639-3, ISO 15924)
#59
stweil
closed
4 months ago
15
grc letters with dot below
#57
nisbet-hubbard
opened
6 months ago
0
θ in Greek book font rendered as swash form
#56
nisbet-hubbard
opened
6 months ago
2
Missing GREEK LUNATE SIGMA SYMBOL in grc and script/Greek models
#55
nisbet-hubbard
opened
6 months ago
4
Slight modification in Bodhi for incorporating a few unique characters in Drenjongke
#54
bloodgroup-cplusplus
opened
8 months ago
0
Adding Additional Fonts for bodhi and dzongkha
#53
bloodgroup-cplusplus
opened
8 months ago
0
Adding additional language Denjongke (sikkimese bhutia) to tesseract language dataset
#52
bloodgroup-cplusplus
closed
8 months ago
3
Armenian letter և missing in hye language - confirmation
#51
reneclais
closed
9 months ago
1
Armenian.traineddata contains the missing character, so I suggest to try that model.
#50
reneclais
closed
9 months ago
2
Missed letter in the hye.traineddata
#49
reneclais
opened
9 months ago
3
English traineddata file does not contain the '±' character?
#48
Furtifk
opened
1 year ago
7
Bontot janda
#47
Awiemanja
closed
2 years ago
0
Add Shan language data
#46
ronaldaug
opened
2 years ago
2
Training data should include bullet-like characters
#45
wollmers
opened
2 years ago
0
Added unicharset file to Akkadian language
#44
wincentbalin
closed
2 years ago
1
Update deu.unicharset
#43
OttoKerner
closed
1 week ago
3
Missing some Thai numbers in Thai language (tha)
#42
crossknight
opened
3 years ago
0
Inherited.unicharset built by copying lines from existing unicharsets
#41
Shreeshrii
opened
3 years ago
1
how to train this files to get .traineddata
#40
josef821
closed
1 year ago
3
Update asm.wordlist
#39
hjkgithub
opened
4 years ago
3
Alternative way to download langdata_lstm master file instead from github
#38
timjin520
closed
8 months ago
11
Missing support for Coptic script
#36
stweil
opened
4 years ago
1
Update desired_characters for fin model
#35
jmokoistinen
opened
4 years ago
0
Update dan/desired_characters based on the Swedish one
#34
poizan42
closed
4 years ago
1
Add the "@" character please to the list of desired characters
#30
Furtifk
closed
4 years ago
2
Add support for Shan language (shn)
#33
ronaldaug
closed
2 years ago
8
Danish traineddata file doesn't include the "@" character
#29
Furtifk
opened
4 years ago
9
Tesseract fails to detect letters Å and å in Finnish language.
#31
jmokoistinen
opened
4 years ago
4
Trailing spaces on line 27 of eng.punc
#28
juliangilbey
opened
4 years ago
4
Please use more fonts for training Uyghur
#27
gheyret
opened
4 years ago
0
Normalize unicode in texts
#26
stweil
closed
4 years ago
0
Duplicate fonts names in okfonts
#25
amitdo
closed
4 years ago
2
Support for New Reiwa Era Character ㋿ in Japanese
#32
prateek4sep
opened
4 years ago
1
Please add description for repo - Suggested Text:
#24
Shreeshrii
opened
4 years ago
0
Partially revert commit 02cc8f028532367dd44ba5fb3cbb6ac0bf73d6ad
#23
stweil
closed
5 years ago
2
error related to script data during training
#22
Shreeshrii
closed
5 years ago
9
Add Apache license file
#21
stweil
closed
5 years ago
1
Fix langdata config for Chinese, Japanese and German
#20
stweil
closed
5 years ago
1
Move script data to new script subdirectory
#19
stweil
closed
5 years ago
2
rename kur to kur_ara
#18
Shreeshrii
closed
2 years ago
4
Apparently Lao\Lao.unicharset Has Uncommitted Changes
#17
ColdWinterWind
closed
5 years ago
1
tessedit_ocr_engine_mode 1 for san (Sanskrit language, Devanagari script)
#16
Shreeshrii
closed
5 years ago
1
tessedit_ocr_engine_mode 1 for nep (Nepali language, Devanagari script)
#15
Shreeshrii
closed
5 years ago
0
tessedit_ocr_engine_mode 1 for mar (Marathi language, Devanagari script)
#14
Shreeshrii
closed
5 years ago
0
tessedit_ocr_engine_mode 1 for hin (Hindi language, Devanagari script)
#13
Shreeshrii
closed
5 years ago
0
fix unicharset errors
#12
Timilehin
closed
5 years ago
0
update yoruba unicharset
#11
Timilehin
closed
5 years ago
0
improve yoruba training data quality
#10
Timilehin
closed
5 years ago
0
Should we update swe.training_text if new characters are added to desired_characters ?
#9
aslamy
opened
5 years ago
1