tesseract-ocr/langdata
Source training data for Tesseract for lots of languages
Apache License 2.0
826 stars
886 forks
issues
add IAST version of san and hin training text
#115
Shreeshrii
closed
6 years ago
0
Update bihari wordlist
#114
Shreeshrii
closed
6 years ago
0
replace with a more representative bihari text
#113
Shreeshrii
closed
6 years ago
0
basic Maori data (wikipedia)
#112
jimregan
closed
6 years ago
0
Update Latin langdata
#111
jimregan
closed
6 years ago
0
Fix extra intra-word-spacing for korean
#110
Shreeshrii
closed
6 years ago
0
Fixes extra intra-word spacing in Chinese for 4.0
#109
Shreeshrii
closed
6 years ago
0
Fixes extra intra-word-spaces problem with 4.0
#108
Shreeshrii
closed
6 years ago
1
Addresses extra spaces problem with 4.00
#107
Shreeshrii
closed
6 years ago
0
Wordlist cleaning (lots of incomplete words found in Thai wordlist)
#106
bact
closed
6 years ago
8
Please update new 4.0 version of langdata
#105
hoangtocdo90
closed
6 years ago
1
urd.wordlist
#104
alonehoney
closed
6 years ago
0
urdu.wordlist
#103
alonehoney
opened
6 years ago
0
Update ori.numbers
#102
indiclinguist
opened
6 years ago
0
Update desired_characters
#101
indiclinguist
closed
6 years ago
0
Update ori.numbers
#100
indiclinguist
opened
6 years ago
2
How to Add or Edit [script].unicharset in langdata folder?
#99
sethleech
opened
6 years ago
3
List of fonts for training lang
#98
masztal
closed
6 years ago
3
Add Latin Extended-A script for Polynesian languages
#97
HURIMOZ
closed
2 months ago
13
Missing langdata files for div
#96
Shreeshrii
closed
5 years ago
1
Missing Thaana.unicharset
#95
Shreeshrii
closed
5 years ago
1
why eng.training_text just has 72 lines?
#94
xiaomaxiao
closed
6 years ago
4
what's the difference among chi_sim.traineddata and these chi_sim.* files?
#93
wamlvaw
closed
6 years ago
2
how to train text for multiple fonts?
#92
aijianiula0601
closed
6 years ago
2
Missing Norwegian special characters in desired_characters file
#91
Andrioden
opened
6 years ago
1
[Feature request] using Free Uyghur fonts for Uyghur language training
#90
gheyret
opened
6 years ago
1
Improve yor.traineddata for Yoruba
#89
Shreeshrii
opened
6 years ago
9
Add support for Moroccan Amazigh language (ZGH)
#88
Shreeshrii
opened
6 years ago
1
zgh_tifinagh
#87
agnagay
closed
6 years ago
3
[Feature request] font list for LSTM
#86
amitdo
closed
5 years ago
17
Yiddish
#85
amitdo
opened
6 years ago
6
Add Filipino lang
#84
JohnHenryGaspay
opened
6 years ago
7
Updated langdata
#83
ahmed-alaa
opened
6 years ago
23
Hebrew issues
#82
amitdo
opened
6 years ago
63
Add Half-width Katakana for Japanese
#81
Shreeshrii
opened
6 years ago
9
Add checkboxes to English Traineddata
#80
Shreeshrii
opened
7 years ago
1
Add 3.05 branch
#79
Shreeshrii
closed
6 years ago
6
Add grc subdirectory with component files
#78
Arithmeticus
closed
6 years ago
7
ita: Remove user words
#77
stweil
closed
7 years ago
0
Use LSTM Engine for hin, nep, mar, san
#76
Shreeshrii
closed
6 years ago
4
Need Cube Trained Tessdata for tesseract 3.04.
#75
MerlinArul
closed
7 years ago
1
Devanagari script texts in non-Hindi languages OCR better with hin.traineddata
#74
Shreeshrii
closed
6 years ago
1
tessedit_load_sublangs chi_sim_vert
#73
Shreeshrii
closed
6 years ago
1
Add Extended Arabic-Indic Digits to Persian, Urdu and Sindhi
#72
Shreeshrii
closed
6 years ago
10
Add Arabic-Indic numerals to Arabic
#71
Shreeshrii
closed
6 years ago
9
Because there are too many Chinese characters, how to quickly train the required data?
#70
zouhuigang
closed
6 years ago
1
Add vulgar fraction for 1/2
#69
Shreeshrii
opened
7 years ago
10
About Uyghur(Uighur) langdata
#68
gheyret
opened
7 years ago
2
Add support for Armenian
#67
Shreeshrii
closed
3 years ago
35
Vietnamese
#66
Shreeshrii
opened
7 years ago
3