ruediger / VobSub2SRT

Converts VobSub subtitles (.idx/.sub format) into .srt subtitles.

GNU General Public License v3.0

294 stars, 67 forks

language support weird #25

Closed nullren closed 11 years ago

nullren commented 11 years ago

In my subtitles file, listing the languages gives "zh", so when I select "zh" with --lang, vobsub2srt looks for zho.traineddata in tesseract. It should be looking for chi_*, or perhaps I should be able to specify the tesseract language directly.

ruediger commented 11 years ago

That's the issue with ISO 639-2/3. chi and zho are both ISO 639-2 identifiers for Chinese (the bibliographic and the terminological code, respectively). I don't know why tesseract doesn't use ISO 639-1 (zh) instead, which is simpler and more commonly used.
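
To make the mismatch concrete, here is a minimal Python sketch of the kind of lookup that would be needed to go from a two-letter subtitle language code to a Tesseract language name. The table and the function name `tesseract_candidates` are hypothetical and purely illustrative; they are not taken from VobSub2SRT's source, and only a handful of languages are listed.

```python
# Illustrative table only: maps ISO 639-1 codes (as found in .idx files)
# to Tesseract traineddata names. Not taken from VobSub2SRT's source.
TESSERACT_NAMES = {
    "en": ["eng"],
    "de": ["deu"],
    "fr": ["fra"],
    # Chinese has no single name: Tesseract ships separate models
    # for Simplified and Traditional script.
    "zh": ["chi_sim", "chi_tra"],
}


def tesseract_candidates(iso639_1):
    """Return the Tesseract names for a two-letter code, or [] if unknown."""
    return TESSERACT_NAMES.get(iso639_1.lower(), [])
```

For "zh" this yields two candidates rather than one, so no automatic mapping can pick the right model without knowing the script.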

Anyway I added an option --tesseract-lang which you can use to specify the tesseract language to use.
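
As a rough illustration of how the two options combine, the Python sketch below assembles a command line that selects the "zh" stream with --lang and overrides the OCR language with --tesseract-lang. The helper `build_command` and the file basename `movie` are hypothetical, and the sketch assumes that vobsub2srt takes the subtitle basename (without extension) as its positional argument and that a `chi_sim` traineddata file is installed.

```python
import shlex


def build_command(basename, lang, tesseract_lang=None):
    """Assemble a vobsub2srt argument list (hypothetical helper)."""
    argv = ["vobsub2srt", "--lang", lang]
    if tesseract_lang:
        # Override the name vobsub2srt would otherwise derive from --lang.
        argv += ["--tesseract-lang", tesseract_lang]
    # Assumption: the basename of the .idx/.sub pair is the positional argument.
    argv.append(basename)
    return argv


# "movie" is a placeholder basename, not a file from the issue.
print(shlex.join(build_command("movie", "zh", "chi_sim")))
```

The resulting list can be passed to `subprocess.run` as-is; use `chi_tra` instead of `chi_sim` for Traditional Chinese subtitles.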

8d34a407b326