Closed by faridoon 11 years ago
Hebrew is the other big RTL language, right?
Wikipedia's RTL article currently lists these scripts as RTL:
Arabic, Avestan, Cypriot, Hebrew, Imperial Aramaic, Kharosthi, Lydian, Mandaic, N'Ko, Old South Arabian, Old Turkic, Pahlavi, Phoenician, Samaritan, Syriac, Thaana, Umbrian
cue.lang supports these languages:
Arabic, Catalan, Croatian, Czech, Dutch, Danish, English, Esperanto, Farsi, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian, Italian, Latin, Norwegian, Polish, Portuguese, Romanian, Russian, Slovenian, Slovak, Spanish, Swedish, Turkish
The only overlap is Arabic & Hebrew. I think I could put in a check to see if cue detects either of those, and render RTL. Does that sound about right?
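A minimal sketch of the check proposed above, assuming the detected language is available as a plain name such as "Arabic". The class `RtlLanguageCheck` and the method `isRightToLeft` are hypothetical names for illustration, not part of WordCram or cue.lang. Farsi is included alongside Arabic and Hebrew because it is written in the Arabic script, which goes beyond the two languages named in the comment.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: decides whether a detected language should render RTL.
final class RtlLanguageCheck {
    // Assumption: language names match the spellings in the cue.lang list above.
    private static final Set<String> RTL_LANGUAGES =
            new HashSet<>(Arrays.asList("arabic", "hebrew", "farsi"));

    // Returns true when the given language name is one of the known RTL languages.
    static boolean isRightToLeft(String languageName) {
        if (languageName == null) {
            return false;
        }
        String normalized = languageName.trim().toLowerCase(Locale.ROOT);
        return RTL_LANGUAGES.contains(normalized);
    }

    public static void main(String[] args) {
        System.out.println(isRightToLeft("Arabic"));
        System.out.println(isRightToLeft("English"));
    }
}
```

How the language name is obtained from cue.lang is left out here on purpose; only the lookup itself is shown.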
Oh, here's the cue.lang readme: https://github.com/vcl/cue.language
In one of the WordCram examples (fromWebPage), I tested the Arabic Wikipedia's homepage. Unfortunately, the words are still printed from left to right. Below is a screenshot:
And the same problem persists with Farsi language:
@faridoon, take a look at these PNGs I generated - the first is without the fix, the second is with. Let me know if everything looks good, and I'll merge this into rel060, so it'll be in the next release.
(Actually, I just noticed the LTR words in there are backwards, hah! like "hsilgnE". I think that's ok.)
Buggy:
Fixed:
Thank you Dan. It's great. Isn't it possible to render RTL and LTR scripts separately in the sketch? I think one can ignore the LTR scripts inside a big wordcram of RTL language (e.g. Arabic or Farsi). But if there are two languages of equal weight in the wordcram, they wouldn't look good.
When can we expect the next release?
Awesome! I'll merge this into the 060 release branch now.
WordCram treats the text as one whole body of text, and uses cue.lang to guess which language it's in (the method is literally named guess). It's probabilistic, and works better on larger sets, so, on individual words, it'd probably guess wrong much of the time.
(But you've got me thinking now - maybe RTL can be a property of the word, so you can set them individually. I really didn't like the way I fixed this RTL bug - this suggests a cleaner path, that gives users more control. Great suggestion!)
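One way to picture direction as a per-word property, sketched under assumptions: instead of guessing the language, look at the Unicode directionality of the word's first strongly directional character. This is a different technique from cue.lang's language guessing, and the class `WordDirection` and its method are hypothetical, not WordCram API. `Character.getDirectionality` is part of the standard Java library.

```java
// Hypothetical helper: classifies a single word as RTL or LTR from its characters.
final class WordDirection {
    // Returns true if the first strongly directional character is right-to-left.
    // Words with no strong character (digits, punctuation) default to LTR.
    static boolean isRightToLeft(String word) {
        int i = 0;
        while (i < word.length()) {
            int codePoint = word.codePointAt(i);
            byte direction = Character.getDirectionality(codePoint);
            if (direction == Character.DIRECTIONALITY_RIGHT_TO_LEFT
                    || direction == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC) {
                return true;
            }
            if (direction == Character.DIRECTIONALITY_LEFT_TO_RIGHT) {
                return false;
            }
            // Neutral or weak character: keep scanning.
            i += Character.charCount(codePoint);
        }
        return false;
    }

    public static void main(String[] args) {
        // Escapes keep the source ASCII-only: an Arabic word and a Hebrew word.
        System.out.println(isRightToLeft("\u0639\u0631\u0628\u064A"));
        System.out.println(isRightToLeft("\u05E9\u05DC\u05D5\u05DD"));
        System.out.println(isRightToLeft("English"));
    }
}
```

Because the check runs on each word separately, an English word inside an Arabic word cloud would be classified as LTR and would not need to be drawn backwards.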
We're working on the next release now, and I'd love to have it done this summer. You can track our progress on this pull request: https://github.com/danbernier/WordCram/pull/38
WordCram works well with many languages, but it displays Arabic from left to right, whereas it should be displayed from right to left. I hope someone looks into this issue.