hplgit / doconce

Lightweight markup language - document once, include anywhere
http://hplgit.github.io/doconce/doc/web/index.html

Arabic language support #154

Closed geohadab closed 6 years ago

geohadab commented 6 years ago

Hi,

I would like to thank you for maintaining this great tool. I have been using and experimenting with DocOnce for a while, and I believe it is the best documentation tool I have ever used.

I would like to contribute to this project by adding Arabic language support. However, I do not know what I need to do to make DocOnce support Arabic.

Could you please help me?

Thank you.

KGHustad commented 6 years ago

The language support is quite primitive. Adding support for a new language is done by creating a new dictionary with mappings between English and the target language. There are around 30 strings that need translation. Additionally, the locale and the LaTeX language package (used by babel) should be specified.

https://github.com/hplgit/doconce/blob/f0a317ecc87c637b410ea3a1e8196ce14f38df37/lib/doconce/doconce.py#L26-L63

Remember to use unicode literals, e.g. u'spørsmål', for all strings containing non-ASCII characters.

You could provide the translation in a reply here or in a pull request. Feel free to ask if anything is unclear.
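
As a rough illustration of the dictionary described above, here is a minimal sketch. It is not DocOnce's actual code: the helper names `translate` and `missing_keys` are hypothetical, the fallback to the English key is an assumption, and the Norwegian values are illustrative rather than copied from the repository.

```python
# Hypothetical sketch of a per-language table; not DocOnce's real implementation.
# Keys follow the ones mentioned in this thread; values are illustrative.
norwegian = {
    'locale': 'nb_NO.UTF-8',       # language_COUNTRY.encoding
    'latex package': 'norsk',      # language option passed to babel
    'Contents': u'Innhold',
    'question': u'spørsmål',       # unicode literal for non-ASCII text
}


def translate(key, table):
    """Return the translated string, or the English key if it is missing.

    Falling back to English is an assumption made for this sketch.
    """
    return table.get(key, key)


def missing_keys(required, table):
    """List the required keys that a translation table does not define yet."""
    return [key for key in required if key not in table]
```

A helper like `missing_keys` can be handy when preparing a new translation, since it shows which of the roughly 30 strings are still untranslated.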

geohadab commented 6 years ago

This is the mapping from English to Arabic. I do not know how to map 'locale'; could you please help?

Arabic = {
     'locale': '---', 
     'latex package': 'arabic', 
     'toc': u'الفَهْرس', 
     'Contents': u'المُحْتويات', 
     'Figure': u'رَسْم تَوضِيحي', 
     'Movie': u'فِيلم', 
     'list of': u'قَائِمة', 
     'and': u'و', 
     'Exercise': u'تَمْرِين', 
     'Project': u'مَشْرُوع', 
     'Problem': u'مَسْألة', 
     'Example': u'مِثَال', 
     'Projects': u'مَشَارِيع', 
     'Problems': u'مَسَائِل', 
     'Examples': u'أَمْثِلة', 
     'Preface': u'مُقَدِّمة', 
     'Abstract': u'المُلَّخص', 
     'Summary': u'الخُلاصَة', 
     # Admons 
     'summary': u'الخُلاصة', 
     'hint': u'تَلْمِيح', 
     'question': u'سُؤال', 
     'notice': u'تَنْبِيه', 
     'warning': u'تَحْذِير', 
     # box, quote are silent wrt title 
     'remarks': u'مُلاحَظات', # In exercises 
     # Exercise headings 
     'Solution': u'الحَل', 
     '__Solution.__': u'__الحَل__', 
     '__Answer.__': u'__الجَواب__', 
     '__Hint.__': u'__تَلْميح__', 
     # At the end (in Sphinx) 
     'index': u'الدَّليل', 
     # References 
     'Filename': u'اسم_الملف', 
     'Filenames': u'أسماء_الملفات', 
     }

KGHustad commented 6 years ago

locale is used for formatting the date with DATE: today. The locale code consists of language, country and encoding, so I suggest we set it to ar_SA.UTF-8 (language: Arabic, country: Saudi Arabia).

If I run the following test program, I get (reversed when copying)

الإثنين, 09. أبريل, 2018

which seems reasonable (according to Google translate).

import locale, time

locale.setlocale(locale.LC_ALL, 'ar_SA.UTF-8')
print(time.strftime('%A, %d. %b, %Y'))
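
The test program above assumes that the ar_SA.UTF-8 locale is installed on the machine. If it is not, `locale.setlocale` raises `locale.Error`. The following is a minimal sketch of a guarded variant; the function name `format_today` is hypothetical, and it sets only `LC_TIME` (rather than `LC_ALL` as above) because that is the category `time.strftime` uses for day and month names.

```python
import locale
import time


def format_today(locale_name='ar_SA.UTF-8', fmt='%A, %d. %b, %Y'):
    """Return (date_string, applied) for today's date.

    applied is False when the requested locale is not available, in which
    case the date is formatted with whatever locale was already active.
    """
    previous = locale.setlocale(locale.LC_TIME)  # query the current setting
    try:
        locale.setlocale(locale.LC_TIME, locale_name)
        applied = True
    except locale.Error:
        # The locale is not installed on this system.
        applied = False
    try:
        return time.strftime(fmt), applied
    finally:
        # Restore the earlier setting so other code is not affected.
        locale.setlocale(locale.LC_TIME, previous)
```

Calling `format_today()` and checking the second return value tells you whether the Arabic locale was actually used or whether the date fell back to the default formatting.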

I'm not sure how well LaTeX works with right-to-left languages like Arabic (babel might do some magic), but other formats will treat it as a left-to-right language with left-adjustment.

I will add Arabic support soon, so that you can try it out.

geohadab commented 6 years ago

Thank you, it looks good. I hope this works with no further modifications.

I am not familiar with how LaTeX works, but according to https://www.sharelatex.com/learn/Arabic some packages need to be included to support right-to-left text.

KGHustad commented 6 years ago

Just pushed an update now. If you try installing DocOnce from the git repo with pip, you should be able to test it.

We have a test file which uses all constructs requiring translation: https://github.com/hplgit/doconce/blob/master/doc/src/locale/English.do.txt

You can test it with

doconce format html English.do.txt --html_style=bootstrap_bluegray --encoding=utf-8 --language=Arabic

Note that most of the document (i.e. everything which is explicitly written) will still be in English.

I suspect the right-to-left packages we would need to include would create problems with other packages and require more substantial changes to how we generate LaTeX.

geohadab commented 6 years ago

Arabic.do.txt

I created a new file named Arabic.do.txt (attached). I tried to generate the document on Windows using WSL (https://docs.microsoft.com/en-us/windows/wsl/install-win10), but I get an error message which tells me that Arabic is not added.

I do not know what the problem is. It could be WSL. Unfortunately, I only have my Windows machine right now; I will try it on Linux or macOS as soon as my MacBook is fixed.

KGHustad commented 6 years ago

You would need to do pip install --upgrade . from the root of the git repo to get the latest version of DocOnce with Arabic support (if you installed DocOnce via conda, then it's best to do this in a separate conda environment), since these changes are only available in the master branch at the moment.

I managed to get Arabic.do.txt to work with some minor adjustments. Thanks for providing the translation!

geohadab commented 6 years ago

I tried and I still get an error. Could you please show me the result?

KGHustad commented 6 years ago

Yes, I just pushed the generated HTML. (I found some bugs while generating the LaTeX and Sphinx documents, so they're not up yet.)

geohadab commented 6 years ago

It looks great, but there are two things which need to be fixed: 1) the text must be on the right side, and 2) sentences mixing Arabic and English are confusing.

How can these problems be solved?

KGHustad commented 6 years ago

The English words in the HTML are all from the source document (except box in the header for bbox).

Right-adjusted text is a bit tricky, since the templates we use are all made for left-to-right languages. Full Arabic support for all formats would require a tremendous amount of work, so I will not pursue this any further at this time.
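
For the HTML output only, one possible workaround is to post-process the generated file and set the standard HTML `dir="rtl"` and `lang` attributes on the opening `<html>` tag. The sketch below is not a DocOnce feature; the function name `add_rtl_direction` is hypothetical, and templates that hard-code left-oriented layout may still need manual CSS changes.

```python
import re


def add_rtl_direction(html, lang='ar'):
    """Return html with dir="rtl" and the given lang on the <html> tag.

    Existing dir/lang attributes on that tag are replaced. Input without
    an <html> tag is returned unchanged.
    """
    def patch(match):
        attrs = match.group(1)
        # Drop any dir= or lang= attribute that is already present.
        attrs = re.sub(
            r'\s+(?:dir|lang)\s*=\s*("[^"]*"|\'[^\']*\'|[^\s>]+)', '', attrs)
        return '<html%s dir="rtl" lang="%s">' % (attrs, lang)

    # Only the first <html ...> tag is rewritten.
    return re.sub(r'<html\b([^>]*)>', patch, html,
                  count=1, flags=re.IGNORECASE)
```

With `dir="rtl"` set, browsers lay out block text from the right and apply the Unicode bidirectional algorithm with a right-to-left base direction, which may also make mixed Arabic and English sentences read more naturally.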