olikraus / u8g2

U8glib library for monochrome displays, version 2

@olikraus / How to customize my own set of Chinese characters to show in LCD? #510

Closed mianqi2016 closed 6 years ago

mianqi2016 commented 6 years ago

Hi, sir: I am a Chinese user, and I use your U8g2 library in my UNO project. My design is: edit characters on an Android phone, send them to the UNO via Bluetooth, then show them on an LCD12864, so that I can update the LCD12864 content at any time.

Now, the problem is: numbers and letters display fine, but Chinese characters do not. The output on the LCD12864 (controller: ST7920) is irregular and puzzling, and some characters seem to be missing. I guess this is because I do not fully understand your library's internal mechanism. For example, when I try to display the line of poetry "白日依山尽", only "日山" is shown, that is, the second and fourth characters of the line.

Could you explain this and give a solution?

#include <Arduino.h>
#include <U8g2lib.h>

U8G2_ST7920_128X64_F_SW_SPI u8g2(U8G2_R0, /* clock=*/ 13, /* data=*/ 11, /* CS=*/ 10, /* reset=*/ 8);

void setup() {
  Serial.begin(9600);
  u8g2.begin();
  u8g2.enableUTF8Print();
}

void loop() {
  u8g2.setFont(u8g2_font_unifont_t_chinese2);  // use chinese2 for all the glyphs of "你好世界"
  u8g2.clearBuffer();
  u8g2.drawUTF8(0, 30, "白日依山尽,");  // Chinese "Hello World"
  u8g2.drawUTF8(0, 60, "黄河入海流。");
  //u8g2.drawUTF8(0, 60, "欲穷千里目,");
  //u8g2.drawUTF8(0, 90, "更上一层楼。");
  u8g2.sendBuffer();
  delay(1000);
}

mianqi2016 commented 6 years ago

mail to U8g2 library writer

Since I couldn't find the writer's contacts, I wrote the mail here.

olikraus commented 6 years ago

Due to the huge number of glyphs in the Chinese language, a u8g2 font will not include all glyphs. Instead, only a subset of the characters is included in each font. The full set of all Chinese chars will be included in the next release of u8g2. A beta release can be downloaded here: https://github.com/olikraus/U8g2_Arduino/archive/master.zip

This zip file can be installed as a library via the "Add .ZIP Library" menu in the Arduino IDE.

The new version of u8g2 will include all glyphs from the GB2312 standard: https://github.com/olikraus/u8g2/wiki/fntgrpwqy

So for example, as a font you have to use u8g2_font_wqy12_t_gb2312b.

The u8g2 beta also includes a new example to demonstrate the usage of the fonts: https://github.com/olikraus/u8g2/blob/master/sys/arduino/u8g2_full_buffer/Shennong/Shennong.ino

Please note that the font u8g2_font_wqy12_t_gb2312b is very big and the resulting code will not fit into the Uno. Best is to use a MKRZERO, Arduino Zero or Arduino Due. Also ESP32 or STM32 based boards will work. You just need sufficient flash ROM.

"Since I couldn't find the writer's contacts, I wrote the mail here."

That is good. Always put questions related to u8g2 either here or in the Arduino Forum (Display subsection).
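
A minimal sketch of why only "日山" appeared, assuming the font in use holds glyphs for 日 and 山 but not for the other three characters. That assumption matches the output reported above; the real glyph list of u8g2_font_unifont_t_chinese2 is not reproduced here. This is plain host-side C++, not Arduino code, and decodeUtf8 and missingGlyphs are hypothetical helper names that are not part of u8g2.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Decode a UTF-8 byte string into Unicode code points.
// Minimal decoder: malformed bytes are skipped, overlong forms are not rejected.
std::vector<uint32_t> decodeUtf8(const std::string& text) {
    std::vector<uint32_t> out;
    for (std::size_t i = 0; i < text.size();) {
        unsigned char b = static_cast<unsigned char>(text[i]);
        int extra;      // number of continuation bytes that must follow
        uint32_t cp;    // code point being assembled
        if (b < 0x80)                { extra = 0; cp = b; }
        else if ((b & 0xE0) == 0xC0) { extra = 1; cp = b & 0x1F; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; }
        else { ++i; continue; }                  // stray continuation byte
        if (i + extra >= text.size()) break;     // truncated sequence at the end
        bool ok = true;
        for (int k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(text[i + k]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (c & 0x3F);
        }
        if (ok) { out.push_back(cp); i += extra + 1; } else { ++i; }
    }
    return out;
}

// Return, in text order, the code points that the font subset cannot draw.
std::vector<uint32_t> missingGlyphs(const std::string& text,
                                    const std::set<uint32_t>& fontGlyphs) {
    std::vector<uint32_t> missing;
    for (uint32_t cp : decodeUtf8(text)) {
        if (fontGlyphs.count(cp) == 0) missing.push_back(cp);
    }
    return missing;
}
```

Called with the UTF-8 bytes of "白日依山尽" and a set holding U+65E5 (日) and U+5C71 (山), missingGlyphs returns U+767D (白), U+4F9D (依) and U+5C3D (尽), the three characters that were not drawn.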

mianqi2016 commented 6 years ago

Hi, olikraus: I've tested a lot and made some progress, but I still have many questions.

Firstly, I found that:
u8g2_font_unifont_t_chinese1 + u8g2_font_unifont_t_chinese2 + u8g2_font_unifont_t_chinese3 covers most commonly-used Chinese characters, but u8g2_font_unifont_t_chinese3 is not supported in U8g2-2.19.8.

So how can I add it to the library, since it is already defined in u8g2_fonts.c?

When I use the chinese3 font in the *.ino file, this error occurs:

C:\Users\hp\AppData\Local\Temp\cc60PDff.ltrans1.ltrans.o: In function `u8g2_font_get_word':
C:\Users\hp\Documents\Arduino\libraries\U8g2-2.19.8\src\clib/u8g2_font.c:121: undefined reference to `u8g2_font_unifont_t_chinese3'
C:\Users\hp\Documents\Arduino\libraries\U8g2-2.19.8\src\clib/u8g2_font.c:121: undefined reference to `u8g2_font_unifont_t_chinese3'

I'm using an Arduino UNO, so the latest U8g2_Arduino-master library, which requires more storage, cannot be used on it. I installed U8g2-2.19.8, which supports the chinese1 and chinese2 fonts but not chinese3. If chinese3 were included, my project could be finished.

I ran tests with the shennong.ino file from the U8g2_Arduino-master library. Opened with U8g2-2.19.8, the characters in shennong.ino display very well on the LCD, except those not included in chinese2 itself. To do this, I made these changes:
1. Deleted half of the contents of const char c_str[].
2. Set glyph_height = 20; // glyph_height = u8g2.getMaxCharHeight();
3. Changed #define FONT u8g2_font_wqy14_t_gb2312b to u8g2_font_unifont_t_chinese2.

olikraus commented 6 years ago

I am a little bit surprised. shennong.ino was only added in release 2.21: https://www.arduinolibraries.info/libraries/u8g2 Maybe this is why you observe some difficulties: your system seems to have two versions of u8g2 installed in parallel, which confuses the compilation process. You should search for and delete redundant and old versions of the library on your hard disk.

All fonts are located here: https://github.com/olikraus/u8g2/blob/master/csrc/u8g2_fonts.c You should be able to search for _chinese3 in the raw view of that file.

Also note that I implemented the suggestion from the Chinese community to add WenQuanYi bitmap fonts: https://github.com/olikraus/u8g2/wiki/fntgrpwqy

These fonts are also available as _gb2312 sets, so that you have almost all of your glyphs available. The WQY fonts are also available in different font sizes, which might be another advantage over unifont. But of course I do not believe that all this font data fits into an Uno controller. I suggest switching to the Arduino Zero line of processors.

"3.change: #define FONT u8g2_font_wqy14_t_gb2312b into: u8g2_font_unifont_t_chinese2."

No doubt, shennong.ino is a stress test for the Arduino. chinese2 is far from including all the glyphs for this nice story.

BTW: If you encounter any Chinese-related spelling issues, please let me know. I would call my knowledge of the Chinese language "absolute zero". I do not even have a feeling for whether the scrolling speed is slow enough for reading.

mianqi2016 commented 6 years ago

Hi, olikraus:

http://www.china-language.gov.cn/ - This is the official website of the China National Language & Character Working Committee, the authoritative agency for the standard usage of characters in the P.R.C.

You can download the file "General Standard Chinese Characters Table" from this link; it's too big to upload into this issue:

http://www.china-language.gov.cn/fw/zwxxhpt/201611/P020161115557688945267.zip

The document contains 8105 characters in total, the most commonly used ones, ordered by usage frequency. Among them, 3500 are the basic ones.

To support you, I have attached a brief introduction to this table.


mianqi2016 commented 6 years ago

Introduction to General Standard Chinese Characters Table.txt

olikraus commented 6 years ago

Actually my work is based on this GitHub project: https://github.com/larryli/u8g2_wqy I assume that person has a good Chinese background, so I hope that the character table is correct.

mianqi2016 commented 6 years ago

That's a lot of work, a big effort.

I have one more question. When I tried to create font code for individual Chinese characters to add to the *.ino file, I always ran into problems.

I've read these two threads: https://github.com/olikraus/u8g2/issues/126 https://github.com/olikraus/u8g2/issues/105

I noticed this reference: "add extended fonts, -m '32-701' ("e" range) #284" (referenced by olikraus on 2 Jun 2017, closed).

So how do I set the range for -m here?

Thanks.

olikraus commented 6 years ago

If I understand correctly, you want to have a font, with exactly a specific set of characters, correct?

OK, then let me explain -m a little bit. -m takes an argument which is itself a little programming language. The expression passed to -m is a comma-separated list of ranges or individual glyphs that are to be added to the font. For example:

-m '32' adds only the space character.
-m '32-128' adds the ASCII page (from 32 to 128) to the font.

You can also use hex codes if they are preceded by $. So to make a font that includes only a space, use:

-m '$20'

Maybe you know that the capital A has the Unicode value 65 and B has 66. Adding both chars as a range is:

-m '65-66'

As I mentioned earlier, you can also separate multiple Unicode glyphs with commas, so adding A and B to the font is also:

-m '65,66'

Let me make a more complete example. Assume you want to write "moon" on your display and you only need the Chinese glyphs for this (and nothing else). If Google Translate is correct, then moon is: 蒙德

Now go to an online converter to get the Unicode values for these two glyphs (e.g. this one: https://r12a.github.io/apps/conversion/). Enter 蒙德 and you will get the hex numbers 8499 5FB7 somewhere further down on the converter page. Now you have to create a font with these two chars:

-m '$8499, $5fb7'

This will generate a font with exactly these two glyphs. Maybe you want chars from the ASCII page also; then use:

-m '32-128,$8499, $5fb7'

Maybe later you want to add "midnight", which could be 午夜. Again, the hex codes are: 5348 591C. So, to write moon and midnight in English and Chinese, you need a font built with this -m expression:

-m '32-128,$8499, $5fb7, $5348,$591C'

So the -m expression can be as long as you need. At some point you may find that the command line gets too long. You can also store the -m expression in a file, which might be more convenient for editing. For this, put 32-128,$8499, $5fb7, $5348,$591C into a file (say myglyphs.txt) and call bdfconv with an uppercase -M:

-M myglyphs.txt

For the generation of the u8g2 fonts I have also used the -M approach, and you can find the files online. Here is the file for the gb2312 font: https://github.com/olikraus/u8g2/blob/master/tools/font/build/gb2312.map You can download gb2312.map and regenerate the font with -M gb2312.map. You could also remove the glyphs you do not need from gb2312.map and generate a smaller version of the font.
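
The -m expression described above can also be generated from the text that is to be displayed. What follows is a host-side C++ sketch, assuming UTF-8 input. decodeUtf8 and buildMapExpression are hypothetical helper names, not part of bdfconv or u8g2, and the output only follows the comma-separated $hex syntax explained above.

```cpp
#include <cstdint>
#include <cstdio>
#include <set>
#include <string>
#include <vector>

// Decode a UTF-8 byte string into Unicode code points.
// Minimal decoder: malformed bytes are skipped, overlong forms are not rejected.
std::vector<uint32_t> decodeUtf8(const std::string& text) {
    std::vector<uint32_t> out;
    for (std::size_t i = 0; i < text.size();) {
        unsigned char b = static_cast<unsigned char>(text[i]);
        int extra;      // number of continuation bytes that must follow
        uint32_t cp;    // code point being assembled
        if (b < 0x80)                { extra = 0; cp = b; }
        else if ((b & 0xE0) == 0xC0) { extra = 1; cp = b & 0x1F; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; }
        else { ++i; continue; }                  // stray continuation byte
        if (i + extra >= text.size()) break;     // truncated sequence at the end
        bool ok = true;
        for (int k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(text[i + k]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (c & 0x3F);
        }
        if (ok) { out.push_back(cp); i += extra + 1; } else { ++i; }
    }
    return out;
}

// Build a map expression such as "32-128,$5348,$591C" from UTF-8 text.
// Each non-ASCII code point is listed once, in ascending order; ASCII is
// left to the optional prefix (for example "32-128").
std::string buildMapExpression(const std::string& text,
                               const std::string& prefix = "") {
    std::set<uint32_t> unique;   // std::set removes duplicates and keeps order
    for (uint32_t cp : decodeUtf8(text)) {
        if (cp >= 0x80) unique.insert(cp);
    }
    std::string expr = prefix;
    for (uint32_t cp : unique) {
        char buf[16];
        std::snprintf(buf, sizeof buf, "$%04X", static_cast<unsigned>(cp));
        if (!expr.empty()) expr += ',';
        expr += buf;
    }
    return expr;
}
```

For the UTF-8 text "午夜" with the prefix "32-128", this yields 32-128,$5348,$591C, which can be passed to -m or saved to a file for -M.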

Does this answer your question?

Update: The correct translation for "moon" should be "月亮". I leave the above text unchanged; please excuse the mistake.

mianqi2016 commented 6 years ago

That's exactly what I mean! Thank you so much! Your explanation is so clear! This issue really reaches the point.

I'm using an Arduino UNO, and I can't easily get a MKRZERO, Arduino Zero or Arduino Due through online shops here. So I want to reduce the font file to only 200 characters, which is enough for my project.

I've tried the steps described in the two threads mentioned above to create such a file, but failed. The output code for several characters did not even look correct compared to the entries in u8g2_font_unifont_t_chinese2, and it did not work when added to the *.ino file. Now I know there was something wrong in my procedure; I'll retry later.

I found this article: http://blog.csdn.net/menghuanbeike/article/details/75666266, which is very helpful. It describes the steps for customizing the font file. I followed them and still had some problems, but with the help of your detailed explanation there is hope of solving them.

Besides, the Google translation for "moon" is wrong: "蒙德" means nothing in Chinese. Maybe you typed in the wrong word, since "moon" is a commonly used one. It is "月亮" in Chinese, a very beautiful and elegant word if you can read Chinese characters. The characters themselves are expressive.

The Google translation of "midnight" as 午夜 is right.

I'll keep trying and report the result here; there should be a solution.

olikraus commented 6 years ago

Thanks for your valuable remarks. I fixed several spelling errors in my own reply to make it (hopefully) even clearer. Please excuse my lack of knowledge of the Chinese language. I am getting more and more interested in your language through my work on this project, but I am far away from understanding your glyphs.

mianqi2016 commented 6 years ago

It is done and works perfectly, after overcoming the final difficulties.

First, I ran the command:

bdfconv.exe -v -f 1 -M C:\Users\hp\Desktop\temp\1.map -n myfont_poem_1 -o C:\Users\hp\Desktop\temp\myfont_poem_1.c

Nothing happened; this command names no input BDF file.

Then I returned to the article mentioned above, read it carefully again, copied the required files and put them in the proper place (I had downloaded bdfconv.exe separately), and changed the command to this:

bdfconv.exe -v bdf/unifont.bdf -f 1 -M C:\Users\hp\Desktop\temp\3.map -d bdf/7x13.bdf -n myfont_poem_3 -o C:\Users\hp\Desktop\temp\myfont_poem_3.c

I pressed Enter, and immediately data appeared in the terminal, scrolling upward in an orderly sequence. I knew it worked. It really did.

It's so easy and clear looking back, but at the beginning it was like a high mountain standing ahead: unknown, with lots of clues and forks, and the target somewhere far away and out of sight...

I've tested the official chinese2.map and compared the output C file to the corresponding data in u8g2_fonts.c; they are the same. Next, I compiled a map file of the characters I need, converted it, pasted the result into the *.ino file and ran it. Everything is OK, just as expected.

It's such excellent work! It functions like the old typography, which was still in use here in the 1980s. Professional workers selected the required lead movable types and arranged them in order to print an article; afterwards, the types were collected and rearranged to print a new one. There are about 2000+ most commonly used Chinese characters, and their various combinations can express all ideas.

I've re-edited the initial content of this issue to make it more understandable. It was published by a teammate; he thought we should seek help here directly, and he was right. I also changed the title of this issue to make it easier to recognize, so that others puzzled by this question, like me, can benefit from it.

I did try shennong.ino with U8g2-2.19.8, with the small code changes I described. The scroll speed is good, only a little blurry. I'll test it on another computer to see whether it really runs with U8g2-2.19.8.

Serve both of us a cup of tea.

olikraus commented 6 years ago

Thanks for your input. Hopefully others will also find this thread if they face the same problem.

Instead of unifont.bdf, I just wanted to mention, that there are also some other Chinese fonts, which are now available in u8g2 and which could be used for your bdf-conversion: wenquanyi_9pt.bdf, wenquanyi_10pt.bdf, wenquanyi_11pt.bdf, wenquanyi_12pt.bdf, wenquanyi_13px.bdf (download from here: https://github.com/olikraus/u8g2/tree/master/tools/font/bdf).

I have updated these bdf files, so that they work correctly with bdfconv.exe.

"Serve both of us a cup of tea."

I will do this and think of Shen Nong. :-)

mianqi2016 commented 6 years ago

I've re-edited the initial post to make it more focused on its target: how to display Chinese characters as desired on an LCD. It is normally improper to re-edit the original text, since it records where we came from, but in this case I think it is acceptable in order to make things clearer.

I'll try these fonts. Actually, the word "font" is confusing in this issue. It describes different styles, or variations in the appearance, of writing symbols, right? That's my understanding. In languages written with the Latin alphabet, such as English or German, which have a limited alphabet, "font" can define the look of the final output. But there are about 2000+ most commonly used Chinese characters, so one important consideration is AMOUNT, that is, how many characters can be displayed effectively and efficiently, which has been resolved here.

Another word, "subset", which you mentioned in one thread, describes this problem more accurately. Generally, an LCD is used to show information on a specific area or topic to a specific audience, so the actual number of characters used in one case can be very limited, maybe a hundred. For example, to show temperature and humidity, only a few dozen characters are needed. Therefore, customizing a "subset" is a very practical approach.

Thanks for your effort, and thanks to all the others who contributed.

Have a nice day.