Example on how to get Unicode input chars to work?

ollex commented 3 years ago

Hi, so far I've had an awesome experience with your lib! The only downside right now is that, for the life of me, I can't get any Unicode characters to display correctly. I have tried calling lcdDrawSJISChar and lcdDrawUTF8Char programmatically, and I even tried simply adding some umlauts to your test.sh file. What is displayed is either the wrong characters (they look like Asian characters) or one or two empty (space) characters. I have also tried setting the locale, for example with export LC_ALL=C.UTF-8 in the shell script, but to no avail. What would be the correct way to display UTF-8 encoded input with your library? Thanks in advance for your pointers!

nopnop2002 commented 3 years ago

Hello.

This tool uses fontx format font files to expand font patterns into bitmaps.

To display Unicode characters, you need a font file that supports Unicode characters.

The easiest way is to rewrite the font file using the font file editor. Ex) u-->ü

If you wish, we will provide you with font file editor information.

ollex commented 3 years ago

Hi @nopnop2002, thanks for your quick reply. I'll look into creating an appropriate fontx file, then. What is the font file editor?

ollex commented 3 years ago

If you could provide the font file editor info you mention, I would really appreciate it, so I can quickly understand the mapping you're applying. I found Fontedit, which can create FMT files, but I probably need to understand the mapping you're applying in the UTF2SJIS function in your lib.

nopnop2002 commented 3 years ago

Added font information to README.

UTF2SJIS converts UTF-8 to Shift JIS.

Shift JIS is a character code used in Japan.

ollex commented 3 years ago

You're quick!

ollex commented 3 years ago

..the download doesn't work for the editor :(

nopnop2002 commented 3 years ago

http://elm-chan.org/fsw/fontxedit.zip

Doesn't it work?

ollex commented 3 years ago

It worked directly from the download page; the thing that doesn't work is the link in the README.md file.

nopnop2002 commented 3 years ago

the thing that doesn't work is the link in the README.md file

I don't know why......

ollex commented 3 years ago

No problem, I could download it directly from the website !

nopnop2002 commented 3 years ago

I don't know German.

So please tell me.

Do you use both [u] and [u with umlauts] in German?

ollex commented 3 years ago

Yes. I tried to use the standard code values (for example 0xC4, decimal 196, for Ä), but it doesn't map correctly. I changed your ...24.FMT file to contain an Ä at that position, but the mapping doesn't come out right: I only see an empty space, like before :(

ollex commented 3 years ago

We do have ä, ö, ü, also ß and so on. I was basically thinking about how to map UTF-8 (the most commonly used characters) into the FMT file. The problem I have right now is that the mapping does not work out of the box. I think my problem is the following: I am sending a UTF-8 encoded string (using JSON) over stdin to a wrapper around your lib, and then I call: lcdDrawUTF8Char(fx, xpos, ypos,(unsigned char*) (& text->valuestring[i]), c->valueint); (text->valuestring is the string I parse from stdin). It works for any normal ASCII character, but not for ö, ä and so on. I guess the UTF-8 codes for anything above 127 (196 or hex c4 for Ä, for example) somehow don't end up in the right place in the UTF2SJIS function.

ollex commented 3 years ago

Basically, what I am trying to achieve is to get from this input format (UTF-8) to an .FMT file:

Unicode code point	Character	UTF-8 (hex)	Name
U+0000		00
U+0001		01
U+0002		02
U+0003		03
U+0004		04
U+0005		05
U+0006		06
U+0007		07
U+0008		08
U+0009		09
U+000A		0a
U+000B		0b
U+000C		0c
U+000D		0d
U+000E		0e
U+000F		0f
U+0010		10
U+0011		11
U+0012		12
U+0013		13
U+0014		14
U+0015		15
U+0016		16
U+0017		17
U+0018		18
U+0019		19
U+001A		1a
U+001B		1b
U+001C		1c
U+001D		1d
U+001E		1e
U+001F		1f
U+0020		20	SPACE
U+0021	!	21	EXCLAMATION MARK
U+0022	"	22	QUOTATION MARK
U+0023	#	23	NUMBER SIGN
U+0024	$	24	DOLLAR SIGN
U+0025	%	25	PERCENT SIGN
U+0026	&	26	AMPERSAND
U+0027	'	27	APOSTROPHE
U+0028	(	28	LEFT PARENTHESIS
U+0029	)	29	RIGHT PARENTHESIS
U+002A	*	2a	ASTERISK
U+002B	+	2b	PLUS SIGN
U+002C	,	2c	COMMA
U+002D	-	2d	HYPHEN-MINUS
U+002E	.	2e	FULL STOP
U+002F	/	2f	SOLIDUS
U+0030	0	30	DIGIT ZERO
U+0031	1	31	DIGIT ONE
U+0032	2	32	DIGIT TWO
U+0033	3	33	DIGIT THREE
U+0034	4	34	DIGIT FOUR
U+0035	5	35	DIGIT FIVE
U+0036	6	36	DIGIT SIX
U+0037	7	37	DIGIT SEVEN
U+0038	8	38	DIGIT EIGHT
U+0039	9	39	DIGIT NINE
U+003A	:	3a	COLON
U+003B	;	3b	SEMICOLON
U+003C	<	3c	LESS-THAN SIGN
U+003D	=	3d	EQUALS SIGN
U+003E	>	3e	GREATER-THAN SIGN
U+003F	?	3f	QUESTION MARK
U+0040	@	40	COMMERCIAL AT
U+0041	A	41	LATIN CAPITAL LETTER A
U+0042	B	42	LATIN CAPITAL LETTER B
U+0043	C	43	LATIN CAPITAL LETTER C
U+0044	D	44	LATIN CAPITAL LETTER D
U+0045	E	45	LATIN CAPITAL LETTER E
U+0046	F	46	LATIN CAPITAL LETTER F
U+0047	G	47	LATIN CAPITAL LETTER G
U+0048	H	48	LATIN CAPITAL LETTER H
U+0049	I	49	LATIN CAPITAL LETTER I
U+004A	J	4a	LATIN CAPITAL LETTER J
U+004B	K	4b	LATIN CAPITAL LETTER K
U+004C	L	4c	LATIN CAPITAL LETTER L
U+004D	M	4d	LATIN CAPITAL LETTER M
U+004E	N	4e	LATIN CAPITAL LETTER N
U+004F	O	4f	LATIN CAPITAL LETTER O
U+0050	P	50	LATIN CAPITAL LETTER P
U+0051	Q	51	LATIN CAPITAL LETTER Q
U+0052	R	52	LATIN CAPITAL LETTER R
U+0053	S	53	LATIN CAPITAL LETTER S
U+0054	T	54	LATIN CAPITAL LETTER T
U+0055	U	55	LATIN CAPITAL LETTER U
U+0056	V	56	LATIN CAPITAL LETTER V
U+0057	W	57	LATIN CAPITAL LETTER W
U+0058	X	58	LATIN CAPITAL LETTER X
U+0059	Y	59	LATIN CAPITAL LETTER Y
U+005A	Z	5a	LATIN CAPITAL LETTER Z
U+005B	[	5b	LEFT SQUARE BRACKET
U+005C	\	5c	REVERSE SOLIDUS
U+005D	]	5d	RIGHT SQUARE BRACKET
U+005E	^	5e	CIRCUMFLEX ACCENT
U+005F	_	5f	LOW LINE
U+0060	`	60	GRAVE ACCENT
U+0061	a	61	LATIN SMALL LETTER A
U+0062	b	62	LATIN SMALL LETTER B
U+0063	c	63	LATIN SMALL LETTER C
U+0064	d	64	LATIN SMALL LETTER D
U+0065	e	65	LATIN SMALL LETTER E
U+0066	f	66	LATIN SMALL LETTER F
U+0067	g	67	LATIN SMALL LETTER G
U+0068	h	68	LATIN SMALL LETTER H
U+0069	i	69	LATIN SMALL LETTER I
U+006A	j	6a	LATIN SMALL LETTER J
U+006B	k	6b	LATIN SMALL LETTER K
U+006C	l	6c	LATIN SMALL LETTER L
U+006D	m	6d	LATIN SMALL LETTER M
U+006E	n	6e	LATIN SMALL LETTER N
U+006F	o	6f	LATIN SMALL LETTER O
U+0070	p	70	LATIN SMALL LETTER P
U+0071	q	71	LATIN SMALL LETTER Q
U+0072	r	72	LATIN SMALL LETTER R
U+0073	s	73	LATIN SMALL LETTER S
U+0074	t	74	LATIN SMALL LETTER T
U+0075	u	75	LATIN SMALL LETTER U
U+0076	v	76	LATIN SMALL LETTER V
U+0077	w	77	LATIN SMALL LETTER W
U+0078	x	78	LATIN SMALL LETTER X
U+0079	y	79	LATIN SMALL LETTER Y
U+007A	z	7a	LATIN SMALL LETTER Z
U+007B	{	7b	LEFT CURLY BRACKET
U+007C	|	7c	VERTICAL LINE
U+007D	}	7d	RIGHT CURLY BRACKET
U+007E	~	7e	TILDE
U+007F		7f
U+0080		c2 80
U+0081		c2 81
U+0082		c2 82
U+0083		c2 83
U+0084		c2 84
U+0085		c2 85
U+0086		c2 86
U+0087		c2 87
U+0088		c2 88
U+0089		c2 89
U+008A		c2 8a
U+008B		c2 8b
U+008C		c2 8c
U+008D		c2 8d
U+008E		c2 8e
U+008F		c2 8f
U+0090		c2 90
U+0091		c2 91
U+0092		c2 92
U+0093		c2 93
U+0094		c2 94
U+0095		c2 95
U+0096		c2 96
U+0097		c2 97
U+0098		c2 98
U+0099		c2 99
U+009A		c2 9a
U+009B		c2 9b
U+009C		c2 9c
U+009D		c2 9d
U+009E		c2 9e
U+009F		c2 9f
U+00A0		c2 a0	NO-BREAK SPACE
U+00A1	¡	c2 a1	INVERTED EXCLAMATION MARK
U+00A2	¢	c2 a2	CENT SIGN
U+00A3	£	c2 a3	POUND SIGN
U+00A4	¤	c2 a4	CURRENCY SIGN
U+00A5	¥	c2 a5	YEN SIGN
U+00A6	¦	c2 a6	BROKEN BAR
U+00A7	§	c2 a7	SECTION SIGN
U+00A8	¨	c2 a8	DIAERESIS
U+00A9	©	c2 a9	COPYRIGHT SIGN
U+00AA	ª	c2 aa	FEMININE ORDINAL INDICATOR
U+00AB	«	c2 ab	LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
U+00AC	¬	c2 ac	NOT SIGN
U+00AD		c2 ad	SOFT HYPHEN
U+00AE	®	c2 ae	REGISTERED SIGN
U+00AF	¯	c2 af	MACRON
U+00B0	°	c2 b0	DEGREE SIGN
U+00B1	±	c2 b1	PLUS-MINUS SIGN
U+00B2	²	c2 b2	SUPERSCRIPT TWO
U+00B3	³	c2 b3	SUPERSCRIPT THREE
U+00B4	´	c2 b4	ACUTE ACCENT
U+00B5	µ	c2 b5	MICRO SIGN
U+00B6	¶	c2 b6	PILCROW SIGN
U+00B7	·	c2 b7	MIDDLE DOT
U+00B8	¸	c2 b8	CEDILLA
U+00B9	¹	c2 b9	SUPERSCRIPT ONE
U+00BA	º	c2 ba	MASCULINE ORDINAL INDICATOR
U+00BB	»	c2 bb	RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
U+00BC	¼	c2 bc	VULGAR FRACTION ONE QUARTER
U+00BD	½	c2 bd	VULGAR FRACTION ONE HALF
U+00BE	¾	c2 be	VULGAR FRACTION THREE QUARTERS
U+00BF	¿	c2 bf	INVERTED QUESTION MARK
U+00C0	À	c3 80	LATIN CAPITAL LETTER A WITH GRAVE
U+00C1	Á	c3 81	LATIN CAPITAL LETTER A WITH ACUTE
U+00C2	Â	c3 82	LATIN CAPITAL LETTER A WITH CIRCUMFLEX
U+00C3	Ã	c3 83	LATIN CAPITAL LETTER A WITH TILDE
U+00C4	Ä	c3 84	LATIN CAPITAL LETTER A WITH DIAERESIS
U+00C5	Å	c3 85	LATIN CAPITAL LETTER A WITH RING ABOVE
U+00C6	Æ	c3 86	LATIN CAPITAL LETTER AE
U+00C7	Ç	c3 87	LATIN CAPITAL LETTER C WITH CEDILLA
U+00C8	È	c3 88	LATIN CAPITAL LETTER E WITH GRAVE
U+00C9	É	c3 89	LATIN CAPITAL LETTER E WITH ACUTE
U+00CA	Ê	c3 8a	LATIN CAPITAL LETTER E WITH CIRCUMFLEX
U+00CB	Ë	c3 8b	LATIN CAPITAL LETTER E WITH DIAERESIS
U+00CC	Ì	c3 8c	LATIN CAPITAL LETTER I WITH GRAVE
U+00CD	Í	c3 8d	LATIN CAPITAL LETTER I WITH ACUTE
U+00CE	Î	c3 8e	LATIN CAPITAL LETTER I WITH CIRCUMFLEX
U+00CF	Ï	c3 8f	LATIN CAPITAL LETTER I WITH DIAERESIS
U+00D0	Ð	c3 90	LATIN CAPITAL LETTER ETH
U+00D1	Ñ	c3 91	LATIN CAPITAL LETTER N WITH TILDE
U+00D2	Ò	c3 92	LATIN CAPITAL LETTER O WITH GRAVE
U+00D3	Ó	c3 93	LATIN CAPITAL LETTER O WITH ACUTE
U+00D4	Ô	c3 94	LATIN CAPITAL LETTER O WITH CIRCUMFLEX
U+00D5	Õ	c3 95	LATIN CAPITAL LETTER O WITH TILDE
U+00D6	Ö	c3 96	LATIN CAPITAL LETTER O WITH DIAERESIS
U+00D7	×	c3 97	MULTIPLICATION SIGN
U+00D8	Ø	c3 98	LATIN CAPITAL LETTER O WITH STROKE
U+00D9	Ù	c3 99	LATIN CAPITAL LETTER U WITH GRAVE
U+00DA	Ú	c3 9a	LATIN CAPITAL LETTER U WITH ACUTE
U+00DB	Û	c3 9b	LATIN CAPITAL LETTER U WITH CIRCUMFLEX
U+00DC	Ü	c3 9c	LATIN CAPITAL LETTER U WITH DIAERESIS
U+00DD	Ý	c3 9d	LATIN CAPITAL LETTER Y WITH ACUTE
U+00DE	Þ	c3 9e	LATIN CAPITAL LETTER THORN
U+00DF	ß	c3 9f	LATIN SMALL LETTER SHARP S
U+00E0	à	c3 a0	LATIN SMALL LETTER A WITH GRAVE
U+00E1	á	c3 a1	LATIN SMALL LETTER A WITH ACUTE
U+00E2	â	c3 a2	LATIN SMALL LETTER A WITH CIRCUMFLEX
U+00E3	ã	c3 a3	LATIN SMALL LETTER A WITH TILDE
U+00E4	ä	c3 a4	LATIN SMALL LETTER A WITH DIAERESIS
U+00E5	å	c3 a5	LATIN SMALL LETTER A WITH RING ABOVE
U+00E6	æ	c3 a6	LATIN SMALL LETTER AE
U+00E7	ç	c3 a7	LATIN SMALL LETTER C WITH CEDILLA
U+00E8	è	c3 a8	LATIN SMALL LETTER E WITH GRAVE
U+00E9	é	c3 a9	LATIN SMALL LETTER E WITH ACUTE
U+00EA	ê	c3 aa	LATIN SMALL LETTER E WITH CIRCUMFLEX
U+00EB	ë	c3 ab	LATIN SMALL LETTER E WITH DIAERESIS
U+00EC	ì	c3 ac	LATIN SMALL LETTER I WITH GRAVE
U+00ED	í	c3 ad	LATIN SMALL LETTER I WITH ACUTE
U+00EE	î	c3 ae	LATIN SMALL LETTER I WITH CIRCUMFLEX
U+00EF	ï	c3 af	LATIN SMALL LETTER I WITH DIAERESIS
U+00F0	ð	c3 b0	LATIN SMALL LETTER ETH
U+00F1	ñ	c3 b1	LATIN SMALL LETTER N WITH TILDE
U+00F2	ò	c3 b2	LATIN SMALL LETTER O WITH GRAVE
U+00F3	ó	c3 b3	LATIN SMALL LETTER O WITH ACUTE
U+00F4	ô	c3 b4	LATIN SMALL LETTER O WITH CIRCUMFLEX
U+00F5	õ	c3 b5	LATIN SMALL LETTER O WITH TILDE
U+00F6	ö	c3 b6	LATIN SMALL LETTER O WITH DIAERESIS
U+00F7	÷	c3 b7	DIVISION SIGN
U+00F8	ø	c3 b8	LATIN SMALL LETTER O WITH STROKE
U+00F9	ù	c3 b9	LATIN SMALL LETTER U WITH GRAVE
U+00FA	ú	c3 ba	LATIN SMALL LETTER U WITH ACUTE
U+00FB	û	c3 bb	LATIN SMALL LETTER U WITH CIRCUMFLEX
U+00FC	ü	c3 bc	LATIN SMALL LETTER U WITH DIAERESIS
U+00FD	ý	c3 bd	LATIN SMALL LETTER Y WITH ACUTE
U+00FE	þ	c3 be	LATIN SMALL LETTER THORN
U+00FF	ÿ	c3 bf	LATIN SMALL LETTER Y WITH DIAERESIS

ollex commented 3 years ago

Looking more closely at the UTF2SJIS function, I suppose the problem boils down to the mapping from those UTF-8 codes to SJIS. I will have to find a table that maps everything above 127 from UTF-8 to SJIS codes, so that I can find the correct slot in the .FMT file for each character. That should work. If I don't find one, I can probably construct a table with an iconv script. Will let you know how that goes!

nopnop2002 commented 3 years ago

For Japanese:

SJIS (= Shift JIS) is called ja_JP.SJIS (HP-UX) / Ja_JP (AIX) / ja_JP.PCK (Solaris). It is a 2-byte character code set. Bitmap images corresponding to each 2-byte code are stored in the SJIS font file.

int lcdDrawUTF8Char(FontxFile *fx, uint16_t x,uint16_t y,uint8_t *utf8,uint16_t color) {

Convert from UTF-8 (3 bytes) to SJIS (2 bytes). SJIS: https://en.wikipedia.org/wiki/Shift_JIS
uint16_t UTF2SJIS(uint8_t *utf8)
if((cd = iconv_open("sjis","utf-8")) == (iconv_t)-1){

Draw the SJIS (2-byte) character:
lcdDrawSJISChar(fx,x,y,sjis[0],color);

Get the font pattern from the SJIS code:
rc = GetFontx(fx, sjis, fonts, &pw, &ph); // SJIS -> Font pattern

For German:

int lcdDrawUTF8Char(FontxFile *fx, uint16_t x,uint16_t y,uint8_t *utf8,uint16_t color) {

Convert from UTF-8 (3 bytes) to de_AT.ISO8859-1(???) (2 bytes):
uint16_t UTF2xxx(uint8_t *utf8)

utf8:U+00A2 UTF2xxx:c2 a2
if((cd = iconv_open("xxx","utf-8")) == (iconv_t)-1){

Draw the de_AT.ISO8859-1(???) (2-byte) character:
lcdDrawxxxxChar(fx,x,y,sjis[0],color);

Get the font pattern from the de_AT.ISO8859-1 code:
rc = GetFontx(fx, sjis, fonts, &pw, &ph); // de_AT.ISO8859-1 -> Font pattern

nopnop2002 commented 3 years ago

If you want to draw LATIN CAPITAL LETTER A WITH GRAVE:

Convert U+00C0 to 0xc3 0x80.
Get the bitmap image for the 2 bytes (0xc3 0x80).
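
The first of these two steps can be sketched in C as follows. This is only an illustration: `codepoint_to_utf8_code16` is a hypothetical helper, not part of the library, and it assumes the font file assigns its 2-byte codes so that they equal the two UTF-8 bytes of the character, which is only possible for code points U+0080 to U+07FF.

```c
#include <stdint.h>

/* Hypothetical helper (not part of the library): pack the two UTF-8 bytes
 * of a code point in U+0080..U+07FF into one 16-bit font code.
 * Returns 0 for code points outside that range. */
uint16_t codepoint_to_utf8_code16(uint32_t cp) {
    if (cp < 0x80 || cp > 0x7FF) {
        return 0;
    }
    uint8_t lead  = (uint8_t)(0xC0 | (cp >> 6));   /* 110xxxxx */
    uint8_t trail = (uint8_t)(0x80 | (cp & 0x3F)); /* 10xxxxxx */
    return (uint16_t)((lead << 8) | trail);
}
```

For U+00C0 this yields 0xC380, the code that would then be used to look up the bitmap in a double-byte font file.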

ollex commented 3 years ago

Thanks for your efforts! I will try to get a bit further over the weekend. I hope I have understood it now, with your great explanations!

nopnop2002 commented 3 years ago

@ollex

Added Font File Viewer.

I'm looking for a bitmap font generator.

[EDIT 1] I found this: http://www.angelcode.com/products/bmfont/

This tool generates tga or dds format files, but I don't know these formats. Maybe we can convert from tga/dds to FONTX format.


ollex commented 3 years ago

You're awesome! Here is how far I have got: I can now display Ö etc. from a font I included (fcambus/spleen), which is ISO 8859-1 if I see it correctly, after adding a simple custom UTF2ISO function. That is the good part :) And since ISO 8859-1 and UTF-8 share the same codes for the main characters, there is not much complication. The weird part is that I am getting an extra space after each umlaut. I need to look into why this is the case; I probably have to look at how the position is returned from the drawChar methods.

ollex commented 3 years ago

I think your tool is great for this use case, so there is no need for another one? I took the spleen font and exported it with your tool, and this seems to work fine! There are also tools that can convert from ttf to bdf format, which your tool can import.

ollex commented 3 years ago

Hi @nopnop2002, maybe this one, fony, could work? I did not try it, but the file formats it supports sound like they could work.

ollex commented 3 years ago

Anyway, thanks again for your great support. I will figure out my remaining small glitches; they are 100% something I will find in how I am iterating over my input chars.

nopnop2002 commented 3 years ago

If you have time, please tell me how you were able to display your native character set.

Or, if I can read your code, I think I can understand your procedure.

I would like to include your procedure in my README.

PS) My native language is not English (it is Japanese). Therefore, I would be happy if I could include your document in the README as it is. I think it would be useful for many non-English-speaking people if you could publish the procedure.

Thank you.

ollex commented 3 years ago

Yes, once I have figured out why I am getting the extra space, I will post the solution! (I think I am still doing something wrong in feeding the text char by char, or in interpreting the chars, although I can get the umlauts etc. to display with an ISO .FMT font file.)

ollex commented 3 years ago

For now: to quickly try out a font that includes the needed glyphs, I have included a new font file from fcambus/spleen (imported into your tool, exported as fmt). I then created a function that just translates to ISO 8859-1 with iconv (although this might not even be needed, I need to test), takes the first byte of each letter in my string and sends it to your lcdDrawSJISChar method. That way I am getting ö, Ö, Ä etc. to display. But... with an extra space, and I still need to find out why.

nopnop2002 commented 3 years ago

When you check the font with the font viewer, you may be able to find the reason for the extra space.

$ cc -o dump dump.c fontx.c
$ ./dump fontx/your_font_file 2byte_code
PS)
$ ./dump fontx/ILGZ16XB.FNT 0x93FA

ollex commented 3 years ago

Maybe the problem lies here: I cannot find the glyph with c3 in front, only with the least significant byte?

nopnop2002 commented 3 years ago

FONTX1624.fnt is a single-byte code FONTX file. The font images are stored in the file in code order. A single-byte code FONTX file has codes from 0x00 to 0xff. http://elm-chan.org/docs/dosv/fontx_e.html

On the other hand, a double-byte code FONTX file has codes from 0x0000 to 0xFFFF.

If you want to use a double-byte code FONTX file, it needs a code block table. This is because the SJIS code set is made up of discontinuous codes. Each code block has a start code and an end code.

If all the character codes are consecutive, one code block is enough.

Double-byte code FONTX files may have a code flag of 1.

fontx/ILGZ16XB.FNT has 94 code blocks. Code 0x93FA is contained in code block #37.
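
As a rough sketch of how a lookup through such a code block table could work: walk the blocks in order, and when the code falls inside a block, the glyph position is the number of glyphs in all earlier blocks plus the offset inside the block. `CodeBlock` and `glyph_index` are hypothetical names for illustration, not the library's own types or functions, and the sketch assumes the glyphs are stored block after block in code order.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry of the code block table: an inclusive range of codes. */
typedef struct {
    uint16_t start;
    uint16_t end;
} CodeBlock;

/* Hypothetical helper: return the position of the glyph for `code`,
 * counting glyphs in file order, or -1 if no block contains the code. */
long glyph_index(const CodeBlock *blocks, size_t nblocks, uint16_t code) {
    long offset = 0; /* glyphs stored in the blocks before the current one */
    for (size_t i = 0; i < nblocks; i++) {
        if (code >= blocks[i].start && code <= blocks[i].end) {
            return offset + (code - blocks[i].start);
        }
        offset += (long)blocks[i].end - blocks[i].start + 1;
    }
    return -1;
}
```

With a single block covering all codes, the loop degenerates to one subtraction, which is why one code block is enough for consecutive codes.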

$ ./dump fontx/ILGZ16XB.FNT 0x93FA
argc=3
fontFile=[/home/nop/ILFONT03/ILGZ16XB.FNT]
Font width=16
Font height=16
Code flag=0 ---> Show inverted value. this mean code flag of file set to 1.
Number of code blocks=94
Block 0 start=8140 end=817e
Block 1 start=8180 end=81fc
Block 2 start=8240 end=827e
Block 3 start=8280 end=82fc
Block 4 start=8340 end=837e
Block 5 start=8380 end=83fc
Block 6 start=8440 end=847e
Block 7 start=8480 end=84fc
Block 8 start=8540 end=857e
Block 9 start=8580 end=85fc
Block 10 start=8640 end=867e
Block 11 start=8680 end=86fc
Block 12 start=8740 end=877e
Block 13 start=8780 end=87fc
Block 14 start=8840 end=887e
Block 15 start=8880 end=88fc
Block 16 start=8940 end=897e
Block 17 start=8980 end=89fc
Block 18 start=8a40 end=8a7e
Block 19 start=8a80 end=8afc
Block 20 start=8b40 end=8b7e
Block 21 start=8b80 end=8bfc
Block 22 start=8c40 end=8c7e
Block 23 start=8c80 end=8cfc
Block 24 start=8d40 end=8d7e
Block 25 start=8d80 end=8dfc
Block 26 start=8e40 end=8e7e
Block 27 start=8e80 end=8efc
Block 28 start=8f40 end=8f7e
Block 29 start=8f80 end=8ffc
Block 30 start=9040 end=907e
Block 31 start=9080 end=90fc
Block 32 start=9140 end=917e
Block 33 start=9180 end=91fc
Block 34 start=9240 end=927e
Block 35 start=9280 end=92fc
Block 36 start=9340 end=937e
Block 37 start=9380 end=93fc ----> 0x93FA is here.
Block 38 start=9440 end=947e
Block 39 start=9480 end=94fc
Block 40 start=9540 end=957e
Block 41 start=9580 end=95fc
Block 42 start=9640 end=967e
Block 43 start=9680 end=96fc
Block 44 start=9740 end=977e
Block 45 start=9780 end=97fc
Block 46 start=9840 end=987e
Block 47 start=9880 end=98fc
Block 48 start=9940 end=997e
Block 49 start=9980 end=99fc
Block 50 start=9a40 end=9a7e
Block 51 start=9a80 end=9afc
Block 52 start=9b40 end=9b7e
Block 53 start=9b80 end=9bfc
Block 54 start=9c40 end=9c7e
Block 55 start=9c80 end=9cfc
Block 56 start=9d40 end=9d7e
Block 57 start=9d80 end=9dfc
Block 58 start=9e40 end=9e7e
Block 59 start=9e80 end=9efc
Block 60 start=9f40 end=9f7e
Block 61 start=9f80 end=9ffc
Block 62 start=e040 end=e07e
Block 63 start=e080 end=e0fc
Block 64 start=e140 end=e17e
Block 65 start=e180 end=e1fc
Block 66 start=e240 end=e27e
Block 67 start=e280 end=e2fc
Block 68 start=e340 end=e37e
Block 69 start=e380 end=e3fc
Block 70 start=e440 end=e47e
Block 71 start=e480 end=e4fc
Block 72 start=e540 end=e57e
Block 73 start=e580 end=e5fc
Block 74 start=e640 end=e67e
Block 75 start=e680 end=e6fc
Block 76 start=e740 end=e77e
Block 77 start=e780 end=e7fc
Block 78 start=e840 end=e87e
Block 79 start=e880 end=e8fc
Block 80 start=e940 end=e97e
Block 81 start=e980 end=e9fc
Block 82 start=ea40 end=ea7e
Block 83 start=ea80 end=eafc
Block 84 start=eb40 end=eb7e
Block 85 start=eb80 end=ebfc
Block 86 start=ec40 end=ec7e
Block 87 start=ec80 end=ecfc
Block 88 start=ed40 end=ed7e
Block 89 start=ed80 end=edfc
Block 90 start=ee40 end=ee7e
Block 91 start=ee80 end=eefc
Block 92 start=ef40 end=ef7e
Block 93 start=ef80 end=effc
character code=0x93FA
code=93fa
GetFontx OK. code=93fa
00................
01................
02...**********...
03...*........*...
04...*........*...
05...*........*...
06...*........*...
07...**********...
08...*........*...
09...*........*...
10...*........*...
11...*........*...
12...**********...
13...*........*...
14................
15................


ollex commented 3 years ago

Hm - no the file comes as no ank flag / single byte code fontx file, so that's totally fine. I don't see why it would add a space though because the "linear" search should just work fine without having to look into which block to use.

nopnop2002 commented 3 years ago

The fontx font file format is a specification established around 1990.

At that time, displaying Japanese on MS-DOS was extremely difficult, and the storage available for font files was a floppy disk (1.2 MB).

ollex commented 3 years ago

I have an ugly workaround for my case to avoid the extra spaces. As I am looping over the UTF-8 encoded string byte by byte, I need to check whether the previous byte started a character that is encoded with more than 1 byte; if so, I just skip this index in the loop, because we are at the second byte. I am pretty sure I won't deliver any 3- or 4-byte codes, so for the moment this should suffice. It is ugly though; maybe I can come up with something nicer later. Note that this workaround is in no way needed because of your lib; it is only needed for my special use case of code that uses your library!

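for(i = 0; i < t->valueint; i++) {
    // Skip the second byte of a 2-byte UTF-8 sequence (the previous byte was its lead byte).
    // The cast matters: where plain char is signed, a comparison with 0xc2 would never be true.
    if(i > 0 && (unsigned char)text->valuestring[i-1] >= 0xc2) {
        continue;
    }

    // note that text->valuestring and c->valueint are here because I am using the cJSON library
    // to parse JSON-encoded input from stdin; t->valueint is the length of the string
    // ...here I am going on by calling:
    lcdDrawISOChar(fx, xpos, ypos, (unsigned char*) &text->valuestring[i], c->valueint);
}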
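
A slightly more general variant of this loop, as a rough sketch: instead of skipping indices after the fact, step over each UTF-8 sequence according to its lead byte, which also covers 3- and 4-byte sequences. `utf8_seq_len` and `utf8_count_chars` are hypothetical helper names, not part of the library or of cJSON; the place where a draw call would go is only marked with a comment.

```c
#include <stddef.h>

/* Length in bytes of the UTF-8 sequence that starts with lead byte `b`.
 * Continuation bytes and invalid lead bytes are reported as 1 so that a
 * loop using this function always advances. */
size_t utf8_seq_len(unsigned char b) {
    if (b < 0x80) return 1;            /* ASCII */
    if ((b & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((b & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((b & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 1;
}

/* Walk s[0..len) one character at a time and return how many characters
 * were visited. */
size_t utf8_count_chars(const char *s, size_t len) {
    size_t count = 0;
    size_t i = 0;
    while (i < len) {
        size_t n = utf8_seq_len((unsigned char)s[i]);
        if (i + n > len) {
            break; /* truncated sequence at the end of the buffer */
        }
        /* a draw call for the character starting at &s[i] would go here */
        count++;
        i += n;
    }
    return count;
}
```

Because the index advances by the whole sequence length, the second byte of an umlaut is never handed to the drawing code on its own.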

ollex commented 3 years ago

without spaces now :)

nopnop2002 commented 3 years ago

I have a question.

Please tell me again how you made FONTX1624.FNT. In simple English, please.

ollex commented 3 years ago

Hi,
step1) downloaded fonts from https://github.com/fcambus/spleen
step2) imported one of them into your fontxedit.exe
step3) saved as .fmt file from your fontedit.exe
step4) scp FONT1624.fnt pi@ip_address_of_raspi
step5) changed tft file => using my own one, where I import FONT1624.fnt instead of your font file

With your tool this was easy :)

nopnop2002 commented 3 years ago

You can generate a double byte code fontx file.

When used with a single-byte FONTX file, German and English can be used at the same time.

// single_byte_font_file: alphanumeric font
// double_byte_font_file: native character set font
Fontx_init(fx, single_byte_font_file, double_byte_font_file);

Now i'm testing.

$ ./dump spl12x24.fnt 0xc396
argc=3
fontFile=[spl12x24.fnt]
Font width=12
Font height=24
Code flag=0
Number of code blocks=1
Block 0 start=c300 end=c645
character code=0xc396
code=c396
GetFontx OK. code=c396
00............
01..**....**..
02..**....**..
03............
04...******...
05..**....**..
06.**......**.
07.**......**.
08.**......**.
09.**......**.
10.**......**.
11.**......**.
12.**......**.
13.**......**.
14.**......**.
15.**......**.
16.**......**.
17..**....**..
18...******...
19............
20............
21............
22............
23............

nopnop2002 commented 3 years ago

I then created a function that just translates with iconv to iso8859-1

I would like to see your customized UTF2SJIS().

ollex commented 3 years ago

It is still in a state where I left much of your initial UTF2SJIS function in there, because I wasn't sure whether I would need to interpret more than 1 byte. For the moment, I only need the first byte for what I am sending as input.

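#include <iconv.h>
#include <stdint.h>

// Convert the UTF-8 character at utf8 to ISO 8859-1 and return the first
// converted byte (0 if iconv_open fails or nothing was converted).
uint16_t UTF2ISO(uint8_t *utf8) {
    unsigned char strISO[3] = {0};
    unsigned char *pi1 = utf8;
    unsigned char **pi2 = &pi1;
    unsigned char *po1 = strISO;
    unsigned char **po2 = &po1;
    size_t ilen = 3; // read at most 3 input bytes
    size_t olen = 2; // write at most 2 output bytes
    iconv_t cd;
    uint16_t iso;

    if((cd = iconv_open("ISO_8859-1","utf-8")) == (iconv_t)-1){
        return 0;
    }

    iconv(cd,(char**)pi2,&ilen,(char**)po2,&olen);
    iconv_close(cd);
    iso = strISO[0]; // only the first converted byte is used

    return iso;
}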

ollex commented 3 years ago

It is quite probable that I don't even need the iconv call. I think the first byte should be sufficient and similar between the ISO and UTF-8 codes. It works right now because I only send character codes that fit into one byte and I am using the single-byte fontx file.
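
For characters up to U+00FF the ISO 8859-1 byte is equal to the Unicode code point, so it can be computed directly from the UTF-8 bytes without iconv. A minimal sketch under that assumption follows; `utf8_to_latin1` is a hypothetical helper name, not part of the library, and it expects a NUL-terminated string so that reading the second byte is safe.

```c
#include <stdint.h>

/* Hypothetical helper: decode the character at `utf8` to its ISO 8859-1
 * byte. Handles ASCII and the 2-byte sequences for U+0080..U+00FF;
 * returns 0 for anything else. */
uint8_t utf8_to_latin1(const uint8_t *utf8) {
    if (utf8[0] < 0x80) {
        return utf8[0]; /* ASCII is identical in both encodings */
    }
    /* Only the lead bytes 0xC2 and 0xC3 encode U+0080..U+00FF. */
    if ((utf8[0] == 0xC2 || utf8[0] == 0xC3) && (utf8[1] & 0xC0) == 0x80) {
        return (uint8_t)(((utf8[0] & 0x1F) << 6) | (utf8[1] & 0x3F));
    }
    return 0;
}
```

For Ö (c3 96) this gives 0xD6, which is the position of that glyph in a single-byte ISO 8859-1 font file.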

nopnop2002 commented 3 years ago

If the local character codes lie between 0xA0 and 0xFF, I think a 1-byte fontx file will suffice.

In Japanese, there are three types of character sets: 1. HANKAKU KATAKANA 2. ZENKAKU KATAKANA 3. KANJI

0xA0 to 0xFF are used for the Japanese HANKAKU KATAKANA characters.

Therefore, ZENKAKU KATAKANA and KANJI require a 2-byte code font.

Thank you very much.

ollex commented 3 years ago

Hi, yes, totally true: to support more characters than 0xff, no matter how the fontx file is organized, it will need 2 bytes or more.

ollex commented 3 years ago

Hi, I have a question: To display beyond 0xff I would like to try to create a 2byte code font out of the file I used from spleen. Is it possible to export in that format from your fontedit.exe ?

nopnop2002 commented 3 years ago

As mentioned earlier, a 2-byte code font file requires a Code Block.

fontedit.exe does not know the 2-byte code allocation.

fontedit.exe doesn't know how to make a Code Block.

ollex commented 3 years ago

Ok no problem, just a question, how did you "produce"

spl12x24.fnt 0xc396

that font where obviously the Ö is found by looking for the 2byte utf8 value?

nopnop2002 commented 3 years ago

I don't know how to convert a 3-byte UTF code to a 2-byte code in a language other than Japanese.

The 2-byte code(0xc396) is an appropriately assigned value.

GetFontx OK. code=c396
00............
01..**....**..
02..**....**..
03............
04...******...
05..**....**..
06.**......**.
07.**......**.
08.**......**.
09.**......**.
10.**......**.
11.**......**.
12.**......**.
13.**......**.
14.**......**.
15.**......**.
16.**......**.
17..**....**..
18...******...
19............
20............
21............
22............
23............

nopnop2002 commented 3 years ago

For Japanese, the code below returns the exact 2-byte code.

Maybe... only Japanese can convert a 3-byte UTF-8 code to a 2-byte local code.

if((cd = iconv_open("sjis","utf-8")) == (iconv_t)-1){

ollex commented 3 years ago

I've made a version that can display arbitrary glyphs with values up to 65535 from UTF-8. But it's not ready for production yet and a bit wasteful on RAM because of the font data (for ease of use I deliver the bitmap patterns in prepared char arrays... I will probably change that in the future, and once it is in a state where the code is worth showing, I will post it).

nopnop2002 / wiringpi-tft-tool

Example on how to get Unicode input chars to work? #7