lvgl / lv_font_conv

Converts TTF/WOFF fonts to compact bitmap format
https://lvgl.io/tools/fontconverter
MIT License
178 stars 77 forks source link

lvgl writer issues #10

Closed kisvegabor closed 5 years ago

kisvegabor commented 5 years ago
  1. The lines of uncompressed bitmaps should be byte aligned
  2. Minor typos in the generated file. See this
  3. Kern pair descriptor is updated to accept pairs with 1 and 2 byte size
  4. It seems the bitmaps are stored from bottom to top (should be top->bottom):

Explanations for bitmap direction: '!' is (here):

0xd0, 0xff, 0xfc,

In binary:

1101 0000
1111 1111
1111 1100

As '!' has box_w = 2 (here) I suppose it is:

XX
 X

XX
XX
XX
XX
XX
XX
XX

So it seems upside down.

puzrin commented 5 years ago

Ok.

You are right, images seems reversed (not as i expected too). Seems that should not prevent from writing/checking code (a bit strange, but not fatal). Let me know when you expect no more changes in this issue, i will fix all at once.

puzrin commented 5 years ago

The lines of uncompressed bitmaps should be byte aligned

Should they have updated bitmap width accordingly or only aligning injections on each line end?

kisvegabor commented 5 years ago

Should they have updated bitmap width accordingly or only aligning injections on each line end?

If bitmap width means box_w then it should be the real width of the character (e.g. 2 for '!')


A minor update: https://github.com/littlevgl/lvgl/commit/2dfaef466804e0af3c525b503d0897338521ba18

puzrin commented 5 years ago

Is current compressed fonts enougth to write code and check no more changes required?

Compressor test samples available in https://github.com/littlevgl/lv_font_conv/blob/master/test/font/test_compress.js. If they pass, everything will work as needed.

kisvegabor commented 5 years ago

Is current compressed fonts enougth to write code and check no more changes required?

Could you provide commands which creates the following output?

With these, I can implement and test all the features.

I've just noticed that bpp=8 is not supported. Wasn't it required with current TTF renderer?

puzrin commented 5 years ago

Could you provide commands which creates the following output?

Only "tiny". "full" are reserved for very special case, because spec should not have bad logical holes.

./lv_font_conv.js --font ./node_modules/roboto-fontface/fonts/roboto/Roboto-Regular.woff --size 16 --bpp 3 -o ./lvgl_font.txt --format lvgl --symbols AB

Dump font with 2 chars, and vary length between. "AB" should give F0 tiny. "AЯ" should give sparse tiny (because of big hole, but still better than 2 F0 tables).

I've just noticed that bpp=8 is not supported. Wasn't it required with current TTF renderer?

bpp > 3 should not give visual difference. bpp = 4 left as reserve (useless?). bpp=8 will be 100% useless garbage, confusing user (IMO). If you have reason, why it should exists - let me know. Adding it is not difficult.

puzrin commented 5 years ago

https://github.com/littlevgl/lv_font_conv/commit/3acf1b8146ba5675567a580f24f718a9dee21810

Updated kerning pairs + all minor fixes (typos).

NOT updated (postponed):


Dump with kerning F0:

 ./lv_font_conv.js --font ./node_modules/roboto-fontface/fonts/roboto/Roboto-Regular.woff --size 16 --bpp 3 -o ./lvgl_font.txt --format lvgl -r 0x20-0x40

Increase range to dump with classes (F3)

 ./lv_font_conv.js --font ./node_modules/roboto-fontface/fonts/roboto/Roboto-Regular.woff --size 16 --bpp 3 -o ./lvgl_font.txt --format lvgl -r 0x20-0x200
kisvegabor commented 5 years ago

Thank you for the commands and the update.

I've tested the new version and found two things:

  1. for the bitmap array LV_ATTRIBUTE_LARGE_CONST is missing. See this comment.
  2. In lv_font_fmt_txt_kern_pair_t there is .glyph_ids_size = f.glyphIdFormat instead of .glyph_ids_size = 1
puzrin commented 5 years ago

https://github.com/littlevgl/lv_font_conv/commit/c97a514e25afa9c3bb7e97c3a8e5ce66b8ddc268 should be fixed now.

kisvegabor commented 5 years ago

Thank you. I'm implementing the class support now. Interestingly with Roboto-Regular ttf always pairs are generated but with woff I got classes.

There is one minor typo here: lv_font_kern_classes_fmt_txt_t -> lv_font_fmt_txt_kern_classes_t

kisvegabor commented 5 years ago

Some other typos: https://github.com/littlevgl/lvgl/commit/a94866a115957c2965334c79eba611c826ef011b

puzrin commented 5 years ago
puzrin commented 5 years ago

Interestingly with Roboto-Regular ttf always pairs are generated but with woff I got classes.

Check twice your range param. In theory result should be the same. Woff is TTF + zip (by design). Though your TTF source may be changed somehow by third party, because official repo has woff/woff2 only.

kisvegabor commented 5 years ago

Thank you for the updates.

Though your TTF source may be changed somehow by third party, because official repo has woff/woff2 only.

Probably that's the case.

I can continue the integration on Sunday.

puzrin commented 5 years ago

Could you explain how lines align should help? Especially for --bpp 3. As far as i understand, any "unmatched" screen and font bits resolution will require shifting magic (not allow to compose multiple pixels in one step).

Where do you plan to get performance benefit from lines align?

kisvegabor commented 5 years ago

The line align is for compatibility with the current letter draw algorithm. Before updating the format to line aligned, tomorrow I'll check how difficult it is to update the drawing algorithm to support bitstream. However, bpp=3 is a different story. It's more complex to mask 3 bit from a bitstream because the data can cross byte boundaries. That's why it is not supported right now. bpp=3 can work for compressed formats because it needs preprocessing anyway and it can be converted to bpp=4. But for the uncompressed format, I'm not planning to support bpp=3 for now.

puzrin commented 5 years ago

I can return error for --bpp 3 --no-compression. Will be waiting for your verdict - is lines align required or not.

PS. In ideal world... if someone find time to rewrite all engine code... it may be better to return compressed_bitstream_statet_t instead of bitmap. Then "composer" will have no intermediate repack & memory write/read, and decompression cost become very low (comparable with what you do now).

PPS. But prefilter (lines XOR) should be disabled to allow bit-by-bit processing. Or more complex logic should be used to make post-filter streamed.

kisvegabor commented 5 years ago

I can return error for --bpp 3 --no-compression. Will be waiting for your verdict - is lines align required or not.

It would be great.

PS. In ideal world... if someone find time to rewrite all engine code... it may be better to return compressed_bitstream_statet_t instead of bitmap. Then "composer" will have no intermediate repack & memory write/read, and decompression cost become very low (comparable with what you do now).

I don't understand. Can you explain it in more detail?


Finally, I updated the letter drawing to work with bitstream so, fortunately, the byte aligned line ending is not required.

It looks like this: image

What is working:

What is still missing:

Sidenote: While I used the font converter I found that writing -r only once would be more convenient. E.g. -r 0x20-0x70 -r 0x200 -> -r 0x20-0x7F 0x200. What do you think? (Not a big thing, just asking)

puzrin commented 5 years ago

I don't understand. Can you explain it in more detail?

Current API can cause unnecessary internal data transforms (with compressor, for example), because return bitmap as array. Formally, it's enougth to return interface to extract bitmap. That's not trivial, but possible.

Also, format of returned bitmap from get_glyph() does NOT guarantee convenience of next operation (composing bitmap and framebuffer's content).

Usually, that's implemented via streams, with pull-push patterns. Since C has no classes, replacement will be "stream state". Now 2 stream types required:

I think, with proper implementation, raw stream become not needed at all.

compressed formats. To understand how it works in details can you write an example with a few bytes?

I don't understand you. Source is available, with detailed comments. Tests are available too, with 100% coverage and all samples. What else to you need?

Sidenote: While I used the font converter I found that writing -r only once would be more convenient. E.g. -r 0x20-0x70 -r 0x200 -> -r 0x20-0x7F 0x200. What do you think? (Not a big thing, just asking)

I think 2 things:

puzrin commented 5 years ago

Is anything left except vertical mirroring of glyph bitmaps?

kisvegabor commented 5 years ago

Current API can cause unnecessary internal data transforms (with compressor, for example), because return bitmap as array. Formally, it's enougth to return interface to extract bitmap. That's not trivial, but possible.

Getting clearer, thank you. It seems using streams really might have some advantages but I wouldn't start such a rework now, so close to the release.

I don't understand you. Source is available, with detailed comments. Tests are available too, with 100% coverage and all samples. What else to you need?

I've read Spec/Compression, checked the source code and the tests but still couldn't figure out how to get 0xE7, 0x80 from 0xFF, 0xF1, 0xFF here.

What I have so far:

0xFF, 0xF1, 0xFF with bpp:3
111 111 111 111 000 111 111 111
Should be:
0b111 x 4      +      0b000 x 1     +     0b111 x 3
111 11110             000                 111 1110

1111 1110    0001 1111    10
0xFE         0x1F         0x20 

Besides, you wrote: "At least 1 repeat requires to fall into RLE mode". In this case how to decide if the next bit(s) is the next value or the repeat count of the previous. E.g. from 111 110: does it mean 0b111 x 2 or 0b111, 0b110?

It's quite difficult to reverse engineer the format from source code. Documentation with examples and explanations would help to understand how it works. Anyway, not supporting compression is not a blocking issue. If you don't want to spend time on writing doc about it, it's not a problem.

you already can type multiple options in comma-separated format: -r 0x1F450,0x1F451-0x1F470

I missed the ,. It's perfect.

Is anything left except vertical mirroring of glyph bitmaps?

According to what I could test so far only the vertical mirroring is missing.


I started to work on the online converter but even with your original example I got this exception: image

Shall we open a new issue for it?

puzrin commented 5 years ago

I started to work on the online converter but even with your original example I got this exception:

In this case how to decide if the next bit(s) is the next value or the repeat count of the previous.

Interesting question :). I need time to meditate over

puzrin commented 5 years ago

how to decide if the next bit(s) is the next value or the repeat count of the previous

First, this happens not random, and you know exactly current state. Let's start from begining: decompressor is in "pass through" mode (transfer all pixels as is, until find repetition).

So, it does this:

  1. Fetch next pixel (bpp bits)
  2. Compare with previous pixel
  3. If NOT equal - pass to output as is and continue (with next+1 pixel)
  4. If equal - current data is still pixel, and next will be repetitions flags (up to 11 of 1 and then 6 bits of counter)

Every time you know exactly the type of next data.

Is this clear?

puzrin commented 5 years ago

I've read Spec/Compression , checked the source code and the tests but still couldn't figure out how to get 0xE7, 0x80 from 0xFF, 0xF1, 0xFF here .

What I have so far:

0xFF, 0xF1, 0xFF with bpp:3
111 111 111 111 000 111 111 111
Should be:
0b111 x 4      +      0b000 x 1     +     0b111 x 3
111 11110             000                 111 1110

1111 1110    0001 1111    10
0xFE         0x1F         0x20 

Besides, you wrote: "At least 1 repeat requires to fall into RLE mode". In this case how to decide if the next bit(s) is the next value or the repeat count of the previous. E.g. from 111 110: does it mean 0b111 x 2 or 0b111, 0b110?

Input data is list of "pixels" with values in low bits. Note, we work not with blobs, and internal representation as "Array of pixels" is convenient for our needs (both in convertor and in tests). With { bpp: 3 } only 3 lowest bits will be used. So, after drop of unused bits, real data will be:

0xFF, 0xF1, 0xFF

=>

111, 001, 111

Compressor should pass it as is (no repeats), then align tail with zeros. As a result, we will get:

0xE7, 0x80

Bingo!

kisvegabor commented 5 years ago

I started to work on the online converter but even with your original example I got this exception: 995ffdf

Works now, thank you. The only remaining things are:

First, this happens not random, and you know exactly current state. Let's start from begining: decompressor is in "pass through" mode (transfer all pixels as is, until find repetition).

Still not clear. Let's take an example: 101110111 ... It could be: 101 110 111 ... Or 101 110(x2) 111 ... -> 101 101 111 ...

Input data is list of "pixels" with values in low bits. Note, we work not with blobs, and internal representation as "Array of pixels" is convenient for our needs (both in convertor and in tests). With { bpp: 3 } only 3 lowest bits will be used. So, after drop of unused bits, real data will be:

Clear, thank you.

puzrin commented 5 years ago

The only remaining things are:

* how to represent multiple ranges.
  E.g. how to pass: `0x20-0x70, 0x120, 0x125, 0x1500=>0x550, 0x2300-0x2500=>0x700`

I'd suggest add before this line:

console.log(JSON.stringify(args, null, 4));
// OR
console.log(args); // but it will not dump deep properties

Then run convertor with different options and check output. Except some small garbage properties with null values, output will reflect what you should pass to input.

  • no-compress, no-prefilter, no-kerning are true/false?

Yes

Still not clear. Let's take an example: 101110111 ... It could be: 101 110 111 ... Or 101 110(x2) 111 ... -> 101 101 111 ...

To become repetition (fall into RLE state), sequence MUST have 2 same entries:

101 110 110 1 1 1 1 0 => 101 110(2+4)

Until decompressor's input has no repeats - it stays in "pass through" mode.

IMO spec is correct, but as author, i'm unable to imagine all reading difficulties of other people. If you think it can be improved - do PR please, to simplify life of next readers. And of cause, i'm always ready to answer any your questions.

I did not included decompressor's algorythm because implementation may be very different, depending on required features (streaming, optimizations, platform). For example, convertor has "zero" requirements and uses the most "stupid" implementation to simplify packer.

puzrin commented 5 years ago

To become repetition (fall into RLE state), sequence MUST have 2 same entries:

101 110 110 1 1 1 1 0 => 101 110(2+4)

Until decompressor's input has no repeats - it stays in "pass through" mode.

Some side notes. When i designed compressor, i had a choice:

  1. Require stop-bit (0) after each non-repeating pixel.
  2. Require repeat of XX pixels to flag RLE mode and make less than XX repeat sequences not compressible.

Due nature of our data, (2) with double repeat was selected as most effective.

kisvegabor commented 5 years ago

Decompression Probably I wasn't precise enough. 111 110 is compressed data. How to interpret 110? Does it mean 111 x2 or simply 110 value? I see no way to distinguish the two.

Vertical mirror Can you estimate when can you fix it?

Range input console.log(args); helped. I saw that every range is represented by 3 numbers:

  1. range start
  2. range end
  3. range remap base? (just guessing)

-r 0x20=>0x30 wasn't working:

lv_font_conv.js: error: argument "-r/--range": Invalid range value: 0x20=
embeddedt commented 5 years ago

@kisvegabor Are you running it in a shell? It might be interpreting the > as a redirect symbol. Maybe you should try quoting "0x20=>0x30".

puzrin commented 5 years ago

Decompression Probably I wasn't precise enough. 111 110 is compressed data. How to interpret 110? Does it mean 111 x2 or simply 110 value? I see no way to distinguish the two.

That depends on previous entry. You can NOT start from random position.

[111] 111 110  => then 110 means 2 additional repeats of 111
[other] 111 110 => then 110 means 110

Decompressor state is always clear if you start from the begining.

Vertical mirror Can you estimate when can you fix it?

Give me couple of days please. That needs more deep dive into sources for correct fix. It's preferable to fix all non trivial things all at once, than switch from other tasks every time.

Range input ...

  1. range remap base? (just guessing)

Yes.

-r 0x20=>0x30 wasn't working:

lv_font_conv.js: error: argument "-r/--range": Invalid range value: 0x20=

quoting should help:

./lv_font_conv.js --font ./node_modules/roboto-fontface/fonts/roboto/Roboto-Regular.woff --size 16 --bpp 3 -o ./lvgl_font.txt --format lvgl -r '0x20-0x200=>0x30'

I can update readme & help (add missed quotes) or change separator if you have better ideas. Let me know.

kisvegabor commented 5 years ago

That depends on previous entry. You can NOT start from random position.

If I understood things well 111 111 110 can't be in the compressed stream because it should be 111 1110.

Let's say 111 110 ... is the very first values of the compressed data. Now, what does 110 mean?

Give me couple of days please. That needs more deep dive into sources for correct fix. It's preferable to fix all non trivial things all at once, than switch from other tasks every time.

I haven't found any other issue so far.

Are you running it in a shell? It might be interpreting the > as a redirect symbol. Maybe you should try quoting "0x20=>0x30".

It solved it. Thank you. I created a simple version of the converter: https://littlevgl.com/font_conv_new

I can update readme & help (add missed quotes) or change separator if you have better ideas. Let me know.

I think updating the readme is enough.


The build fails now due to some formatting issues. How can I run the tests offline to fix them?

puzrin commented 5 years ago

I have some notes about your commits.

1)

https://travis-ci.org/littlevgl/lv_font_conv/builds

Please make sure you signed into https://travis-ci.com/. Then you will get emails on broken tests. Also, run npm test for sure, before commits.

Tests are still broken. That should never happen in shared branches, because affect other devs.

2)

A bit noisy commit history. Probably, it could be better to make such things (web) in separate branch, and merge final squashed result in the end.

Also, when you develop feature in your custom branch, it's better to use git --rebase master or git --rebase dev to realign, before merge your commints.

3)

It may be better to use dev for staging. In this case if something goes bad, we have chance to fix it without breaking master.


Right now i'd suggest to clone master into custom branch and do hard reset to previous state. Then apply changes to dev in convenient form.

If you have no time - then just fix tests in custom branch and merge into dev as is.

puzrin commented 5 years ago

Let's say 111 110 ... is the very first values of the compressed data. Now, what does 110 mean?

Since no repeat of 111 was detected, continue in "pass through" mode, and 110 means new pixel with value 110.

The build fails now due to some formatting issues. How can I run the tests offline to fix them?

npm test.

I know, you use different merge policy, and can not insist on following "git flow" process in this repo (nobody like to change his habbits). But please... make sure "main" branches are never broken. Several people (you, me, Alex) can pull from those anytime, and broken tests are not convenient.

puzrin commented 5 years ago

I see your copy appeared in web branch. Can i "hard reset" master back until nobody pulled (your commits there will disappear)?

kisvegabor commented 5 years ago

I see your copy appeared in web branch. Can i "hard reset" master back until nobody pulled (your commits there will disappear)?

Sure, sorry for the inconvenience. Just please merge the current master back to web to be sure nothing has lost.

Since no repeat of 111 was detected, continue in "pass through" mode, and 110 means new pixel with value 110.

111 111 111 is compressed to 111 111 110? I thought it should be 111 1110.

puzrin commented 5 years ago

Just please merge the current master back to web to be sure nothing has lost.

https://github.com/littlevgl/lv_font_conv/branches

I did full backup to master_backup, to not intrude into your work.

111 111 111 is compressed to 111 111 110?

Almost yes. Into 111 111 1 0 (raw + raw + repeat + stop).

I thought it should be 111 1110.

In this case it will be impossible to decide, how to detect RLE mode start in decompressor. I tried to explain this here https://github.com/littlevgl/lv_font_conv/issues/10#issuecomment-498117331, probably not enougth well.

puzrin commented 5 years ago

Should i block --bpp 3 --no-compression pair or your new code can eat result well? We discussed this before you dropped lines align requirement.

kisvegabor commented 5 years ago

Almost yes. Into 111 111 1 0 (raw + raw + repeat + stop).

Got it, thank you!

Should i block --bpp 3 --no-compression pair or your new code can eat result well? We discussed this before you dropped lines align requirement.

Yes, block it, please.

puzrin commented 5 years ago

Anything else?

puzrin commented 5 years ago

@kisvegabor Seems you did some C template fixes in web branch. I'd suggest to rearrange things a bit:

kisvegabor commented 5 years ago

I've fixed the formatting issue. Actually, I've done all the things I meant to do for the online converter and the writer. Shall I merge all the commits into a single one and merge to master?

puzrin commented 5 years ago

Use dev, not master, give us a chance for fix, if something goes wrong in big change :)

  1. I'd suggest to rebase against dev HEAD first.
  2. If possible, rearrange into 2 commits:
    • "lvgl: update writer templates"
    • "web: prepare output for web site".
  3. Merge to dev or kick me for review, and i will merge if everything ok.

(2) if that's too difficult - commit as single "lvgl: update writer templates and prepare web output for website".

kisvegabor commented 5 years ago

(2) if that's too difficult - commit as single "lvgl: update writer templates and prepare web output for website".

If you don't mind I'd rather choose this. :) I did the rebase and squash in web-rebase branch.

EDIT: I will test the updates (bitmap mirror) later today.

puzrin commented 5 years ago

If you don't mind I'd rather choose this. :)

No problem, i understand :). Merged your changes into dev & master.

Do you need all those branches or i can remove those?

kisvegabor commented 5 years ago

Thank you!

They can be removed.

kisvegabor commented 5 years ago

I've found some issue with the latest version. See lv_font_roboto_12.c.zip

  1. "space" character now has a bitmap. It wasn't earlier but box_w was set to 0
  2. There are some unnecessary 0x00s in the bitmaps. Look '!' for example.
  3. Almost all glyphs have ofs_x = -1
  4. adv_w is smaller than box_w. E.g. in case of the last character: adv_w = 165 (165 /16 =10.3 ), .box_w = 12. Therefore the right side of the glyphs are cropped Here there is a huge difference:
    {.bitmap_index = 1184, .adv_w = 108, .box_h = 16, .box_w = 9, .ofs_x = -1, .ofs_y = -4},

    adv_w = 108/16 = 6,75 while box_w = 9

puzrin commented 5 years ago

Ups... sorry... try now

kisvegabor commented 5 years ago

Looks good, thank you.

kisvegabor commented 5 years ago

A minor thing: This variable is set to zero. It should be 1 or 2. The most simple solution could be .glyph_ids_size = sizeof(kern_pair_glyph_ids[0])

puzrin commented 5 years ago

https://github.com/littlevgl/lv_font_conv/blob/master/doc/font_spec.md

It's 0/1 in spec. Could you update at your side to not diverge?