peterhinch / micropython-font-to-py

A Python 3 utility to convert fonts to Python source capable of being frozen as bytecode
MIT License
385 stars 69 forks source link

RFC: faster rendering #3

Closed briancappello closed 7 years ago

briancappello commented 7 years ago

Thanks for putting this code together! I learned a lot playing around with it :)

On my esp8266, it was a little too slow for my application, so I spent some time trying to make it faster. And there are good results! Keeping the existing bit glyph storage format, I was able to get it drawing 3x to 4x faster (depending on the characters tested with). The biggest optimization is that the new code assumes it's starting with a blank screen (it only calls framebuf.pixel to turn on pixels, and never to turn them off).

I also added a new storage format to font_to_py, that stores lines instead of pixels. The resulting pyfont files are basically the same size, at most maybe 5% bigger. Where this format really improves things is the rendering speed; on my board it is more than 10x faster than the original code.

This PR isn't quite ready for merging, I don't think, but if it's something you think you'd maybe like to include, I can work on cleaning it up. Mostly the readme files aren't up-to-sync with the code, and well, I changed the public API a bit.

peterhinch commented 7 years ago

It will be a few days before I will get a chance to review this but it sounds interesting., thanks for the PR.

peterhinch commented 7 years ago

You've clearly put a lot of work into this: I'm impressed and think the changes are well worth incorporating.

A consequence of assuming a clear screen is to preclude inverted text (black on white). Perhaps we can live with that, although someone might want to display text on, say, a filled rectangle. Is it worth devising a way of specifying inverted at runtime, so that the assumption for that string is that the screen is white?

I've not yet reviewed or tested the code in detail but I'd appreciate a little explanation regarding the line mapping (ideally via the docs updated to reflect your changes).

I'd like to see font_test.py updated to test line mapping. It needs to be able to handle any file produced by font_to_py.py. I don't know if you've looked at this test program but I found it vital in development as it demonstrates rendering on screen and is therefore device independent. I would like to use this to test the final version. I also think it has a role in demonstrating the library.

A couple of minor details re font_to_py.py:

On the Python font file format I suggest renaming from_bytes() to _from_bytes() to indicate that (like _char_bounds()) this function is for internal use, and placing it immediately below the rest of the code.

I assume that the screen output of font_to_py.py is for debugging purposes. In a final version I think it should produce minimal output along the lines of my original version.

briancappello commented 7 years ago

Awesome, thanks for taking the time to review it! I've pushed a few more commits that (hopefully) address your comments:

A consequence of assuming a clear screen is to preclude inverted text (black on white).

Good point. I added a color parameter to the interface of writer, so that in such a case a user could set the color to 0 instead of the default of 1. (I only have a monochrome display but theoretically it should work with color/greyscale displays too)

I'd appreciate a little explanation regarding the line mapping

Sure. It seemed to me to fit in the DRIVERS.md doc, so that's where I added it, but please let me know if you think it should live elsewhere (or if anything is unclear / not as clear as it could be)

I'd like to see font_test.py updated to test line mapping.

Ah so that's what that was for! (Heh clearly I wasn't using it.) I added support for line mapping to the test_font and test_file methods. (Here you'll notice the line mapped storage format is far less efficient than the bitmapped format... luckily it's hard to tell on a modern machine :) I also added some CLI options to make the script a little easier to use. Please let me know if you think anything else should be added there!

On the Python font file format I suggest renaming from_bytes() to _from_bytes() to indicate that (like _char_bounds()) this function is for internal use, and placing it immediately below the rest of the code.

Good call, fixed. (I wasn't 100% sure where you wanted it moved to; I put it immediately above the _char_bounds function?)

I assume that the screen output of font_to_py.py is for debugging purposes. In a final version I think it should produce minimal output along the lines of my original version.

Woops yes, I removed the calls to display(), and also separated the line mapping debug output from the original display function.

peterhinch commented 7 years ago

GitHub seems to have scrambled my algebra in 1. above and offers no means of editing my post. Font sizes should read: N 8 N 8 + p N * 8 - q.

briancappello commented 7 years ago

Hey sorry for the delay, busy weekend over here.

I suggest abandoning with an error message if the user issues incompatible arguments e.g. Lb, Lx

Good call, added some early exits.

I assume line mapping is compatible with bit reversal - have you tested this?

I'm not sure bit reversal makes any sense in the context of the line mapping format? If i'm understanding the original bit mapped code correctly, it stores one bit per pixel, and the reversal argument changes whether the parsing code should start from the most or least significant digit. This is done on a per byte basis, not a per row (horizontal mapping) or col (vertical mapping) basis. So, for that reason, without changing the storage format, the optimized drawing routines I added for bit mapping only support "normally" mapped vertical bitmapping, and reversed horizontal bitmapping. (Because it reads from memoryview entire rows/cols at once instead of going byte-by-byte). In Python those values come back as one giant base10 integer but what the format actually cares about is whether the individual binary digits are high or low (with the pixel position stored as the "index" into the binary number).

By contrast, in the line mapping format, we care about the actual base10 integer values, because they represent starting coordinates and line lengths. The starting coordinates are relative to (0,0) of the character (not the Writer/screen/framebuffer), and the line lengths are relative to the starting coordinates. All the lines are either horizontal and drawn from the starting coordinates to the right length pixels (for the horizontal line mapping) or vertical and drawn from the starting coordinates down length pixels (for the vertical line mapping). Perhaps if there are any big-endian platforms MicroPython runs on then that kind of bit reversal could make sense to support.

I suggest trying the following test cases where possible on a physical display

OK, I think everything appears to check out correctly. If I understood your equations correctly I chose 23, 24, and 25 size fonts. One thing I did notice is that when the writer's initial x position is set to a number greater than 0, new lines start over at 0 instead of the initial set position, however this behaviour matches the original code. Also, you may have noticed I got rid of the concept of margins - the original code was buggy on my hardware and I couldn't figure out what the intent was (admittedly I didn't put much effort into deciphering it).

peterhinch commented 7 years ago

Another thought on this. Renaming get_ch() to get_char() in the font files has the effect of forcing users to re-create all their font files and to modify their driver code. I think new font files should be a compatible superset of existing ones.

Regarding the x position I was aware of this. I'd intended nonzero x positions for single line strings. But I've no objection to multi-line strings starting at the specified position if you prefer.

I agree that bit reversal only makes sense for byte mapping. It's intended to match display hardware which is so arranged.

briancappello commented 7 years ago

OK reverted that rename

On Fri, Jul 7, 2017 at 3:16 AM, Peter Hinch notifications@github.com wrote:

Another thought on this. Renaming get_ch() to get_char() in the font files has the effect of forcing users to re-create all their font files and to modify their driver code. I think new font files should be a compatible superset of existing ones.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/peterhinch/micropython-font-to-py/pull/3#issuecomment-313606604, or mute the thread https://github.com/notifications/unsubscribe-auth/ACdSLCDFgEIvkPE8r_6CZyx5xnY0_25Bks5sLdtagaJpZM4OExdt .

peterhinch commented 7 years ago

Thanks for that. Another thought occurred to me. The change you made where a subsequent row is replaced by zero if it's identical to the previous one also results in an incompatible font file. I agree it's a clever change and driver support is easily implemented - I wish I'd thought of it!. But I do think font_to_py.py needs to be able to generate font files which are compatible with existing drivers.

I suggest we have a command line option (-o perhaps) to enable files to be created with the incompatible optimisations. Where this is specified I suggest that the "version = '0.2'" string be changed to "version = '0.3'". This approach brings the undoubted benefits of the changes you've made without affecting existing drivers or their documentation (the same command-line invocation will produce the same files as before). This is readily testable against my existing version using a file comparison utility.

You might like to check for incompatible command line options such as -L without -o (obviously the old format didn't support line mapping) and to adapt the help information to suit. I'd be grateful if you could review the changes you've made to identify any incompatibilities I've missed.

If you have another idea on how to address this issue by all means raise it.

briancappello commented 7 years ago

Regarding the zero represents the previous row optimization, I actually couldn't think of how to implement that for the existing bitmap formats[0], so that size optimization is unfortunately only relevant for the new line mapping format. (Or perhaps fortunately, depending on your perspective - font_to_py still generates the same bitmapped font files as before. Rendering speed ups for the bitmapped font format all come from the changes to the Writer class.)

Regardless I think you raise a valid point about the font file version number, definitely that should be bumped for any line mapped font files. For bitmapped font files, the only difference is that there is no longer a guarantee of a sequential charset. Do you think the version number should still also be bumped for bitmapped fonts?

[0] How do you distinguish between the first 8 pixels (the first byte) just having no data to draw, vs having a special meaning? Initially I'd though using -1 would make more sense than 0, but memoryview is unsigned, so that was a dead end.

peterhinch commented 7 years ago

If I understand you correctly the implications of a non-sequential charset are that:

  1. Fonts created with old command lines and accessed by old drivers will still work because of the index.
  2. A file comparison of an old file and a new one generated from the same font and with the same command line may fail.

If I'm right about 1. the version number should be kept the same. If I'm right about 2. this should be documented (perhaps in a footnote).

In my view an optimisation which primarily makes a reduction to flash file size is not a high priority: Flash is much more plentiful than RAM. It's entirely up to you if you want to do this. If you do go that route in V0.3 files consider this (apologies if you know this already).

There is no way of defining a byte with a special meaning without disallowing that byte as a character. -1 as a byte is 0xff so you'd have to disallow that. As I understand it a memoryview just provides a window onto a bytearray or bytes object and its sole purpose is to avoid needless copying. It's up to the user code to interpret the byte values as signed or unsigned.

I'm agnostic as to whether the optimisation justifies disallowing a character so long as the char chosen is seldom used, documented, and flagged up by the code if inadvertently specified. I leave that to you.

To add to my comment previously on incompatible command line options, another way to deal with this is to have the -L flag imply -o and state this on the -L help line. This avoids having to test it in code. All assuming we have a -o option: as I say it's up to you whether you want to optimise existing file formats. But please leave the binary format unchanged: I think its application is very limited and there is no merit in changing that.

Actually you've exposed an incompatibility in micropython. Under MicroPython:

MicroPython v1.8.7-893-g46b849a on 2017-06-22; linux version
Use Ctrl-D to exit, Ctrl-E for paste mode
>>> a = bytearray([0, 1, -1])
>>> a[2]
255
>>> 

Under CPython 3.4.3:

>>> a = bytearray([0, 1, -1])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: byte must be in range(0, 256)

I'll raise an issue.

peterhinch commented 7 years ago

Sorry about the delay but I've been busy and also had a short holiday. I'm still struggling with the compatibility issue here. I converted the attached font using the same command line

$ ./font_to_py.py FreeSans.ttf 23 freesans.py

with my original version and with your latest code, then compared the two. The following differences concern me.

  1. The max_width value is different.
  2. The size of the font data is different. Not necessarily a problem but it puzzles me.
  3. We have lost the max_ch() and min_ch() functions.
  4. We seem to have lost the ability to define an error character which is the glyph displayed when an out of range value is presented to get_ch().

Whatever enhancements are available in V0.3 font files, I think it is necessary that V0.2 files are strictly compatible with those produced by my version. Otherwise, for myself and other users, drivers (in my case including two entire GUI's) will have to be re-tested and (potentially) docs amended.

By compatible I mean that invoking font_to_py with any valid set of arguments will produce a font file functionally identical to that produced by the old version when run with the same args. That doesn't mean the Python code must be identical character for character but the modules must have the same functions and the same inputs should produce the same outputs.

The ability to emit extended ASCII characters is great, but I can't see why we've lost the functionality in 3. and 4. above.

The alternative is for you to promote your fork which I will happily reference in my docs. Users then have a choice between compatibility and the very useful enhancements you have produced. Perhaps our solutions have diverged sufficiently that promoting your fork is the best answer. What's your view?

freesans.zip

peterhinch commented 7 years ago

I have updated README.md to include a recommendation to consider your fork for new projects.

One difficulty with open source projects is that it's impossible to estimate how many users your code actually has; font_to_py.py is crucial to a number of projects most of which are on GitHub. These would need modifying, presenting their users with a problem should they need to upgrade for (say) a bugfix release. Changing multiple font files is a chore, even if you have a record of how you built them initially.

So I concluded that the loss of font file compatibility was too high a price to pay for including your excellent enhancements. I should have thought this through much earlier in our discussion. My only excuse is preoccupation with other work. Please accept my apologies for this.