ipython / xkcd-font

The xkcd font
https://cdn.rawgit.com/ipython/xkcd-font/master/preview.html

Programmatically creating the font from the handwriting sample #16

Closed pelson closed 7 years ago

pelson commented 7 years ago

[Image: screenshot taken 2017-04-21 at 14:07:37]


Transcribed below:

I'm opening this issue because I've done some work with the handwriting dataset that Randall produced, and am able to programmatically produce a pleasing font file from it.

First and foremost, I wanted to get your feedback on the approach that I've taken. I'd be more than happy to discuss options for bringing this work into this repo (and the logistics/code organisation etc.).

A whole saga has been written up @ https://pelson.github.io/2017/xkcd_font/ with way more detail than is necessary for the everyday interested party. My intention was to document the whole process so that the font is entirely reproducible from the source document (the handwriting sample). Because of the detail, I'm not widely advertising the article's existence at this point, but will do in the next few weeks/months once we have decided what is best to do with its findings.

Any input on next steps would be greatly appreciated.

pelson commented 7 years ago

P.S. My spelling is terrible, and there are a number of typos that need fixing in my writeup (I'm on it 😄 ).

Just in case, CC @rgbkrk & @spu7nik as the most recent contributors.

rgbkrk commented 7 years ago

Well that's awesome Phil!

takluyver commented 7 years ago

Nice! I really enjoyed the detailed write up, thanks. :-)

HughP commented 7 years ago

Nice. I'm going to work my way through the write up. I've been looking for a tutorial on how to do this kind of stuff.

rgbkrk commented 7 years ago

That was a fabulous article, I loved the ligature notes.

Carreau commented 7 years ago

Sweet! On the part where you map the segmented drawing to the actual text, would it be possible to classify the bounding boxes by "width" to autodetect the character pairs (or trios)? It might be a bit annoying because of narrow characters, but this can likely be automated.

cc @mpacer as well who will enjoy this.

Carreau commented 7 years ago

Also, now that we have the glyphs, can we map them on all the existing XKCD comics to "learn" the kerning?

pelson commented 7 years ago

Also, now that we have the glyphs, can we map them on all the existing XKCD comics to "learn" the kerning?

Definitely plausible, but also possibly into the territory of diminishing returns... 😉

On the part where you map the segmented drawing to the actual text, would it be possible to classify the bounding boxes by "width" to autodetect the character pairs (or trios)?

You mean, rather than my defining the paragraph complete with ligatures, I just define the paragraph and auto-detect the wide images as ligatures? I think that is definitely possible, and think it might work well iteratively (as in, while there are more characters than images, pick the widest image as a ligature and figure out which characters they were [somehow]).
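The iterative idea above can be sketched roughly as follows. This is a minimal illustration and not the code from the write-up: the function name `assign_ligatures`, the use of plain pixel widths, and the "widest per character assigned so far" rule (which lets one image grow into a trio) are all assumptions. It also assumes whitespace has already been removed from the text, and it sidesteps the "[somehow]" step by relying on the images being in reading order.

```python
def assign_ligatures(image_widths, text):
    """Greedily decide how many characters each glyph image holds.

    image_widths: widths of the segmented images, in reading order.
    text: the paragraph's characters (no whitespace), in the same order.
    Both the inputs and the greedy rule are illustrative assumptions.
    """
    counts = [1] * len(image_widths)
    if len(text) < len(counts):
        raise ValueError("more images than characters")
    # While characters outnumber image slots, treat the image that is
    # widest per character assigned so far as holding one more character.
    while sum(counts) < len(text):
        widest = max(range(len(counts)),
                     key=lambda i: image_widths[i] / counts[i])
        counts[widest] += 1
    # Walk through the text, handing each image its share of characters.
    groups, pos = [], 0
    for n in counts:
        groups.append(text[pos:pos + n])
        pos += n
    return groups
```

For example, `assign_ligatures([10, 11, 25, 9], "abffc")` treats the 25-pixel image as the two-character ligature. Narrow characters such as "i" or "l" could easily fool this rule, so the result would still want a visual check.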

With respect to integrating the work into this repo (if there is interest in doing so), what would you consider the primary generator of the font to be? The code + handwriting sample? The PPM glyphs? The SVG glyphs? The sfd font file? No matter what level we choose, should we include each of these steps in the repo, and if so, how do we manage the fact that they are all build artefacts?

Carreau commented 7 years ago

as in, while there are more characters than images, pick the widest image as a ligature and figure out which characters they were [somehow]

You might need to take the "average" bounding box length, and say "oh, my bounding box is about 3 times the average, it's likely a 3 char ligature". The position of the BB on the x axis may also give you hints about the width of the characters being ligatured and whether they are narrow or wide.
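A rough sketch of that width-ratio heuristic, under stated assumptions: the name `estimate_char_counts` is hypothetical, and the median is used here as the "average" because a few wide ligatures would otherwise inflate the mean.

```python
from statistics import median


def estimate_char_counts(box_widths):
    """Guess how many characters each bounding box contains.

    A box about k times the typical width is assumed to hold k characters.
    Using the median as "typical" is an assumption, not part of the thread.
    """
    typical = median(box_widths)
    # Never report fewer than one character, even for very narrow boxes.
    return [max(1, round(w / typical)) for w in box_widths]
```

With widths `[10, 12, 11, 33, 9]` the 33-wide box is flagged as a likely three-character ligature. Narrow letters next to wide ones will blur the ratios, so this is only a first guess.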

You might also be able to say:

ligatured characters have length \sum_i w_i

Then optimize the positions of the ligatures to minimize the distance between the BB width and the length of the w_i / \sum_i w_i.

You can use simulated annealing, but it may be too much.
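One way to make this optimization concrete without simulated annealing is a small dynamic program, swapped in here because the groups are contiguous and short. Everything in this sketch is an assumption for illustration: the name `best_grouping`, the `char_width` table of expected per-character widths w_i, the cap of three characters per ligature, and the use of absolute widths in the same units as the boxes rather than the normalized w_i / \sum_i w_i form.

```python
from functools import lru_cache


def best_grouping(text, char_width, box_widths, max_group=3):
    """Split text into one group per box, minimizing total width mismatch.

    char_width maps each character to an expected width w_i (hypothetical
    input); the cost of a group is |box width - sum of its w_i|.
    """
    n_chars, n_boxes = len(text), len(box_widths)
    # Prefix sums: expected width of text[a:b] is prefix[b] - prefix[a].
    prefix = [0.0]
    for ch in text:
        prefix.append(prefix[-1] + char_width[ch])

    @lru_cache(maxsize=None)
    def solve(box, pos):
        """Best (cost, group sizes) for boxes[box:] covering text[pos:]."""
        if box == n_boxes:
            return (0.0, ()) if pos == n_chars else (float("inf"), ())
        best = (float("inf"), ())
        for size in range(1, max_group + 1):
            end = pos + size
            if end > n_chars:
                break
            expected = prefix[end] - prefix[pos]
            rest_cost, rest_sizes = solve(box + 1, end)
            cost = abs(box_widths[box] - expected) + rest_cost
            if cost < best[0]:
                best = (cost, (size,) + rest_sizes)
        return best

    cost, sizes = solve(0, 0)
    if cost == float("inf"):
        raise ValueError("text cannot be split across the given boxes")
    # Turn the chosen group sizes back into substrings of the text.
    groups, pos = [], 0
    for size in sizes:
        groups.append(text[pos:pos + size])
        pos += size
    return groups
```

Given expected widths `{"l": 4, "i": 4, "t": 6, "e": 9}` and boxes `[4, 4, 12, 4, 9]` for the word "little", the lowest-cost split puts "tt" in the 12-wide box. Real handwriting widths vary from instance to instance, so the minimum would rarely be exactly zero.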

With respect to integrating the work into this repo (if there is interest in doing so), what would you consider the primary generator of the font to be? The code + handwriting sample? The PPM glyphs? The SVG glyphs? The sfd font file? No matter what level we choose, should we include each of these steps in the repo, and if so, how do we manage the fact that they are all build artefacts?

I would minimize the chances of having out-of-date artifacts and of people modifying autogenerated files. Maybe we can get it to build on Travis...

pelson commented 7 years ago

I'm up for having it build on Travis. If we are doing that though, we will also need to have some integration test(s) to confirm the font continues to look correct (not a biggie, just making it clear).

damianavila commented 7 years ago

Lovely saga @pelson!!

Because of the detail, I'm not widely advertising the article's existence at this point, but will do in the next few weeks/months once we have decided what is best to do with its findings.

We can share it now, am I right?

rgbkrk commented 7 years ago

When are you comfortable with this post being widely dispersed? We just brought this up at the dev meetings and I personally would love to share it more widely.

rgbkrk commented 7 years ago

As for what to do, I think it would be great to incorporate your new glyphs into the current font.

pelson commented 7 years ago

I've opened a PR. Hopefully that might lead to a happy ending to the saga 😉.

takluyver commented 7 years ago

Closing as that PR was merged; thanks @pelson !

pelson commented 7 years ago

Cool. I'm going to do a bit more tidy work on the code in the repo, then I'll publicise the blog post in the next few days/week or so to see if we can get a burst of potential collaborators.

damianavila commented 7 years ago

Let us know when you publicize it so we can spread the word 😉

mpacer commented 7 years ago

I'm really excited to figure out how to work on the kerning pairs. @suchow and I have a paper from last year's CogSci about using crowdsourcing and transmission chains to find people's inductive biases toward kerning pair expectations (https://cocosci.berkeley.edu/papers/zerothprinciples.pdf). It should be possible to modify it to allow designing kerning pairs as well as just extracting the bias.

Also, I'd just love to play around with each of the pieces of the code. I'm still really impressed at your persistence in this, as I've tried to do similar things and always got stymied in just setting up the system to work at all.

Any word on getting conda-forge versions of the various packages you use?


pelson commented 7 years ago

Any word on getting conda-forge versions of the various packages you use?

I haven't yet gone down that road, though it is the obvious next step in terms of the software stack. In truth, this has been a hobby project that I got so far into that I couldn't bear to see it just dropped, but I don't have the capacity to maintain a fully functional conda-forge build of FontForge. The compromise I made was to have a pre-built (albeit unreproducible) Docker image with the tools necessary for the job. I'm sure you've seen that you can currently get hold of that on Docker Hub as pelson/fontbuilder.

Hopefully, the work that I've done here has reduced the barrier to entry for improvements such as the one you pointed out. We will at some point get to the interesting situation where "improvements" will be highly subjective, and I recommend we find some way to meet that head-on (perhaps a BDOFFL? [benign dictator of fonts for life], or a rate-my-changes voting policy)