skishore / makemeahanzi

Free, open-source Chinese character data
https://www.skishore.me/makemeahanzi/

Noisy medians on some characters #90

Open mdkess opened 3 years ago

mdkess commented 3 years ago

I've noticed that on some characters the medians get very noisy. They seem to snap to the edge at points instead of going through the middle of the character. Some examples are 亾, 愿, 割, 哙 and 喔.

I understand these are generated from Voronoi graphs of the strokes - any thoughts on how I could go about cleaning this up? I found the script, but it's unclear to me how to run it to regenerate data.

[4 screenshots attached, dated 2020-12-15]

mdkess commented 3 years ago

I think I found a good solution. I got the tool branch up and running to debug. In these cases the Voronoi generation seems to fail at an approximation of 16, and the result at 64 is too noisy.

So I did two things:

  1. Changed the approximation loop to try more intermediate levels: for (let approximation of [16, 24, 32, 40, 48, 56, 64]) {
  2. Increased the tolerance passed to simplify from 4 pixels to 15.
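
As a rough illustration of the two changes above, here is a minimal, self-contained sketch. It is not the project's actual code: findMedian, computeMedian and simplifyPolyline are hypothetical names, the simplification is a plain Ramer-Douglas-Peucker pass standing in for whatever simplify routine the tool uses, and it assumes that a failed Voronoi attempt surfaces as a thrown exception.

```javascript
// Hypothetical sketch: these names are illustrative and are not taken from
// the makemeahanzi source.

// Distance from point p to the line segment a-b (points are [x, y] pairs).
function distanceToSegment(p, a, b) {
  const dx = b[0] - a[0];
  const dy = b[1] - a[1];
  const lengthSquared = dx * dx + dy * dy;
  if (lengthSquared === 0) return Math.hypot(p[0] - a[0], p[1] - a[1]);
  let t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / lengthSquared;
  t = Math.max(0, Math.min(1, t));
  return Math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy));
}

// Ramer-Douglas-Peucker: drop points that lie within `tolerance` of the
// chord between the endpoints. A larger tolerance gives a smoother median.
function simplifyPolyline(points, tolerance) {
  if (points.length <= 2) return points.slice();
  const last = points.length - 1;
  let maxDistance = 0;
  let index = 0;
  for (let i = 1; i < last; i++) {
    const d = distanceToSegment(points[i], points[0], points[last]);
    if (d > maxDistance) {
      maxDistance = d;
      index = i;
    }
  }
  if (maxDistance <= tolerance) return [points[0], points[last]];
  const left = simplifyPolyline(points.slice(0, index + 1), tolerance);
  const right = simplifyPolyline(points.slice(index), tolerance);
  return left.slice(0, -1).concat(right);
}

// Try each approximation level from coarse to fine and keep the first one
// that yields a median. Assumption: computeMedian throws when it fails.
function findMedian(stroke, computeMedian, tolerance = 15) {
  for (let approximation of [16, 24, 32, 40, 48, 56, 64]) {
    try {
      const median = computeMedian(stroke, approximation);
      return {approximation, median: simplifyPolyline(median, tolerance)};
    } catch (e) {
      // This level failed; move on to the next, finer one.
    }
  }
  throw new Error('No approximation level produced a median.');
}
```

With a bump of 10 pixels in the middle of an otherwise straight median, a tolerance of 4 keeps the bump while a tolerance of 15 flattens it, which is the loss of fidelity traded for less noise. The intermediate levels mean a stroke that fails at 16 can succeed at, say, 32 rather than jumping straight to 64.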

The next thing I'm trying to figure out is how to get rid of the little hook at the top of 人 and similar characters. I think it is just an artifact of guessing where the stroke starts.

[2 screenshots attached, dated 2021-01-02]

mdkess commented 3 years ago

@skishore - any thoughts on this change? I have exported the graphics.txt for my own purposes, but would you like a PR with these changes too? I've spot-checked a bunch of characters and I don't see any errors. It does lose some fidelity, but it might make the data easier to work with.

QAbot-zh commented 2 years ago

@mdkess I think your idea is very good. Could you provide the modified graphics.txt file?