skishore / makemeahanzi

Free, open-source Chinese character data
https://www.skishore.me/makemeahanzi/

How to use the tool branch #37

Closed olivernyc closed 6 years ago

olivernyc commented 6 years ago

Hey skishore!

I can't figure out how to use the tool branch. I'm able to run the local server, but all I see is this page:

[Screenshot: 2018-06-01, 9:17:23 pm]

Are there any instructions? Please forgive me if I'm missing something obvious. I want to add data for some characters that are missing from graphics.txt, but I'm stuck at this screen! :)

skishore commented 6 years ago

I'm sorry, there are no instructions, because that tool is really rough around the edges! It kind of works but is pretty tailored to my workflow when I was doing the first 10k characters.

If you want to add <= 100 characters, it's probably easiest to just give me a list. I'll give a quick rundown of how to use the tool in case you're curious, though.

To get started, you would have to install both Meteor (looks like you're past that) and the mongodump / mongorestore tools, open up the console, and type "Meteor.call('restore');", which will reload the database from the backup in the repository.
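For anyone who wants feedback in the console instead of silence, here is a minimal sketch of wrapping that call with a callback. It relies only on the standard `Meteor.call(name, callback)` form; the helper name `callAndReport` and the stand-in `fakeMeteor` object are made up for illustration, and it is an assumption that the callback firing says anything about the restore having finished, so the Meteor logs remain the thing to check.

```javascript
// `meteor` is the global Meteor object in a real session. It is passed in
// as a parameter here so the helper can be tried outside Meteor.
function callAndReport(meteor, method, log = console.log) {
  meteor.call(method, (error) => {
    if (error) {
      log(`${method} failed: ${error.message}`);
    } else {
      // Assumption: the method returning does not guarantee the work is done.
      log(`${method} returned; check the Meteor logs to confirm it finished`);
    }
  });
}

// Hypothetical stand-in for Meteor that "succeeds" immediately.
const fakeMeteor = { call: (name, callback) => callback(undefined, undefined) };
callAndReport(fakeMeteor, 'restore');
```

In a real session the call would be `callAndReport(Meteor, 'restore')`.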

The "A" and "D" keys step back and forth between characters, while "shift-A" and "shift-D" step back and forth between incomplete characters. You can use the URL bar directly to add a new character (type it after the hash mark).

The "W" and "S" keys step up and down between different "stages". I organized the data generation workflow into a series of heuristic / ML algorithms, each of which does one analysis and can be overridden manually. Using the tool effectively requires knowing how these algorithms work and how they fail, though: if you know one stage is probably going to be right, you can quickly scan it and move on.

As you can tell, the tool is a bit of a mess! That's why I don't advertise it more. The ROI on doing more work on it is limited, though, because this dataset is finite and it's mostly built.

olivernyc commented 6 years ago

Thanks for the help, I'm making progress!

I've loaded the database and started adding a character (萊) but the edit analysis step throws TypeError: undefined is not an object (evaluating 'glyph.stages.strokes.corrected').

When I select a compound and add a radical (艹) nothing happens and I get TypeError: undefined is not an object (evaluating 'stage.radical').

Here's the list of characters I want to add: 萊 鵝 釐 樁 荊 駝 嵐 綏 蝦 慴 鵝 蝦

I also want to fix the stroke order for (swap fifth and sixth strokes).

And here's a list of all the missing characters I've identified: https://gist.github.com/olivernyc/3b4e579f69837b6ce5acdf09e94c35d7

skishore commented 6 years ago

Ah - there was a bug in the strokes stage which I just fixed. I amended the change into the tool branch so you should run git checkout tool && git fetch && git reset --hard origin/tool.

That's quite a lot of characters to add! Let me tell you a few more things about the later stages.

On the analysis stage, for traditional characters, the heuristics will almost always infer the formation from the simplified character (e.g. "pictophonetic, where radical A (base meaning) provides the meaning while radical B provides the pronunciation"). You can just check that it's filled in and move on.

On the order stage, there are actually two things to do. First off, you'll want to get the stroke order correct by dragging and dropping the strokes on the left (and possibly reversing some strokes). In addition, we track which strokes come from which components, which you can edit by clicking on the strokes on the right to cycle through component colors.

The components are listed to the left of the character itself. Note that if the same component appears multiple times in a character, it's still important to match each stroke to the instance of that component it comes from. Also, note that in some characters, a stroke doesn't come from some base component - in that case, just cycle it to the gray color.

On the order stage, if the component matching is all correct, then the stroke order is very likely to be right too (because it'll inherit the stroke order in each component and use simple rules to determine the order between components).
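The inheritance idea can be sketched roughly as follows. This is not the tool's actual code: the data shape (`component`, `indexInComponent`), the function name `inferStrokeOrder`, and the "simple rule" that components are written in the order they are listed are all assumptions made for illustration.

```javascript
// Hypothetical data shape: each stroke records which component instance it
// was matched to (null for unmatched "gray" strokes) and its position in
// that component's own stroke order.
function inferStrokeOrder(strokes) {
  const matched = strokes.filter((s) => s.component !== null);
  const unmatched = strokes.filter((s) => s.component === null);
  // Assumed rule: components are written in listed order, and strokes
  // inside a component keep that component's order.
  matched.sort(
    (a, b) => a.component - b.component || a.indexInComponent - b.indexInComponent
  );
  // Unmatched strokes inherit nothing, so they are left at the end for
  // manual placement.
  return [...matched, ...unmatched].map((s) => s.id);
}

const strokes = [
  { id: 'c', component: 1, indexInComponent: 0 },
  { id: 'a', component: 0, indexInComponent: 1 },
  { id: 'x', component: null, indexInComponent: null },
  { id: 'b', component: 0, indexInComponent: 0 },
];
console.log(inferStrokeOrder(strokes)); // [ 'b', 'a', 'c', 'x' ]
```

The unmatched stroke `x` ends up last, which is why a stroke left gray on a component it really belongs to tends to land in the wrong position until it is matched.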

Here's an example where I actually have to make changes on the order stage. It's a common one, because of the two forms of the grass radical. I started out with 萊, where only three of the grass radical strokes were matched: https://www.dropbox.com/s/9r2zl1s7lexzzno/Screenshot%202018-06-02%2021.52.56.png?dl=0

To fix it, I clicked on the gray stroke to make it blue, then moved it into position and sorted the strokes of the grass radical correctly: https://www.dropbox.com/s/5gjb4k5wn3uqq9s/Screenshot%202018-06-02%2021.56.58.png?dl=0

skishore commented 6 years ago

Oh, two other important things!

olivernyc commented 6 years ago

Awesome! I was able to add the characters I needed: https://github.com/skishore/makemeahanzi/pull/38

I had to make a few corrections, but most characters were recognized perfectly. You've created an amazing tool!

Assuming the PR looks good, what are the next steps to generate a new graphics.txt file?

skishore commented 6 years ago

The next step is to run the following in the console:

Meteor.call('export');
// Check the Meteor logs and wait for dictionary.txt and graphics.txt to be ready
Meteor.call('exportSVGs');
// Check the Meteor logs and wait for .svgs to be ready

Then switch to the master branch, copy the new .txt files, run rm -r svgs && mv .svgs svgs, and commit the result. Because the .txt files have one character per line and the svgs have one per file, running git diff --stat should cleanly show your changes.
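As a complement to git diff --stat, a small script can list which characters a new graphics.txt adds. This is a sketch that assumes each non-empty line of graphics.txt is a JSON object with a `character` field; the function names are made up for illustration.

```javascript
// Assumption: one JSON object per line, each with a "character" field.
function charactersIn(text) {
  return new Set(
    text
      .split(/\r?\n/)
      .filter((line) => line.trim() !== '')
      .map((line) => JSON.parse(line).character)
  );
}

// Characters present in newText but not in oldText.
function addedCharacters(oldText, newText) {
  const before = charactersIn(oldText);
  return [...charactersIn(newText)].filter((c) => !before.has(c));
}

// Tiny inline sample instead of the real files.
const oldText = '{"character":"一","strokes":[],"medians":[]}\n';
const newText = oldText + '{"character":"萊","strokes":[],"medians":[]}\n';
console.log(addedCharacters(oldText, newText));
```

With the real files, the two strings would come from something like `fs.readFileSync(path, 'utf8')`. Note that this only reports added characters. A corrected stroke order for an existing character would not show up here, though it would appear in the git diff.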

It would actually be better to send that pull request before I approve #38, since the diff of the database backup doesn't really show what changed.

skishore commented 6 years ago

Looks great, thanks for adding those characters!

olivernyc commented 6 years ago

Great, thanks for the help!

donly commented 3 years ago

open up the console, and type "Meteor.call('restore');",

Hi, I'm new to Meteor. Can anybody tell me what "the console" refers to, please?

donly commented 3 years ago

OK, I found a way: run meteor run, then run meteor shell in another window.