0xbad1d3a5 / Kaku

画 - Japanese OCR Dictionary
https://kaku.fuwafuwa.ca/
BSD 3-Clause "New" or "Revised" License
203 stars 36 forks source link

Make words written in kana show up #38

Open Artikash opened 2 years ago

Artikash commented 2 years ago

A big improvement to #24: I added dictionary entries that will be found when looking up a word written in kana. I think this isn't an ideal solution (I made this mostly to use personally as a stop gap) but I made the PR just in case this is the direction you want to go.

Dictionary was created by exporting the previous version to CSV (delimiter |) and running this python script to generate a new CSV to import back into the database:

output = []
for row in open("export.csv", encoding="utf8").readlines():
    fields = row.strip().split("|")
    fields[1] = str(len(output) + 1)
    output.append("|".join(fields))
    for reading in fields[7].split(","):
        if len(reading) > 0:
            output.append(f"{fields[0]}|{len(output) + 1}|{reading.strip()}|{fields[3]}|{fields[4]}|0|{fields[6]}|{fields[2]}")

open("import.csv", "w", encoding="utf8").write("\n".join(output))
Artikash commented 1 year ago

Whoops, looks like something went wrong while uploading the new dictionary.