FooSoft / yomichan-import

External dictionary importer for Yomichan.
https://foosoft.net/projects/yomichan-import/
MIT License
83 stars, 23 forks

Various 広辞苑 bugs #28

Open Thermospore opened 3 years ago

Thermospore commented 3 years ago
  1. over a thousand entries with �� (or another unrenderable character) as the headword
  2. some headwords need this boxed A thing removed (see image)
  3. same bug as number 3 in issue #27 (see image)
  4. there are a lot of broken-looking entries with a ○ at the start of the headword
Thermospore commented 3 years ago

I'm new to the EPWING format. Guessing number 1 is caused by those charming image fonts 🙂 I'm willing to help map them out. Looks like 広辞苑 has a shit ton though. Maybe bulk OCR, then manually confirm one by one? (see image)

FooSoft commented 3 years ago

Ah yes, that would be the image fonts. The problem is they don't necessarily have to correspond to things you would find in fonts (most are normal characters, but there are random exceptions for symbols). The process of mapping them out often includes finding reasonable substitutions for glyphs that don't exist. Help mapping the missing ones would be much appreciated!
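
For anyone picking this up, here is a rough sketch of what such a mapping could look like. It is written in Python for brevity and is not yomichan-import's actual code. The codes, the substitutions, and the names `GAIJI_SUBSTITUTIONS` and `resolve_headword` are all made up for illustration. The idea is a table from image-font codes to substitute text, with anything unmapped rendered as U+FFFD and counted so it can be reviewed by hand later.

```python
from collections import Counter

# Hypothetical table: image-font (gaiji) code -> substitute text.
# These codes and substitutions are invented for illustration only;
# they are not taken from 広辞苑 or from yomichan-import.
GAIJI_SUBSTITUTIONS = {
    0xA121: "〓",   # glyph with a reasonable Unicode stand-in
    0xA122: "[A]",  # boxed "A" with no exact equivalent, approximated
    0xA123: "",     # purely decorative mark, dropped
}

REPLACEMENT = "\ufffd"  # what an unmapped glyph is rendered as


def resolve_headword(tokens, table=GAIJI_SUBSTITUTIONS):
    """Build a headword from a mix of plain strings and gaiji codes.

    `tokens` holds str items (ordinary text) and int items (gaiji codes).
    Returns the resolved text plus a Counter of codes that had no mapping,
    so the most frequent missing glyphs can be mapped first.
    """
    parts = []
    unmapped = Counter()
    for token in tokens:
        if isinstance(token, str):
            parts.append(token)
        elif token in table:
            parts.append(table[token])
        else:
            parts.append(REPLACEMENT)
            unmapped[token] += 1
    return "".join(parts), unmapped


if __name__ == "__main__":
    text, missing = resolve_headword(["広", 0xA122, 0xB000, 0xB000])
    print(text)
    print(missing.most_common())
```

Counting the unmapped codes is a design choice for the manual-confirmation step: sorting by frequency shows which missing glyphs affect the most entries.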