tatuylonen / wiktextract

Wiktionary dump file parser and multilingual data extractor

[zh] extract Hiragana data from headword HTML tags #614

Closed xxyzz closed 2 months ago

xxyzz commented 2 months ago

Some Japanese header line templates put the strong tag inside another span tag.
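
To illustrate the nesting issue, here is a minimal sketch using only Python's standard `html.parser`. It is not the code from this PR or wiktextract's actual implementation; the class `StrongTextCollector`, the helper `find_strong_text`, and both sample HTML strings are hypothetical. The point is only that a search which looks at direct children would miss the `strong` tag in the second string, while a search at any depth finds it in both.

```python
from html.parser import HTMLParser


class StrongTextCollector(HTMLParser):
    """Collect the text of every <strong> element, at any nesting depth."""

    def __init__(self) -> None:
        super().__init__()
        self.depth = 0  # number of <strong> tags currently open
        self.results: list[str] = []
        self._buffer: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "strong":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "strong" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # Outermost <strong> closed: store its accumulated text.
                self.results.append("".join(self._buffer))
                self._buffer.clear()

    def handle_data(self, data):
        # Only keep text that appears inside a <strong> element.
        if self.depth > 0:
            self._buffer.append(data)


def find_strong_text(html: str) -> list[str]:
    """Return the text of each <strong> element found in `html`."""
    parser = StrongTextCollector()
    parser.feed(html)
    parser.close()
    return parser.results


# Hypothetical markup: <strong> at the top level of the headword line.
flat = '<strong class="headword" lang="ja">猫</strong>'
# Hypothetical markup: the same <strong> wrapped in extra <span> tags.
nested = (
    '<span class="headword-line"><span lang="ja">'
    '<strong class="headword">猫</strong></span></span>'
)

print(find_strong_text(flat))
print(find_strong_text(nested))
```

Because the collector tracks open `strong` tags rather than their parent, the surrounding `span` wrappers do not affect the result.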