stephenmk / Jitendex

A free, offline, and openly licensed Japanese-to-English dictionary. Updates weekly!
https://jitendex.org
Creative Commons Attribution Share Alike 4.0 International

Addition of pitch accent information #38

Open stephenmk opened 8 months ago

stephenmk commented 8 months ago

There exist free sources of pitch accent data, e.g. from Wiktionary and kanjium. Pitch accents can vary depending on the context in which a word is used, so they would technically need to be associated with each dictionary entry on a sense-by-sense basis.
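As a rough illustration of what a sense-by-sense association could look like, here is a minimal Python sketch. The class names (`PitchAccent`, `Sense`, `Entry`) and fields are hypothetical and not part of Jitendex's actual schema, and the accent values for 一番 are included only as an example that should be checked against a real source.

```python
from dataclasses import dataclass, field


@dataclass
class PitchAccent:
    reading: str   # kana reading the accent applies to
    downstep: int  # mora after which the pitch drops; 0 means no drop (heiban)


@dataclass
class Sense:
    gloss: str
    # Accent data hangs off the sense, not the entry, so two senses of the
    # same headword can carry different accents.
    pitch_accents: list[PitchAccent] = field(default_factory=list)


@dataclass
class Entry:
    headword: str
    senses: list[Sense] = field(default_factory=list)


# Illustrative values only (hypothetical data, not taken from Jitendex).
entry = Entry(
    headword="一番",
    senses=[
        Sense("number one; first place", [PitchAccent("いちばん", 2)]),
        Sense("most; best (adverbial use)", [PitchAccent("いちばん", 0)]),
    ],
)

for sense in entry.senses:
    accents = [a.downstep for a in sense.pitch_accents]
    print(sense.gloss, accents)
```

Storing the list on `Sense` rather than on `Entry` is the design choice that matters here; an entry-level list could not represent a word whose accent differs between senses.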

"Sweat of the brow" copyright claims are not valid in my legal jurisdiction. Large collections of factual data (e.g. telephone numbers) are not inherently copyrightable; it is the presentation and organization of the data that are protected by copyright. With that in mind, collecting this pitch accent data wholesale from a variety of Japanese dictionaries doesn't seem like it would be a problem. I believe incorporating and organizing this data into Jitendex would easily qualify as a new creative work (the bar is very low for this qualification).

Many people don't seem to trust the accuracy of the kanjium data. Unlike Wiktionary and kanjium, Jitendex could include specific source citations alongside the pitch accent data. This would provide users with some assurance that the data isn't complete nonsense.
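One way the citation idea could work in practice is to merge accent records from several sources and keep track of which sources attest each value. The sketch below is only an assumption about how that might be done; the function name `merge_with_citations`, the tuple layout, and the sample records are all hypothetical.

```python
from collections import defaultdict


def merge_with_citations(records):
    """Group (source, reading, downstep) records by accent value.

    Returns a dict mapping (reading, downstep) to the list of sources
    that attest that accent, in first-seen order and without duplicates.
    """
    merged = defaultdict(list)
    for source, reading, downstep in records:
        sources = merged[(reading, downstep)]
        if source not in sources:
            sources.append(source)
    return dict(merged)


# Hypothetical sample records, for illustration only.
records = [
    ("Wiktionary", "いちばん", 2),
    ("kanjium", "いちばん", 2),
    ("kanjium", "いちばん", 0),
]

citations = merge_with_citations(records)
for (reading, downstep), sources in citations.items():
    print(f"{reading} [{downstep}] cited by: {', '.join(sources)}")
```

An accent attested by several independent sources could then be displayed with all of its citations, while a value found in only one source would be visibly less well supported.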