LaurensWeyn / Spark-Reader

A tool to assist non-native speakers in reading Japanese
GNU General Public License v3.0

Experimental changes #17

Closed: wareya closed this 6 years ago

wareya commented 7 years ago

My master branch has these changes right now:

What should be merged? I should clean things up first, but the more I clean up what I add, the more I end up refactoring other parts of Spark Reader.

LaurensWeyn commented 7 years ago

Sorry for the lack of progress, I've only recently found the time to start working on this again a little.

Edict loading has become such a mess that I haven't bothered touching it since I started noticing the problems. Luckily, while that file is in an old format (EDICT2 or something), there's a more modern XML format out there with some extras (like other languages) that the old format is generated from. Unfortunately that XML file is too huge for any text editor, so I may need to work with a smaller version of it... Anyhow, the long-term plan for Edict loading is some sort of first-time setup where the user ticks which other languages they speak and a dictionary file (in some custom format) is generated from the XML. This custom format could also reduce startup time, since I wouldn't need to run multiple regex patterns on the definition of every word in the Japanese language during startup. Though regarding the specific issue of readings vs spellings, aren't the readings also just a specific spelling of the word? Regarding lookup, at least.
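That first-time-setup idea could look roughly like this: stream the JMdict XML once at import time (a streaming parser sidesteps the file being too big to open whole), keep only the glosses in the languages the user ticked, and write one simple tab-separated line per entry so startup never touches XML or regex again. A minimal sketch, not Spark Reader code; the element names (`entry`, `keb`, `reb`, `gloss`) follow JMdict, but the class name and output format are made up:

```java
import javax.xml.stream.*;
import java.io.*;
import java.util.*;

// Hypothetical importer: JMdict XML in, compact pre-parsed lines out.
public class DictImporter
{
    public static void convert(Reader xmlIn, Writer out, Set<String> langs)
    {
        try
        {
            XMLStreamReader r = XMLInputFactory.newInstance().createXMLStreamReader(xmlIn);
            List<String> spellings = new ArrayList<>();
            List<String> readings = new ArrayList<>();
            List<String> glosses = new ArrayList<>();
            String tag = null;
            while (r.hasNext())
            {
                int ev = r.next();
                if (ev == XMLStreamConstants.START_ELEMENT)
                {
                    tag = r.getLocalName();
                    if (tag.equals("gloss"))
                    {
                        String lang = r.getAttributeValue("http://www.w3.org/XML/1998/namespace", "lang");
                        if (lang != null && !langs.contains(lang))
                            tag = null; // gloss in a language the user didn't tick: skip it
                    }
                }
                else if (ev == XMLStreamConstants.CHARACTERS && tag != null)
                {
                    String text = r.getText().trim();
                    if (text.isEmpty()) continue;
                    if (tag.equals("keb")) spellings.add(text);      // kanji spelling
                    else if (tag.equals("reb")) readings.add(text);  // kana reading
                    else if (tag.equals("gloss")) glosses.add(text); // definition text
                }
                else if (ev == XMLStreamConstants.END_ELEMENT)
                {
                    tag = null;
                    if (r.getLocalName().equals("entry"))
                    {
                        // one entry per line: spellings, readings, glosses
                        out.write(String.join(";", spellings) + "\t"
                                + String.join(";", readings) + "\t"
                                + String.join(";", glosses) + "\n");
                        spellings.clear(); readings.clear(); glosses.clear();
                    }
                }
            }
        }
        catch (Exception e) { throw new RuntimeException(e); }
    }
}
```

Startup then just splits each line on tabs and semicolons, which is one pass with no regex.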

'Sticking inside a window' was a feature I was planning on making; I should look into how you did that. Once I get the menu bar working, there should be a 'hook' menu for window and text related hooking options. That's the plan, anyway. The idea is that the window you select is used for multiple things, like the memory based text hooker if I ever finish it, the planned VNDB character name import (assuming the window title is the game's name, otherwise asking the user for it), and sticking to a window/minimising when it's minimised, as you've done.

Deconjugator fixes are certainly welcome; I've been reluctant to touch that part of the code though one day I may add some of my own changes.
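For reference, the usual shape of a rule-based deconjugator is a table of ending rewrites applied repeatedly, collecting every dictionary-form candidate and letting the dictionary lookup reject the bogus ones. This is not Spark Reader's actual deconjugator, just an illustrative sketch with a tiny hand-picked rule set:

```java
import java.util.*;

// Minimal rule-based deconjugation sketch (hypothetical rules, not Spark Reader's).
public class Deconjugator
{
    // (conjugated ending, possible dictionary-form ending)
    static final String[][] RULES = {
        {"ました", "ます"}, // polite past -> polite present
        {"します", "す"},   // 話します -> 話す (su-ending godan)
        {"ます", "る"},     // 食べます -> 食べる (ichidan)
        {"ない", "る"},     // 食べない -> 食べる
        {"って", "う"},     // 買って -> 買う
        {"って", "つ"},     // 勝って -> 勝つ
        {"って", "る"},     // 乗って -> 乗る
    };

    static Set<String> deconjugate(String word)
    {
        Set<String> out = new LinkedHashSet<>();
        for (String[] rule : RULES)
        {
            if (word.endsWith(rule[0]))
            {
                String candidate =
                    word.substring(0, word.length() - rule[0].length()) + rule[1];
                if (out.add(candidate))
                    out.addAll(deconjugate(candidate)); // chains: 食べました -> 食べます -> 食べる
            }
        }
        return out;
    }
}
```

Every rule that matches fires, so ambiguous endings like って produce several candidates; wrong ones (e.g. 買つ) simply won't exist in the dictionary.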

For the rest... I'm not sure, I'll need to see how they work. More text rendering options sounds interesting, as does word frequency (presumably it takes this after the user has fixed splitting errors). As for the blacklist, the last time I saw it I felt its data structures could be neater, but it's been so long since I've looked at it. And I still have to figure out the new preferred definition system...

wareya commented 7 years ago

> Though regarding the specific issue of readings vs spellings, aren't the readings also just a specific spelling of the word? Regarding lookup, at least.

When looking up words in kana, yes; otherwise, no. Readings can be restricted to specific spellings and vice versa. I haven't changed how lookups work, just added more data structures to the edict definition class and the methods the frequency list stuff needs to access it. I think the main bonus is that furigana is prioritized correctly now. I'm not sure whether alternative kanji/readings are listed correctly in the definition list; I don't know which parts of the definition interface those use. If they use the "pile of spellings/readings", then they're wrong.
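The restriction idea (as in JMdict's `re_restr` element) can be modeled roughly like this; the class below and its example data are hypothetical, not Spark Reader's actual definition class:

```java
import java.util.*;

// Sketch: a reading may be limited to certain spellings of the same entry.
public class WordEntry
{
    final List<String> spellings = new ArrayList<>();
    // reading -> spellings it is valid for; an empty set means "valid for all"
    final Map<String, Set<String>> readings = new LinkedHashMap<>();

    void addSpelling(String spelling) { spellings.add(spelling); }

    void addReading(String reading, String... restrictedTo)
    {
        readings.put(reading, new LinkedHashSet<>(Arrays.asList(restrictedTo)));
    }

    // Only the readings valid for this spelling -- i.e. the furigana candidates
    List<String> readingsFor(String spelling)
    {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : readings.entrySet())
            if (e.getValue().isEmpty() || e.getValue().contains(spelling))
                out.add(e.getKey());
        return out;
    }
}
```

With this, `readingsFor` never offers a restricted reading as furigana for a spelling it doesn't apply to, which is the prioritization point above.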

The code that generates spelling/reading associations is annoying. If you ever make a system where you import edict "offline" (so to speak), it would be a lot simpler, since you could format the data in a way that makes it easy. This might just be the fact that I hate adding code, even though it's basically all I do.

> 'Sticking inside a window' was a feature I was planning on making, I should look into how you did that.

I think I accidentally avoided some of your existing Windows JNA state because I didn't realize it was there. Aside from that it's very clean.

> text rendering options sounds interesting

Once I experienced Spark Reader with outlines instead of an opaque background, I couldn't go back.

> as does word frequency (presumably it takes this after the user has fixed splitting errors).

It takes it according to the particular definition the user has displayed right then. The logic is kind of crazy because frequency lists don't use edict definition identities, so it has to try several ways of generating the spelling+reading pair.
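A guess at what that fallback chain might look like; the key format and the order of attempts here are assumptions for illustration, not the actual implementation:

```java
import java.util.*;

// Sketch: frequency data keyed by spelling+reading, which doesn't line up
// with edict definition identities, so we try several pairs until one hits.
public class FreqLookup
{
    final Map<String, Integer> ranks = new HashMap<>(); // "spelling|reading" -> frequency rank

    Integer lookup(String displayedSpelling, String reading, List<String> altSpellings)
    {
        List<String> attempts = new ArrayList<>();
        attempts.add(displayedSpelling + "|" + reading); // the pair the user sees right now
        attempts.add(reading + "|" + reading);           // kana-only words
        for (String alt : altSpellings)                  // other spellings of the same entry
            attempts.add(alt + "|" + reading);
        for (String key : attempts)
            if (ranks.containsKey(key))
                return ranks.get(key);
        return null; // the word isn't in the frequency list
    }
}
```

The point is that a miss on the displayed spelling doesn't mean the word is rare; it may just be listed under another spelling or as plain kana.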

EDIT: Oh yeah, I also started making the in-program user dictionary editing work.