melink14 / rikaikun

rikaikun is a Chrome extension that helps you to read Japanese web pages by showing the reading and English definition of Japanese words when you hover over them.
https://chrome.google.com/webstore/detail/rikaikun/jipdnfibhldikgcjhfnomkfpcebammhp
GNU General Public License v3.0

'Next character' cannot handle punctuation characters #2102

Open ReginaldBuren opened 2 months ago

ReginaldBuren commented 2 months ago

Is your feature request related to a problem? Please describe.

I use the keyboard shortcuts for previous character (B), next character (M), and next word (N). They are very convenient. The problem is that when the selection reaches a punctuation character (e.g. 。、「」!), the popup disappears and I have to reach for the mouse again. Many sentences contain multiple commas, so I cannot reach the end of a sentence without the mouse.

Describe the solution you'd like

When the user presses (M), rikaikun could jump to the next non-punctuation character. If the immediately following character is punctuation (。、「」! etc.), it could skip two characters, or three, as needed. The same would apply to previous character (B) and next word (N).

Describe alternatives you've considered

melink14 commented 1 month ago

Thanks for the report!

I agree that this is a good change and I'll try to get to it sooner rather than later. (Spare time has been short lately :( )

melink14 commented 2 weeks ago

I looked at this a bit to see how easy/hard it was.

Some simple solutions are possible, but several edge cases make them untenable, so I think we'd need something more robust. Here are some issues to think about in the future:

So probably, we'd want to do something special for the keyboard navigation and maybe use custom logic to skip to the next text node if we reach the end of the current one.

Sample code:

function getNextTextNodeUsingTreeWalker(node, offset) {
    // Create a TreeWalker that filters for text nodes
    const walker = document.createTreeWalker(
        document.body,  // Root of the tree to walk
        NodeFilter.SHOW_TEXT  // Show only text nodes
    );

    // Move the walker to the current node
    walker.currentNode = node;

    // If there's remaining text in the current node, return it
    if (offset < node.textContent.length) {
        return { node, offset };
    }

    // Move to the next text node
    const nextTextNode = walker.nextNode();

    if (nextTextNode) {
        return { node: nextTextNode, offset: 0 };
    }

    // If no more text nodes are found, return null
    return null;
}

The final interesting part would be choosing which characters to skip. We could just enumerate the common ones, but I think there's logic in the backend that already compares a character against valid Unicode ranges; if it's not too complicated, we could also use it client side.
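As a rough illustration of the range-based idea, here is a minimal sketch. The names `JAPANESE_RANGES`, `isLookupCandidate`, and `nextCandidateOffset` are hypothetical, and the ranges are my own guess at "characters worth looking up", not the ones the backend actually uses.

```javascript
// Hypothetical sketch, not taken from rikaikun's source.
// Assumed set of code point ranges that are worth a dictionary lookup.
const JAPANESE_RANGES = [
  [0x3005, 0x3005], // 々 iteration mark
  [0x3041, 0x3096], // hiragana
  [0x30a1, 0x30fa], // katakana (stops before the middle dot U+30FB)
  [0x30fc, 0x30fc], // ー prolonged sound mark
  [0x3400, 0x4dbf], // CJK Unified Ideographs Extension A
  [0x4e00, 0x9fff], // CJK Unified Ideographs
  [0xff66, 0xff9f], // half-width katakana
];

// True if the first character of `char` falls inside one of the ranges above.
function isLookupCandidate(char) {
  const codePoint = char.codePointAt(0);
  return JAPANESE_RANGES.some(
    ([low, high]) => codePoint >= low && codePoint <= high
  );
}

// Returns the index of the first lookup candidate at or after `offset`,
// or -1 if the rest of the string is punctuation or other skipped text.
// Iterates by UTF-16 code unit, which is enough for the BMP ranges above.
function nextCandidateOffset(text, offset) {
  for (let i = offset; i < text.length; i++) {
    if (isLookupCandidate(text[i])) {
      return i;
    }
  }
  return -1;
}
```

When `nextCandidateOffset` returns -1, the caller would fall through to the next text node, for example via `getNextTextNodeUsingTreeWalker` above. A mirrored loop counting down from `offset` would cover previous character (B).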

Another option would be to try a lookup for each character and, if nothing comes back, keep moving to the next character until a lookup succeeds. This would guarantee that skipping always stays in sync with the dictionary code, but I'm not sure whether performance would suffer. It's probably fine as long as we only ever need to skip a few characters.
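A minimal sketch of that lookup-driven approach, assuming a synchronous `lookup` callback that returns a falsy value when nothing matches. Both `skipToNextHit` and `lookup` are hypothetical names, and in the extension the real search would likely be asynchronous. The `maxSkip` cap is there to bound the cost when a long run of characters has no matches.

```javascript
// Hypothetical sketch: `lookup` stands in for whatever dictionary search
// the extension exposes; it is assumed to return null/undefined on a miss.
function skipToNextHit(text, start, lookup, maxSkip = 5) {
  // Try the starting offset plus at most `maxSkip` following offsets.
  const end = Math.min(text.length, start + maxSkip + 1);
  for (let i = start; i < end; i++) {
    const entry = lookup(text.slice(i));
    if (entry) {
      return { offset: i, entry };
    }
  }
  // Nothing matched within the allowed window.
  return null;
}
```

For example, with a `lookup` that only recognizes text starting with 日, calling `skipToNextHit("、「日本」", 0, lookup)` would try offsets 0 and 1, miss both, and return the hit at offset 2.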