'Next character' cannot handle punctuation characters

ReginaldBuren commented 5 months ago

Is your feature request related to a problem? Please describe.

I use the keyboard shortcuts for previous character (B), next character (M) and next word (N). They are very convenient. The problem is, when we reach a punctuation character (e.g. 。、「」！), the popup disappears, and I have to reach for the mouse again. Many sentences have multiple commas, so I cannot reach the end of the sentence without the mouse.

Describe the solution you'd like

When the user presses (M), rikaikun could jump to the next non-punctuation character. If the immediate next character belongs to (。、「」！ etc.), then we can skip two characters, or three, as needed. Similarly for previous character (B) and next word (N).

Describe alternatives you've considered

Adding the punctuation characters to the dictionary. This way, the popup can stay visible between sentences, since the lookup is successful (I assume the current behaviour is that the popup disappears when the lookup fails, since there is no word beginning with 。、etc.).
Caret browsing. This is a feature in Chrome and other browsers, where we have a blinking cursor on the text of any web page, and we move around with the arrow keys. Maybe we can press a key to do a lookup on the current caret position, instead of hovering with the mouse. Then, the caret position can jump to the end of the currently-highlighted word.

melink14 commented 3 months ago

Thanks for the report!

I agree that this is a good change and I'll try to get to it sooner rather than later. (Spare time has been short lately :( )

melink14 commented 3 months ago

I looked at this a bit to see how easy/hard it was.

Some simple solutions are possible, but several edge cases make them untenable. I think we'd need something more robust. Here are some issues to think about in the future:

There is logic for skipping ASCII spaces and a couple of other key codes, so I tried adding a bunch of Japanese punctuation. It works, but that logic is shared with the mouse handling, and it's pretty weird when you hover over a period and the popup appears over the next sentence. We'd want to separate the logic.
The list of characters is pretty arbitrary; I just used what I found in a random Wikipedia article, but I'm sure there are more.
In addition to punctuation, non-Japanese letters and numbers also break the flow (numbers are pretty common on Wikipedia).
The current feature only looks as far as the end of the current text node. In simple cases that's fine, but if the text has things like links or <b> tags, it will stop early no matter what.
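
To illustrate the first point, here is a minimal sketch of keeping the skip rules separate for mouse and keyboard. The names (`ALWAYS_SKIPPED`, `KEYBOARD_ONLY_SKIPPED`, `shouldSkipChar`) and both character lists are made up for illustration and are not taken from rikaikun's code:

```javascript
// Hypothetical sketch: names and character lists are assumptions,
// not rikaikun's actual implementation.

// Characters skipped regardless of how the lookup was triggered.
const ALWAYS_SKIPPED = new Set([' ', '\t', '\n']);

// Extra characters skipped only when navigating with the keyboard.
// Deliberately incomplete; a real list would need many more entries.
const KEYBOARD_ONLY_SKIPPED = new Set(['。', '、', '「', '」', '！', '？']);

// source is either 'mouse' or 'keyboard'.
function shouldSkipChar(ch, source) {
  if (ALWAYS_SKIPPED.has(ch)) {
    return true;
  }
  return source === 'keyboard' && KEYBOARD_ONLY_SKIPPED.has(ch);
}
```

With this split, hovering over 。 still does nothing, while pressing (M) would move past it.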

So probably, we'd want to do something special for the keyboard navigation and maybe use custom logic to skip to the next text node if we reach the end of the current one.

Sample code:

function getNextTextNodeUsingTreeWalker(node, offset) {
    // Create a TreeWalker that filters for text nodes
    const walker = document.createTreeWalker(
        document.body,  // Root of the tree to walk
        NodeFilter.SHOW_TEXT  // Show only text nodes
    );

    // Move the walker to the current node
    walker.currentNode = node;

    // If there's remaining text in the current node, return it
    if (offset < node.textContent.length) {
        return { node, offset };
    }

    // Move to the next text node
    const nextTextNode = walker.nextNode();

    if (nextTextNode) {
        return { node: nextTextNode, offset: 0 };
    }

    // If no more text nodes are found, return null
    return null;
}

The final interesting part would be choosing which characters to skip. We could just enumerate common ones, but I think there's logic in the backend that already compares a character against valid Unicode ranges; if it's not too complicated, we could also use it client side.
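
A range check like that could look roughly like the sketch below. The ranges are my own guess at a reasonable starting set (hiragana, katakana, and the main kanji blocks) and may not match what the existing code uses; `isLookupChar` and `nextLookupOffset` are hypothetical names:

```javascript
// Assumed code point ranges for characters worth looking up.
// Illustrative only; the existing range check may differ.
const LOOKUP_RANGES = [
  [0x3041, 0x3096], // hiragana letters
  [0x30a1, 0x30fa], // katakana letters
  [0x30fc, 0x30fc], // prolonged sound mark (ー)
  [0x3400, 0x4dbf], // CJK Unified Ideographs Extension A
  [0x4e00, 0x9fff], // CJK Unified Ideographs
];

// True if the first code point of ch falls inside one of the ranges.
function isLookupChar(ch) {
  const codePoint = ch.codePointAt(0);
  return LOOKUP_RANGES.some(([low, high]) => codePoint >= low && codePoint <= high);
}

// Returns the first offset at or after `offset` that holds a lookup
// character, or -1 if the rest of the text has none.
function nextLookupOffset(text, offset) {
  for (let i = offset; i < text.length; i++) {
    if (isLookupChar(text[i])) {
      return i;
    }
  }
  return -1;
}
```

Because it allows characters rather than listing punctuation, this also skips ASCII letters and digits, which covers the number case mentioned above.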

Another option would be to try a lookup for each character and, if nothing comes back, keep moving to the next character until a lookup succeeds. This would guarantee that skipping always stays in sync with the dictionary code, but I'm not sure whether performance would suffer. It's probably fine if we only ever need to skip a few characters.
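
A rough sketch of that lookup-driven approach, with a cap on how far it will skip to keep the cost bounded. `lookup` here is a hypothetical synchronous callback that returns null when nothing matches; the real lookup may well be asynchronous, in which case the loop would need to await each call:

```javascript
// Hypothetical sketch: `lookup` and `maxSkip` are assumptions for
// illustration, not part of rikaikun's existing API.
function findOffsetByLookup(text, offset, lookup, maxSkip = 10) {
  // Try at most maxSkip + 1 starting positions.
  const limit = Math.min(text.length, offset + maxSkip + 1);
  for (let i = offset; i < limit; i++) {
    if (lookup(text.slice(i)) !== null) {
      return i; // First position where the dictionary finds something.
    }
  }
  return -1; // Nothing found within the allowed distance.
}

// Usage with a stand-in dictionary of two entries.
const sampleWords = ['日本', '語'];
const sampleLookup = (s) => sampleWords.find((w) => s.startsWith(w)) ?? null;
const found = findOffsetByLookup('。、日本語', 0, sampleLookup);
```

The cap means a long run of unmatched characters (for example a block of English text) costs at most `maxSkip + 1` lookups per key press.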

melink14 / rikaikun

'Next character' cannot handle punctuation characters #2102