ismail-yilmaz / Bobcat

A cross-platform terminal emulator, using TerminalCtrl & U++
GNU General Public License v3.0

Feature request: Configurable word selection #17

Closed dolik-rce closed 5 months ago

dolik-rce commented 9 months ago

Currently, double-clicking in text selects a word, where a word is defined as a continuous run of alphanumeric characters or underscores (if I interpret the code in TerminalCtrl::GetWordSelection correctly). That is technically correct, but not always very useful.

Sometimes it would be much more pleasant to use if the selection were smarter and could recognize bigger blocks of text that form a single unit of information. Common use cases (for me) would be to double-click to select and then copy:

* file paths (which also contain slashes and dots)

* links (which can contain a lot of different characters)

I imagine that the exact rules for how the double-click selection works might be customizable, similar to how Linkifier works. The user could configure one or more regular expressions that define what a "word" is. The default could be the same as the current behavior (so something like \b\B*\b or [_[:alnum:]]*).

Note that this is a low priority feature, I can happily live without it. But I believe it is one of those little things, that could make people happy when using this terminal :-)

ismail-yilmaz commented 9 months ago

It would be a nice little feature indeed.

> I imagine that the exact rules how the double click selection works might be customizable, similar to how Linkifier works.

It would be very similar and probably even easier. It can easily be implemented as a selection mode directly in Bobcat, using the existing tool set of TerminalCtrl (without modifying it).

Added to my TODO list.

ismail-yilmaz commented 8 months ago

Ok,

I have added custom word selection filter support to TerminalCtrl. I decided to implement it in TerminalCtrl because it is useful in general anyway (not to mention that it was very simple).

Client code (in this case Bobcat, of course) can now add custom word selection according to its preferences. The filter can be changed on the fly, so more than one filter can easily be added. One nice aspect of the filter support is that it is really a cell block selection function, so the client code can inspect the properties of each cell and act accordingly. This means that selecting only, say, underlined/bold/italic text, or selecting by text/background color and other SGR attributes, is possible.
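
For illustration only, here is a minimal sketch of what a cell-based selection filter could look like. The `Cell` struct, the `CellFilter` type and `SelectBlock` are hypothetical names invented for this example; they are not TerminalCtrl's actual types or API.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical cell model for this sketch; NOT TerminalCtrl's real cell type.
struct Cell {
    char32_t chr = U' ';
    bool     bold = false;
    bool     underline = false;
};

// A filter decides whether a cell belongs to the block being selected.
using CellFilter = std::function<bool(const Cell&)>;

// Expands from the clicked column in both directions while the filter accepts
// the cells. Returns a half-open column range [begin, end); begin == end means
// the clicked cell itself was rejected.
std::pair<std::size_t, std::size_t>
SelectBlock(const std::vector<Cell>& line, std::size_t clicked, const CellFilter& accept)
{
    if (clicked >= line.size() || !accept(line[clicked]))
        return {clicked, clicked};
    std::size_t begin = clicked, end = clicked + 1;
    while (begin > 0 && accept(line[begin - 1])) --begin;
    while (end < line.size() && accept(line[end])) ++end;
    return {begin, end};
}
```

Passing a filter such as `[](const Cell& c) { return c.bold; }` would select the contiguous run of bold cells around the click, which is the kind of attribute-based selection described above.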

I will add configurable word selection support to Bobcat asap.

ismail-yilmaz commented 8 months ago

> Currently, double-clicking in text selects a word, where a word is defined as a continuous run of alphanumeric characters or underscores (if I interpret the code in TerminalCtrl::GetWordSelection correctly). That is technically correct, but not always very useful.
>
> Sometimes it would be much more pleasant to use if the selection were smarter and could recognize bigger blocks of text that form a single unit of information. Common use cases (for me) would be to double-click to select and then copy:
>
> * file paths (which also contain slashes and dots)
>
> * links (which can contain a lot of different characters)
>
> I imagine that the exact rules for how the double-click selection works might be customizable, similar to how Linkifier works. The user could configure one or more regular expressions that define what a "word" is. The default could be the same as the current behavior (so something like \b\B*\b or [_[:alnum:]]*).
>
> Note that this is a low priority feature, I can happily live without it. But I believe it is one of those little things that could make people happy when using this terminal :-)

I have implemented the configurable word selection in Bobcat. See commit 8b3b2a2

It is pretty basic. You can set which extra characters are treated as part of a word, e.g. "/.-_".
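
As a rough sketch of the idea (not the code from the commit), extending the word definition with a user-supplied set of extra characters could look like this. `IsWordChar` and `WordAt` are hypothetical helper names, and the sketch works on plain ASCII strings rather than terminal cells.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// True if c counts as part of a word: alphanumeric, underscore, or one of the
// user-configured extra characters.
bool IsWordChar(char c, const std::string& extra)
{
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_'
        || extra.find(c) != std::string::npos;
}

// Returns the word around column `pos`, or an empty string if `pos` is not on
// a word character.
std::string WordAt(const std::string& line, std::size_t pos, const std::string& extra)
{
    if (pos >= line.size() || !IsWordChar(line[pos], extra))
        return {};
    std::size_t begin = pos, end = pos + 1;
    while (begin > 0 && IsWordChar(line[begin - 1], extra)) --begin;
    while (end < line.size() && IsWordChar(line[end], extra)) ++end;
    return line.substr(begin, end - begin);
}
```

With no extra characters, clicking inside `main` in `src/main.cpp` yields only `main`; with the extra set "/.-_" the whole path is returned.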

Using regexp as you suggested is also on my list, but at the moment it is not high on the list.

dolik-rce commented 8 months ago

> I have implemented the configurable word selection in Bobcat. See commit 8b3b2a2
>
> It is pretty basic. You can set which extra characters to be treated as a part of a word. E.g. "/.-_".

Thank you. I have tried and it solves 95% of the problems I've had with the selection before, which is great.

> Using regexp as you suggested is also on my list, but at the moment it is not high on the list.

Maybe I'll get to it sooner than you :-)

ismail-yilmaz commented 8 months ago

@dolik-rce

By the way, while issue #9 is not resolved yet (changes are still pending), I have added a "selector mode" to TerminalCtrl, which is immediately available in Bobcat (see commit 69efeb1). In selector mode you can navigate the terminal's buffer and select text (as text, words, or rectangles) with keyboard shortcuts. The shortcuts are hard-coded at the moment, but they will be configurable soon (I am preparing to make all shortcuts in TerminalCtrl configurable).

Configurable shortcut:

| Keys | Description |
| --- | --- |
| Shift+Ctrl+X | Enter selector mode. |

Available hard-coded shortcuts (for the time being):

| Keys | Description |
| --- | --- |
| Escape | Exit selector mode. |
| Return | Start selection. |
| Backspace | Cancel selection. |
| Ctrl+C | Copy selection. |
| Ctrl+T | Text selection mode. |
| Ctrl+W | Word selection mode. |
| Ctrl+R | Rectangle selection mode. |
| Up [Arrow key] | Move 1 row up. |
| Down [Arrow key] | Move 1 row down. |
| Left [Arrow key] | Move 1 column left. |
| Right [Arrow key] | Move 1 column right. |
| Shift+Left [Arrow key] | Move to the beginning of the row. |
| Shift+Right [Arrow key] | Move to the end of the row. |
| Home | Move to the beginning of the buffer. |
| End | Move to the end of the buffer. |
| Page up | Move 1 page up. |
| Page down | Move 1 page down. |

ismail-yilmaz commented 8 months ago

> Maybe I'll get to it sooner than you :-)

The good news is that it is now very easy to implement smart selection. I have made the GetWordSelection method of TerminalCtrl both protected and virtual. All we need to do is override it and scan the line (in a similar way to what I did in Finder or Linkifier).

ismail-yilmaz commented 8 months ago

> > I have implemented the configurable word selection in Bobcat. See commit 8b3b2a2
> >
> > It is pretty basic. You can set which extra characters to be treated as a part of a word. E.g. "/.-_".
>
> Thank you. I have tried and it solves 95% of the problems I've had with the selection before, which is great.
>
> > Using regexp as you suggested is also on my list, but at the moment it is not high on the list.
>
> Maybe I'll get to it sooner than you :-)

"Smart" word selection is implemented. Bobcat will allow two word selection modes: plain and smart. In smart mode, it will first process the regexp pattern and fall back to plain selection mode on failure. The pattern is profile-specific and can be set in the emulation config tab.

See: commit 740e9fd
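
To make the "pattern first, plain on failure" behavior concrete, here is a minimal sketch. It is not the code from the commit: it uses `std::regex` as a stand-in for PCRE, operates on a plain string instead of terminal cells, and `PlainWordAt` / `SmartWordAt` are hypothetical names. "Failure" is assumed here to mean either an invalid pattern or no match covering the clicked position.

```cpp
#include <cctype>
#include <cstddef>
#include <regex>
#include <string>

// Plain mode: contiguous run of alphanumerics/underscores around `pos`.
std::string PlainWordAt(const std::string& line, std::size_t pos)
{
    auto is_word = [](char c) {
        return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
    };
    if (pos >= line.size() || !is_word(line[pos]))
        return {};
    std::size_t b = pos, e = pos + 1;
    while (b > 0 && is_word(line[b - 1])) --b;
    while (e < line.size() && is_word(line[e])) ++e;
    return line.substr(b, e - b);
}

// Smart mode: return the first pattern match that covers `pos`; fall back to
// plain mode if the pattern is invalid or no match covers the position.
std::string SmartWordAt(const std::string& line, std::size_t pos, const std::string& pattern)
{
    try {
        const std::regex re(pattern);
        for (std::sregex_iterator it(line.begin(), line.end(), re), done; it != done; ++it) {
            const auto b = static_cast<std::size_t>(it->position());
            const auto e = b + static_cast<std::size_t>(it->length());
            if (b <= pos && pos < e)
                return it->str();
            if (b > pos)
                break; // matches come in order; later ones cannot cover pos
        }
    } catch (const std::regex_error&) {
        // Invalid pattern: use plain selection instead.
    }
    return PlainWordAt(line, pos);
}
```

Note that this sketch walks the matches from the start of the line, so its cost grows with the number of matches before the clicked position.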

Please check.

P.S.: I could have implemented a list of patterns, as we have in Linkifier, but IMO that would be overkill. What do you think?

dolik-rce commented 8 months ago

I tried to test with a very simple configuration for now: (screenshot: screen_1709378377)

Interestingly, it sometimes produces inconsistent results. Here is an example: (screenshot: screen_1709378527)

In both terminals, I have clicked on the word "main", but in one case the entire file path is selected and in the other only the word. It happens consistently, even if I just type two random filenames manually, so it is not some artifact of git output.

I did look at the algorithm in GetWordSelectionByPattern and I think I understand how it is supposed to work. But I have no idea how it could behave like this :-) Any ideas?

dolik-rce commented 8 months ago

> P.S.: I could have implemented a list of patterns, as we have in Linkifier, but IMO that would be overkill. What do you think?

I agree, it's not really needed. Multiple patterns can always be joined into one, if necessary.

ismail-yilmaz commented 8 months ago

> I tried to test with a very simple configuration for now: (screenshot: screen_1709378377)
>
> Interestingly, it sometimes produces inconsistent results. Here is an example: (screenshot: screen_1709378527)
>
> In both terminals, I have clicked on the word "main", but in one case the entire file path is selected and in the other only the word. It happens consistently, even if I just type two random filenames manually, so it is not some artifact of git output.
>
> I did look at the algorithm in GetWordSelectionByPattern and I think I understand how it is supposed to work. But I have no idea how it could behave like this :-) Any ideas?

Yes, I have an idea. I'll check it and return asap.

ismail-yilmaz commented 8 months ago

One problem is the non-breaking space character. PCRE treats it as what it is: a non-breaking space character.

The screenshot below shows a similar effect. The two prompts in the background are taken from Bobcat and GNOME Terminal; that window is TheIDE, using pcre2 (\S+). In the foreground are Bobcat/Finder and Terminator (a VTE-based terminal), exhibiting similar behavior.

(Screenshot - 2024-03-02 15-49-30)

Can you hex dump the q and l strings in GetWordSelectionByPattern method? I'd like to see what's different in the line you mentioned.

ismail-yilmaz commented 8 months ago

Ah! Never mind, I've (hopefully) spotted the problem. The distance between the selection anchor and the selection position (ph.x - pl.x) should be at least 1 cell wide. Finder and Linkifier don't need that delta. That's what I was overlooking. :)

Please check.

dolik-rce commented 8 months ago

Yes, it seems to work correctly now. I'll give it some more testing over next few days, especially with more complicated patterns and bigger buffers.

I'm quite curious how it will behave if there is a long buffer with many matches, since the algorithm iterates through all the matches. It is most probably going to be fast enough on reasonable hardware, but I'm not so sure about older machines and things like the Raspberry Pi. It might be fun to optimize it a bit :-)

ismail-yilmaz commented 8 months ago

> Yes, it seems to work correctly now. I'll give it some more testing over next few days, especially with more complicated patterns and bigger buffers.
>
> I'm quite curious how it will behave if there is a long buffer with many matches, since the algorithm iterates through all the matches. It is most probably going to be fast enough on reasonable hardware, but I'm not so sure about older machines and things like the Raspberry Pi. It might be fun to optimize it a bit :-)

I ran a quick test using a file with a single line containing 100,000 Bobcat/Bobcat.h entries. On a 1024-line scrollback buffer, it took 2.8 seconds. On a 65536-line buffer, Bobcat just stalled. :)

Possible solutions:

  1. Limit the search to the visible buffer.
  2. Start from the cursor position and expand both up and down 1 row at a time (as TerminalCtrl does with the default word selection). I think this is the best solution.
  3. Limit the number of items to be searched and the length of the text to be selected.

dolik-rce commented 8 months ago

I think it would be reasonable to assume that a "word" should not span multiple lines. So searching only from the previous \n to the next \n should be enough. It might still be a lot of text in some cases (e.g. minified JSON), so taking just a limited neighborhood and expanding it iteratively might still be necessary.

ismail-yilmaz commented 7 months ago

> I think it would be reasonable to assume that a "word" should not span multiple lines. So searching only from the previous \n to the next \n should be enough. It might still be a lot of text in some cases (e.g. minified JSON), so taking just a limited neighborhood and expanding it iteratively might still be necessary.

OK,

I have tried to improve the situation. First, both Linkifier and smart selection are optimized and should be way faster than before. Second, smart selection no longer scans the whole line if the line is longer than approximately 2048 characters. It now iterates, expanding the scanned portion of the line until it hits that upper limit.
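
One way to picture the iterative expansion is the sketch below. It is an assumption-laden illustration, not Bobcat's implementation: the starting radius of 64, the doubling strategy and the name `WindowedMatchAt` are all invented here, `std::regex` stands in for PCRE, and only the rough 2048-character cap comes from the comment above.

```cpp
#include <algorithm>
#include <cstddef>
#include <regex>
#include <string>

// Looks for a match covering `pos` inside a window centered on `pos`. The
// window is doubled while the match touches a clipped window edge (and may
// therefore be truncated) or nothing is found, until the window covers the
// whole line or reaches `limit` characters. Returns "" if nothing covers `pos`.
std::string WindowedMatchAt(const std::string& line, std::size_t pos,
                            const std::regex& re,
                            std::size_t radius = 64, std::size_t limit = 2048)
{
    if (pos >= line.size())
        return {};
    for (;;) {
        const std::size_t b = pos > radius ? pos - radius : 0;
        const std::size_t e = std::min(line.size(), pos + radius + 1);
        const bool clipped_left = b > 0, clipped_right = e < line.size();
        const bool at_limit = e - b >= limit || (!clipped_left && !clipped_right);

        std::string found;
        bool touches_edge = false;
        const auto first = line.begin() + static_cast<std::ptrdiff_t>(b);
        const auto last  = line.begin() + static_cast<std::ptrdiff_t>(e);
        for (std::sregex_iterator it(first, last, re), done; it != done; ++it) {
            const std::size_t mb = b + static_cast<std::size_t>(it->position());
            const std::size_t me = mb + static_cast<std::size_t>(it->length());
            if (mb <= pos && pos < me) {
                found = it->str();
                touches_edge = (clipped_left && mb == b) || (clipped_right && me == e);
                break;
            }
            if (mb > pos)
                break; // later matches start after pos
        }
        if (at_limit || (!found.empty() && !touches_edge))
            return found;
        radius *= 2; // widen the window and retry
    }
}
```

The trade-off in this sketch is that a match longer than the cap is returned truncated, and a click on non-matching text scans up to the cap before giving up.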

(By the way, I have also added minimal shell integration (the OSC 7 protocol) to Bobcat. If your shell is configured correctly via a script (an example can be found in wezterm's repo and in other modern terminals), the shell will notify Bobcat of directory changes, and a new terminal will use that path if the feature is enabled.)
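
For readers unfamiliar with OSC 7: by convention the shell reports its working directory by writing an escape sequence of the form `ESC ] 7 ; file://host/path` followed by a string terminator. The sketch below builds such a sequence; `PercentEncodePath` and `MakeOsc7` are hypothetical helper names, and how Bobcat parses the sequence is not covered here.

```cpp
#include <cctype>
#include <string>

// Percent-encodes every byte except ASCII alphanumerics, '/', '-', '.', '_' and '~'.
std::string PercentEncodePath(const std::string& path)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : path) {
        if (std::isalnum(c) || c == '/' || c == '-' || c == '.' || c == '_' || c == '~') {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Builds "ESC ] 7 ; file://<host><path> ESC \" for an absolute path.
std::string MakeOsc7(const std::string& host, const std::string& abs_path)
{
    return "\x1b]7;file://" + host + PercentEncodePath(abs_path) + "\x1b\\";
}
```

In practice this is normally emitted by a small shell hook on every prompt rather than by a C++ program; the function only shows the shape of the message.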