pulsar-edit / pulsar

A Community-led Hyper-Hackable Text Editor
https://pulsar-edit.dev
Other
3.33k stars 140 forks source link

(find-and-replace) (whole word selection) Allow non-word to non-word boundaries #987

Open icecream17 opened 6 months ago

icecream17 commented 6 months ago

Identify the Bug

986

Description of the Change

Edited 2024-6-22 x2

What is "whole word"?

There's a feature in find-and-replace called "whole word". For example, ctrl+f i will match every single i in the code:

//   v
function* countTo2() {
//          v      v      v
   for (let i = 0; i < 2; i++) {
//           v
      yield (i + 1)
   }
}

But if you turn on "whole word", it only matches the i that form a whole word. Since the i in function is only part of a word, not a whole word, turning on "whole word" will now select something more useful:

//   x (not selected!)
function* countTo2() {
//          v      v      v
   for (let i = 0; i < 2; i++) {
//           v
      yield (i + 1)
   }
}

Bug

How is whole-word detected? Basically we match a "word boundary": a transition from a non-word to word character, or from a word to non-word character.

But what if the selection is say, $wha?

let $wha = 2

Clearly, $wha is a whole word. Except it's not, because at the start we are going from non-word (space) to non-word (dollar), i.e. not a word boundary.

Change fix

Allow (non-word character --> non-word character) boundaries

Alternate Designs

Possible Drawbacks

"whole word" might still be inaccurate in cases like $ in $$; might break people's workflow

Verification Process

Release Notes

(find-and-replace) (whole word selection) Allow non-word to non-word boundaries

savetheclocktower commented 6 months ago

Ooh. This looks promising, but I'd love to see some new specs so we can figure out exactly what this will and won't do.

savetheclocktower commented 6 months ago

Also: Chrome has supported lookbehinds in regular expressions since version 62, so we could try that instead.

icecream17 commented 6 months ago

There's no testing for whether it works at the start or end of a file now that I think about it Future me experimentation: https://regex101.com/r/Bt7r1y/2

Sanity check: (https://devina.io/redos-checker) image

icecream17 commented 4 months ago

Note to self, for faster further investigation: errors:

FindView
  finding
    when whole-word search is enabled
      it finds the whole words even when the word starts or ends with a non-word character
        Expected { start : { row : 0, column : 0 }, end : { row : 0, column : 0 } } to equal [ [ 2, 5 ], [ 2, 10 ] ].
          at jasmine.Spec.<anonymous> (find-view-spec.js:468:49)
        Expected { start : { row : 0, column : 0 }, end : { row : 0, column : 0 } } to equal [ [ 2, 0 ], [ 2, 6 ] ].
          at jasmine.Spec.<anonymous> (find-view-spec.js:472:49)
      it doesn't highlight the search inside words (non-word character at start)
        Expected object with length 0 to have length 1
          at jasmine.Spec.<anonymous> (find-view-spec.js:479:64)
      it doesn't highlight the search inside words (non-word character at end)
        Expected object with length 0 to have length 1
          at jasmine.Spec.<anonymous> (find-view-spec.js:485:61)
        Expected object with length 0 to have length 1
          at jasmine.Spec.<anonymous> (find-view-spec.js:486:64)
ProjectFindView (ripgrep=false)
  finding
    whole word
      it toggles whole word option via an event and finds files matching the pattern
        Expected /(?<=^|\W)wholeword(?=$|\W)/gim to equal /\bwholeword\b/gim.
          at jasmine.Spec.<anonymous> (project-find-view-spec.js:692:60)
      it toggles whole word option via a button and finds files matching the pattern
        Expected /(?<=^|\W)wholeword(?=$|\W)/gim to equal /\bwholeword\b/gim.
          at jasmine.Spec.<anonymous> (project-find-view-spec.js:702:60)
ProjectFindView (ripgrep=true)
  finding
    whole word
      it toggles whole word option via an event and finds files matching the pattern
        Expected /(?<=^|\W)wholeword(?=$|\W)/gim to equal /\bwholeword\b/gim.
          at jasmine.Spec.<anonymous> (project-find-view-spec.js:692:60)
      it toggles whole word option via a button and finds files matching the pattern
        Expected /(?<=^|\W)wholeword(?=$|\W)/gim to equal /\bwholeword\b/gim.
          at jasmine.Spec.<anonymous> (project-find-view-spec.js:702:60)

The last 4 are easy to fix, but I'm very concerned about the first two. The result changed from an array to an object, whose match starts at 0? So the first thing I need to do is actually understand what the code is doing

icecream17 commented 4 months ago

Apparently, the word boundary \b thing matches something of zero-length: the position between a word and non-word character.

And importantly, a word boundary can be of either the form word nonword OR nonword word

So what this pr is trying to do, is allow nonword nonword at the start or end of find-selection. In other words, disable the "word" in "word selection" if there are non-word characters. This is potentially rather unintuitive and might break someone's use case, but for now the sample size is 1 (me).

icecream17 commented 4 months ago

image

Actually, I think the purpose of "whole word" search is for there to specifically be non-word characters surrounding the selection, i.e. the selection itself forms a "word" (which may or may not include non-word characters). Therefore, simulating \b results in "incorrect" matches like in the above picture, and ignores "correct" matches like on line 1. That means what the current code tries to do is correct.

This is kinda breaking so feedback from anyone who uses this option would be nice

icecream17 commented 4 months ago

OH the code is doing exactly what it does, it's just the tests are wrong image

-word is always preceded by a non-word character so Of course the test fails to find any match


Edit so I don't ping: wow, the find-and-replace tests just happen to be the longest package tests...

icecream17 commented 4 months ago

The (1) test failures now are equally baffling...

FindView
  finding
    when whole-word search is enabled
      it finds the whole words even when the word starts or ends with a non-word character
// regex101 shows 8 matches but here it's 0?

        Expected object with length 0 to have length 7
          at jasmine.Spec.<anonymous> (find-view-spec.js:467:61)
        Expected object with length 0 to have length 1
          at jasmine.Spec.<anonymous> (find-view-spec.js:468:64)

// a test in between passed

// this test is basically the same but it's not? The return of the object instead of array....

        Expected { start : { row : 4, column : 0 }, end : { row : 4, column : 6 } } to equal [ [ 2, 0 ], [ 2, 6 ] ].
          at jasmine.Spec.<anonymous> (find-view-spec.js:476:49)

Edit: Oh, I forgot to set the cursor position and to run the command for the first test