atom / find-and-replace

Find and replace in a single buffer and in the project
MIT License
242 stars 196 forks source link

No support for lookbehind #571

Closed ghost closed 5 years ago

ghost commented 8 years ago

Implementing JavaScript's native RegExp engine for a find/replace tool in a text editor seems like a dramatic oversight.

As find-and-replace leans on JS's RegExp, of course the following find expressions is invalid

(?<=hello)world
(?<!hello)world

Errors

Invalid regular expression: /(?<=hello)world/: Invalid group
Invalid regular expression: /(?<!hello)world/: Invalid group

This module is unusable without such features

ErikCorryGoogle commented 7 years ago

If it spins forever you may have written a regexp that takes forever. This is quite easy to do. Or you may have hit a bug. Can you share the regexp in question? There are people in this very thread who have tried the flag and it worked for them.

As for why it was not part of ES6, this is because there was not agreement on what form it should take. Perl-style fixed width lookbehinds had been proposed and I was not very interested in that happening. In the end I helped push the current full lookbehind proposal through so that we would not end up stuck with the fixed width version which has caused much gnashing of teeth in perl. To get something on the TC39 standard you need a champion who wants to write the spec and argue for it in committee, and these days you also need to actually implement it. Sometimes this takes time.

How do people feel about inline flags in regexps? eg /(foo(?i)bar)/ would match foobar and fooBAR but not Foobar because the /i flag only goes from the (?i) to the close parenthesis. Is this something people miss?

jeancroy commented 7 years ago

How do people feel about inline flags in regexps? eg /(foo(?i)bar)/ would match foobar and fooBAR but not Foobar because the /i flag only goes from the (?i) to the close parenthesis. Is this something people miss?

I'm not sure I'd have much use for it. But if a single syntax is adopted I like the modifier on non-capturing group syntax. /(foo(?i:bar))/ . It seems .Net handle it (actually both) and I like how the scope is well defined.

whoinoi commented 6 years ago

Is there any update on this? I found, that the following regular expression also doesn't work in Find in Projects: (?<!\$)\$(?!\$)

It says "Invalid regular expression /(?"

fiapps commented 6 years ago

In the interest of producing a concrete requirement, I (like many) require lookbehind, and in particular I want them to work for searches of the whole project. I would also point out that this was the requirement expressed in the original title of this issue: "positive and negative lookbehind (lookaround)." It only became a vague "better" due to a later edit.

Using Atom 1.25.0, with atom --js-flags="--harmony_regexp_lookbehind" to enable the experimental feature, I can search a single file quickly for (?<!{)foobar\b, but searching the entire project (23 files totaling less than 13000 lines) either takes an extremely long time or hangs, because after more than an hour, the spinner still said 0 files had been searched.

Over two years have passed since the announcement that lookbehind support was coming to V8, and we still don't have a way to reliably use lookbehind to search a whole project. Therefore, I do not think waiting for V8 to solve the problem is an adequate solution.

ghost commented 6 years ago

Sorry for the subjective title in the first place! Still waiting on lookaround in Atom...

ErikCorryGoogle commented 6 years ago

Is this really the regexp you used to test? If searching 13000 lines takes more than an hour then there's no way that's a V8 regexp performance issue. You probably need to file a new bug. Any way to work out which version of V8 this is?

I just tested this on V8 version 6.2 and the two regexps

/(?<!})foobar\b/

and

/foobar\b/

are exactly the same speed, about 14 million lines per second.

There is no reason for Atom to delay switching this on, that I can see. The proposal is finished: https://github.com/tc39/proposals/blob/master/finished-proposals.md

ErikCorryGoogle commented 6 years ago

Atom has had lookahead since forever, so the new title is not very accurate.

fiapps commented 6 years ago

I actually used a very similar regex with another word in place of foobar. However, testing (?<!{)foobar\b I get the same results (now with Atom 1.25.1): I can search a single file almost instantly, but a search of the whole project hangs or takes an extremely long time (update: after 4 hours it had still searched 0 files). I don't know whether the bug is in V8 or Atom. I haven't done any Atom development, so I don't know how to find out where it hangs, but it looks like jinglesthula reported the same issue above.

As for the version of V8, I see that the Atom 1.25.0 release notes report that Electron was upgraded to 1.7.11, and the Electron 1.7.0 release notes report that V8 was upgraded to 5.8.283.38.

ErikCorryGoogle commented 6 years ago

V8 version 5.8 is now over a year old, from March 20 2017: https://v8project.blogspot.dk/2017/03/v8-release-58.html

Unfortunately the performance fix was done on March 28 2017, so it has not got through to Atom yet.

Nevertheless there must be a different issue here, since the improvement was only by a factor of 2-3, and you are seeing a factor of at least 1 million slowdown.

mrmass commented 5 years ago

In the meantime I still get invalid regular expression on lookbehind syntax... Were there any plans to fix this issue?

billybooth commented 5 years ago

Right, is there any traction on this in the release version? I skimmed the thread, but I'd prefer to have negative lookbehind with poor performance than not at all.