neovim / neovim

Vim-fork focused on extensibility and usability
https://neovim.io

Feature request: be able to use "modern flavors of Regular Expressions" #15762

Closed stephane-archer closed 2 years ago

stephane-archer commented 2 years ago

Feature already in Vim?

No response

Feature description

Neovim uses yet another flavor of regular expressions that doesn't look anything like extended or Perl regexes. It even has a "very magic" variant.

What a pain to learn and memorize what ripgrep, sed, awk, and Neovim each use. Things would be better if they all used Perl regexes (or anything they agree on).

Is there any valid reason (except backward compatibility) to use Vim regex over the Perl version?

jamessan commented 2 years ago

Duplicate of #3208

seandewar commented 2 years ago

Also, do note that Vim regex is incredibly powerful and has editor-focused features that aren't available elsewhere. (https://github.com/neovim/neovim/issues/14442#issuecomment-826542674 for examples)

jubilatious1 commented 2 years ago

Perhaps the OP (@stephane-archer) means Perl6 Regular Expressions (aka PSIX/Raku), not PCRE (i.e. Perl5-based)?

[ICYMI, Perl6 was renamed "Raku" in 2019]:

• The PSIX dialect is currently used only in the Perl 6 programming language (now known as "Raku"), but is significant because it is the first major regex dialect to break away from the deep family resemblances shared by all other major dialects.
• PSIX (aka "Raku dialect") was designed to make better use of the limited number of available metacharacters (i.e. punctuation and symbols).
• The new dialect attempts to use metacharacters more consistently and predictably, to distribute them more "fairly" (i.e. shorter metasyntaxes for more frequently used constructs), and thereby to enhance the overall readability of its regexes.

Above quoted from "Everything You Know About Regexes Is Wrong":
https://slides.yowconference.com/yowwest2015/Conway-EverythingYouKnowAboutRegexesIsWrong.pdf
Copyright © Thoughtstream Pty Ltd, 2014-2015, http://damian.conway.org

Also see:
https://docs.raku.org/language/regexes
https://docs.raku.org/language/regexes-best-practices
https://raku.org/

kaiuri commented 2 years ago

@stephane-archer

I am currently working on that and already have a prototype packaged into an app: you pass it a PCRE-compliant regex and get back a regular expression that's compatible with Vim's new NFA regex engine. More tests are needed, but it seems to work. Feel free to drop in and give insights or help out on it.
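To illustrate the kind of translation such a converter has to perform, here is a minimal Python sketch. It is not the prototype mentioned above (whose code is not shown in this thread); the function name `pcre_to_vim` and the choice to target Vim's "very magic" (`\v`) mode are assumptions made for illustration. It handles only a small subset: non-capturing groups, lazy `*?`, `+?`, `??`, and characters that are literal in PCRE but special under `\v`. Anything it cannot translate faithfully (for example `\b`, lookarounds, or shorthand classes inside `[...]`) raises an error instead of guessing.

```python
# Sketch only: translates a small PCRE subset to a Vim "very magic" pattern.
# The name pcre_to_vim and the supported subset are illustrative assumptions.

# Literal in PCRE, but special after \v in Vim, so they must be escaped.
VERY_MAGIC_ONLY = set("<>=@%&~")

# PCRE lazy quantifiers and their Vim multi equivalents.
LAZY = {"*?": "{-}", "+?": "{-1,}", "??": "{-,1}"}

# Backslash escapes assumed to mean (nearly) the same thing in both dialects.
SHARED_ESCAPES = set("dDwWsStn123456789")


def pcre_to_vim(pattern: str) -> str:
    out = ["\\v"]
    i, n = 0, len(pattern)
    in_class = False
    while i < n:
        ch = pattern[i]
        if ch == "\\":
            if i + 1 >= n:
                raise ValueError("trailing backslash")
            nxt = pattern[i + 1]
            is_word = nxt.isalnum() or nxt == "_"
            if in_class and is_word:
                raise ValueError(f"\\{nxt} inside [...] is not translated")
            if not in_class and is_word and nxt not in SHARED_ESCAPES:
                raise ValueError(f"\\{nxt} has no direct equivalent here")
            out.append(pattern[i:i + 2])  # escaped punctuation stays literal
            i += 2
        elif in_class:
            if ch == "]":
                in_class = False
            out.append(ch)
            i += 1
        elif ch == "[":
            in_class = True
            out.append(ch)
            i += 1
            # A leading ^ and a leading ] belong to the class itself.
            if i < n and pattern[i] == "^":
                out.append("^")
                i += 1
            if i < n and pattern[i] == "]":
                out.append("]")
                i += 1
        elif pattern.startswith("(?:", i):
            out.append("%(")  # non-capturing group
            i += 3
        elif pattern.startswith("(?", i):
            raise ValueError("unsupported group construct")
        elif pattern[i:i + 2] in LAZY:
            out.append(LAZY[pattern[i:i + 2]])
            i += 2
        elif ch in VERY_MAGIC_ONLY:
            out.append("\\" + ch)
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(pcre_to_vim(r"(?:foo|bar)+\d{2,3}"))  # \v%(foo|bar)+\d{2,3}
print(pcre_to_vim(r"<a.*?>"))               # \v\<a.{-}\>
```

Refusing unknown constructs keeps the sketch honest: a silently wrong pattern is harder to debug than an explicit "not supported" error.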

jubilatious1 commented 3 months ago

Can this issue be merged into #3208?

FYI, the PDF linked above has a list of popular Regex Engines, in historical order:

| # | Designator | Full name | Dialect used in... |
|---|------------|-----------|--------------------|
| 1 | BRE | POSIX 1003.2 (section 2.8) basic regular expressions | ed, sed, grep |
| 2 | ERE | GNU extended regular expressions | egrep, gawk, Notepad++, vile, Tcl (with extra extensions) |
| 3 | EMACS | Emacs/Elisp regular expressions | emacs |
| 4 | VIM | Vim/Vimscript regular expressions | vim |
| 5 | PCRE | Perl-compatible regular expressions | The PCRE library, the .NET runtime, Apache, BBedit, C#, Delphi, Java (subset), JavaScript (subset), PHP, Perl 5 (with extra extensions), PowerShell, Python, R, Ruby, SAS, TextMate, Ultraedit, VB.NET |
| 6 | PSIX (RAKU) | Perl 6 Regular Expressions | Perl 6 (renamed Raku in 2019) |

https://slides.yowconference.com/yowwest2015/Conway-EverythingYouKnowAboutRegexesIsWrong.pdf


Proposal:

I'd like to cut to the chase here and suggest that Neovim adopt a "FUTURE-PROOF" strategy for selecting Regex Engines:

I believe this proposal adds a "compatibility layer" that could make it easier for End Users to adapt a wide variety of code to the Neovim ecosystem. For example, we might be able to get to the point where End Users could use RX1 to run grep regexes, RX2 to run gawk regexes, RX3 to run Emacs regexes, etc.

Thoughts, criticisms? Thank you.

justinmk commented 3 months ago

> make it easier for End Users to adapt a wide-variety of code to the Neovim ecosystem

Making it "easy" is not in scope until it becomes clear that the cost is worth it. Making it possible is already proposed in https://github.com/neovim/neovim/issues/3208, via transpilation.

jubilatious1 commented 3 months ago

> Making it "easy" is not in scope until it becomes clear that the cost is worth it.

Absolutely! However, I've often wondered whether some End Users (for example, heavy users of command-line grep, egrep, gawk, etc.) shied away from adopting Vim as their daily driver because they didn't want to learn the Vim regex engine. Regardless, I'll review other syntax proposals along this line (RX1, RX2, etc.) in #3208 and/or #14442 (closed).

> Making it possible is already proposed in https://github.com/neovim/neovim/issues/3208, via transpilation.

Respectfully submitted: it's not clear to me that #3208 is proposing more than enabling End Users to switch to PCRE as a default regex engine (an important proposal). Again, something I'll have to research, but I hoped the "numbering" proposal above would open a dialog on this subject. [It is not clear whether I'm able to contribute to the discussion at #3208, since the issue is restricted/locked.]

Vim has great support for Perl, and Neovim has done a fantastic job streamlining the language API (for multiple different languages). I have a personal interest in this topic, since I've posted over 500 Raku one-liners at U&L StackExchange. While I normally test/run those Raku regex answers at the Vim command line, being able to set Raku as a default "Regex Engine" in Neovim would be a huge selling point (for myself, and possibly other Raku users):

https://unix.stackexchange.com/users/227738/jubilatious1

Regarding the "numbering" proposal: if another Regex Engine comes along that End Users favor, it could be added to the list. So for example: adding RX7 to denote Oniguruma (https://oniguruma.org/oniguruma.c/en/), adding RX8 to denote V8 (https://v8.dev/blog/non-backtracking-regexp), etc.
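As a rough illustration of the extensible numbering idea, here is a minimal Python sketch of a slot registry. Everything in it is hypothetical: the `Dialect` type, the `REGISTRY` table, and the `register` helper are names invented for this example, and the assignment of RX1 through RX6 to the rows of the table earlier in this thread is an assumption (the comments above only spell out RX1, RX2, RX3, RX7, and RX8). It is not a Neovim API.

```python
# Sketch only: a hypothetical registry for the "RXn" numbering proposal.
from dataclasses import dataclass


@dataclass(frozen=True)
class Dialect:
    designator: str
    full_name: str


# Assumed mapping: RXn follows row n of the dialect table in this thread.
REGISTRY = {
    "RX1": Dialect("BRE", "POSIX basic regular expressions"),
    "RX2": Dialect("ERE", "GNU extended regular expressions"),
    "RX3": Dialect("EMACS", "Emacs/Elisp regular expressions"),
    "RX4": Dialect("VIM", "Vim/Vimscript regular expressions"),
    "RX5": Dialect("PCRE", "Perl-compatible regular expressions"),
    "RX6": Dialect("PSIX", "Perl 6 (Raku) regular expressions"),
}


def register(slot: str, dialect: Dialect) -> None:
    """Add a new dialect; existing slots are never reassigned."""
    if slot in REGISTRY:
        raise ValueError(f"{slot} is already taken by {REGISTRY[slot].designator}")
    REGISTRY[slot] = dialect


# The two additions suggested in the comment above.
register("RX7", Dialect("ONIGURUMA", "Oniguruma"))
register("RX8", Dialect("V8", "V8 regular expressions"))

for slot, dialect in REGISTRY.items():
    print(slot, dialect.designator)
```

Never reassigning a slot is the point of the design: a script written against RX2 keeps meaning the same thing after RX7 and RX8 are added.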

None of this has to be built all-at-once, but having a framework in place might attract Pull Requests.

Thank you for your time and attention.

clason commented 3 months ago

I appreciate your enthusiasm, but I really don't think the regex flavor of all things is what keeps people away from Vim -- if anything, it's the complexity of configuration and documentation, which this would increase significantly!

So unless a really compelling, actual(!) use case is presented, this is not something we want to pursue.

justinmk commented 3 months ago

> being able to set Raku as a default "Regex Engine" in

Better support for all kinds of REPL interfaces is definitely something that is in scope. Regex engines would fit there. But I don't see why the regex engine needs to be swappable for /, :global, :substitute.

:help :command-preview allows plugins to extend the cmdline, so a plugin can provide :RakuGlobal, :RakuSubstitute, etc.