Closed stephane-archer closed 2 years ago
Duplicate of #3208
Also, do note that Vim regex is incredibly powerful and has editor-focused features that aren't available elsewhere. (https://github.com/neovim/neovim/issues/14442#issuecomment-826542674 for examples)
Perhaps the OP (@stephane-archer) means Perl6 Regular Expressions (aka PSIX/Raku), not PCRE (i.e. Perl5-based)?
[ICYMI, Perl6 was renamed "Raku" in 2019]:
• The PSIX dialect is currently used only in the Perl 6 programming language (now known as "Raku"), but is significant because it is the first major regex dialect to break away from the deep family resemblances shared by all other major dialects.
• PSIX (aka "Raku dialect") was designed to make better use of the limited number of available metacharacters (i.e. punctuation and symbols).
• The new dialect attempts to use metacharacters more consistently and predictably, to distribute them more “fairly” (i.e. shorter metasyntaxes for more frequently used constructs), and thereby to enhance the overall readability of its regexes.
Above quoted from "Everything You Know About Regexes Is Wrong" https://slides.yowconference.com/yowwest2015/Conway-EverythingYouKnowAboutRegexesIsWrong.pdf Copyright © Thoughtstream Pty Ltd, 2014-2015 http://damian.conway.org
Also see: https://docs.raku.org/language/regexes https://docs.raku.org/language/regexes-best-practices https://raku.org/
@stephane-archer
I am currently working on that and already have a prototype packaged into an app: you pass it a PCRE-compliant regex and get back a regular expression that's compatible with Vim's new NFA regex engine. More tests are needed, but it seems to work. Feel free to drop in and give insights or help out on it.
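To make the idea of PCRE-to-Vim translation concrete, here is a minimal sketch in Python. It is not the prototype mentioned above; the function name `pcre_to_vim` and the handled subset are assumptions chosen for illustration. It emits a Vim "very magic" (`\v`) pattern, rewrites non-capturing groups and lazy quantifiers, escapes characters that are literal in PCRE but special under `\v`, and rejects anything it does not know how to translate.

```python
# Hypothetical helper for illustration only: translates a small subset of
# PCRE syntax into a Vim "very magic" (\v) pattern.

# Characters that are literal in PCRE but special in Vim under \v.
VIM_ONLY_SPECIAL = set("=<>@%&~")

# PCRE lazy quantifiers and their Vim \v counterparts.
LAZY = {"*?": "{-}", "+?": "{-1,}", "??": "{-,1}"}

# Alphanumeric escapes assumed to mean the same thing in both dialects.
SAME_ESCAPES = set("dDwW")


def pcre_to_vim(pattern: str) -> str:
    out = ["\\v"]
    i, n = 0, len(pattern)
    while i < n:
        ch = pattern[i]
        if ch == "\\":
            if i + 1 >= n:
                raise ValueError("trailing backslash")
            nxt = pattern[i + 1]
            # Escaped punctuation stays literal under \v; other escapes
            # (e.g. \b, \A) differ between dialects and are rejected.
            if nxt.isalnum() and nxt not in SAME_ESCAPES:
                raise ValueError(f"unsupported escape: \\{nxt}")
            out.append(pattern[i:i + 2])
            i += 2
        elif ch == "[":
            # Copy simple bracket expressions verbatim; classes containing
            # backslash escapes are outside this sketch.
            j = pattern.find("]", i + 1)
            if j == -1 or "\\" in pattern[i:j]:
                raise ValueError("unsupported character class")
            out.append(pattern[i:j + 1])
            i = j + 1
        elif pattern.startswith("(?:", i):
            out.append("%(")  # non-capturing group
            i += 3
        elif pattern.startswith("(?", i):
            raise ValueError("unsupported group syntax")
        elif pattern[i:i + 2] in LAZY:
            out.append(LAZY[pattern[i:i + 2]])
            i += 2
        elif ch in VIM_ONLY_SPECIAL:
            out.append("\\" + ch)
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(pcre_to_vim(r"(?:foo|bar)+?=\d+"))
```

Lookarounds, possessive quantifiers, lazy counted repetition such as `{2,3}?`, and word boundaries are deliberately left out; a real translator would need a proper parser rather than a left-to-right scan.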
Can this Issue be merged into #3208 ?
FYI, the PDF linked above has a list of popular Regex Engines, in historical order:
0 Designator | Full name | Dialect used in...
1 BRE | POSIX 1003.2 (section 2.8) basic regular expressions | ed, sed, grep,
2 ERE | GNU extended regular expressions | egrep, gawk, Notepad++, vile, Tcl (with extra extensions)
3 EMACS | Emacs/Elisp regular expressions | emacs
4 VIM | Vim/Vimscript regular expressions | vim
5 PCRE | Perl-compatible regular expressions | The PCRE library, the .NET runtime, Apache, BBedit, C#, Delphi, Java (subset), JavaScript (subset), PHP, Perl 5 (with extra extensions), PowerShell, Python, R, Ruby, SAS, TextMate, Ultraedit, VB.NET
6 PSIX (RAKU) | Perl 6 Regular Expressions | Perl 6 (renamed Raku in 2019)
https://slides.yowconference.com/yowwest2015/Conway-EverythingYouKnowAboutRegexesIsWrong.pdf
I'd like to cut-to-the-chase here and suggest that Neovim adopt a "FUTURE-PROOF" strategy for selecting Regex Engines:
Vim engine. This would slot into RX number 0, so that particular Regex Engine is used by default (typically Vim Regex Engine, a.k.a. RX4).
RX4 (Vim) as their default Regex Engine, or RX5 (PCRE) as their default Regex Engine.
RX5, because PCRE is listed as the 5th Regex Engine in the list above. This can be done on-the-fly (i.e. the default engine does not change).
RX5", providing an easy mnemonic for Perl(5) users, and the Perl6 (a.k.a. Raku) Engine slots in at "RX6", providing an easy mnemonic for Perl6 users.
I believe this proposal adds a "compatibility layer" that could make it easier for End Users to adapt a wide-variety of code to the Neovim ecosystem. For example, we might be able to get to the point where End Users could use RX1 to run grep regexes, RX2 to run gawk regexes, RX3 to run Emacs regexes, etc.
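As a rough illustration of the numbering scheme, here is a small Python sketch of a slot registry. The names `ENGINES`, `DEFAULT_SLOT`, and `resolve_engine` are hypothetical and nothing like this exists in Neovim; the slot numbers follow the table above, and slot 0 is treated as an alias for whichever engine is the current default.

```python
# Hypothetical registry for illustration; slot numbers follow the table above.
ENGINES = {
    1: "BRE",
    2: "ERE",
    3: "EMACS",
    4: "VIM",
    5: "PCRE",
    6: "PSIX",
}

# Assumption: Vim's own engine (slot 4) is the default unless changed.
DEFAULT_SLOT = 4


def resolve_engine(name: str, default_slot: int = DEFAULT_SLOT) -> str:
    """Map a designator such as 'RX5' to an engine name; 'RX0' means the default."""
    if not name.upper().startswith("RX") or not name[2:].isdigit():
        raise ValueError(f"not an RX designator: {name!r}")
    slot = int(name[2:])
    if slot == 0:
        slot = default_slot
    try:
        return ENGINES[slot]
    except KeyError:
        raise ValueError(f"no engine registered in slot {slot}") from None


print(resolve_engine("RX5"))                  # PCRE
print(resolve_engine("RX0"))                  # VIM
print(resolve_engine("RX0", default_slot=5))  # PCRE
```

Adding a new engine later would just mean registering another slot number, without renumbering the existing ones.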
Thoughts, criticisms? Thank you.
make it easier for End Users to adapt a wide-variety of code to the Neovim ecosystem
Making it "easy" is not in scope until it becomes clear that the cost is worth it. Making it possible is already proposed in https://github.com/neovim/neovim/issues/3208 , via transpilation.
Making it "easy" is not in scope until it becomes clear that the cost is worth it.
Absolutely! However, I've often wondered if some End Users shied away from adopting Vim as their daily driver because they didn't want to learn the Vim regex engine (for example, heavy users of command-line grep, egrep, gawk, etc.). Regardless, I'll review other syntax proposals along this line (RX1, RX2, etc.) in #3208, and/or #14442 (closed).
Making it possible is already proposed in https://github.com/neovim/neovim/issues/3208 , via transpilation.
Respectfully submitted: it's not clear to me that #3208 is proposing more than enabling End Users to switch to PCRE as a default regex engine (an important proposal). Again, something I'll have to research... but I hoped the "numbering" proposal above would open a dialog on this subject. [ It is not clear if I'm able to contribute to the discussion at #3208 since the Issue is restricted/locked ].
Vim has great support for Perl, and Neovim has done a fantastic job streamlining the language API (for multiple different languages). I have a personal interest in this topic, since I've posted over 500 Raku one-liners at U&L StackExchange. While I normally test/run those Raku regex answers at the Vim command line, being able to set Raku as a default "Regex Engine" in Neovim would be a huge selling point (for myself, and possibly other Raku users):
https://unix.stackexchange.com/users/227738/jubilatious1
Regarding the "numbering" proposal: if another Regex Engine comes along that End Users favor, it could be added to the list. So for example: adding RX7 to denote Oniguruma (https://oniguruma.org/oniguruma.c/en/), adding RX8 to denote V8 (https://v8.dev/blog/non-backtracking-regexp), etc.
None of this has to be built all-at-once, but having a framework in place might attract Pull Requests.
Thank you for your time and attention.
I appreciate your enthusiasm, but I really don't think the regex flavor of all things is what keeps people away from Vim -- if anything, it's the complexity of configuration and documentation, which this would increase significantly!
So unless a really compelling, actual(!) use case is presented, this is not something we want to pursue.
being able to set Raku as a default "Regex Engine" in Neovim
Better support for all kinds of REPL interfaces is definitely something that is in scope. Regex engines would fit there. But I don't see why the regex engine needs to be swappable for /, :global, :substitute.
:help :command-preview allows plugins to extend the cmdline, so a plugin can provide :RakuGlobal, :RakuSubstitute, etc.
Feature already in Vim?
No response
Feature description
Neovim uses yet another flavor of regular expressions that doesn't look anything like extended or Perl regexps. It even has a "very magic" variant.
What a pain to learn and memorize what ripgrep, sed, awk, and Neovim each use. Things would be better if they all used Perl regexps (or anything they agree on).
Is there any valid reason (except backward compatibility) to use Vim regexps over the Perl version?