universal-ctags / ctags

A maintained ctags implementation
https://ctags.io
GNU General Public License v2.0

lregex: per language option to use pcre2 as default regex engine #3181

Open masatake opened 3 years ago

masatake commented 3 years ago

When thinking about performance, a user may want to use pcre2 as the default engine. See #1861.

Adding {pcre2} to the existing parsers defined in optlib/*.ctags is not acceptable: those parsers would suddenly stop working wherever pcre2 is not available. So I'm thinking about adding a per-language option that forces the use of pcre2.

--use-pcre2-<LANG> option

If you can think of a better name for the option, let me know.

There is a risk that a parser forced to use pcre2 doesn't work as expected because of differences between the engines. We must document this risk in the man page when the option is introduced.
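
For reference, the per-pattern opt-in that this proposal wants to avoid editing into the optlib files can be sketched as follows. This is only an illustration: the language name `Demo`, the file extension, and the pattern are hypothetical and do not come from any parser in optlib/. The check at the end assumes a ctags binary that supports `--list-features`.

```shell
# Hypothetical optlib definition; "Demo" and the pattern are illustrative only.
optlib=$(mktemp)
cat > "$optlib" <<'EOF'
--langdef=Demo
--map-Demo=+.demo
--kinddef-Demo=f,function,functions
--regex-Demo=/^def[ \t]+([A-Za-z_][A-Za-z0-9_]*)/\1/f/{pcre2}
EOF

# The {pcre2} flag only works when the binary was built with pcre2,
# so check the feature list before relying on such a definition.
if command -v ctags >/dev/null 2>&1 \
   && ctags --list-features 2>/dev/null | grep -q pcre2; then
    pcre2_available=yes
else
    pcre2_available=no
fi
echo "pcre2 available: $pcre2_available"
```

Because the flag is attached to each pattern, a definition like this one is tied to pcre2-enabled builds, which is the portability problem described above.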

masatake commented 3 years ago

The per-language option may be over-engineering. A single --use-pcre2 option is enough.

masatake commented 3 years ago

As I feared, I could not implement the --use-pcre2 option in a clean way. The order of command-line parsing conflicts with lazy parser initialization. This area has been a hotbed of critical bugs.

masatake commented 3 years ago

Using pcre2 is about 1.5 times faster on this CMake benchmark (0.123s vs. 0.080s real time). I didn't compare the contents of the tags files, but the number of tags extracted by the two regex engines is the same (3160).

$ CTAGS_EXE=/home/jet/var/ctags-github/ctags ./codebase ctags CMake          
version: 8f924bac
features: +wildcards +regex +gnulib_regex +iconv +option-directory +xpath +json +interactive +sandbox +yaml +packcc +optscript +pcre2
log: results/8f924bac,CMake...............,..........,time......,default...,2021-10-31-05:08:12.log
tagsoutput: /dev/null
cmdline: + /home/jet/var/ctags-github/ctags --quiet --options=NONE --sort=no --options=profile.d/maps --totals=yes --languages=CMake -o - -R code/opencv
197 files, 23765 lines (922 kB) scanned in 0.1 seconds (7863 kB/s)
3160 tags added to tag file

real    0m0.123s
user    0m0.107s
sys     0m0.016s
+ set +x
$ CTAGS_EXE=/home/jet/var/ctags-github/ctags ./codebase ctags CMake pcre2
version: 8f924bac
features: +wildcards +regex +gnulib_regex +iconv +option-directory +xpath +json +interactive +sandbox +yaml +packcc +optscript +pcre2
log: results/8f924bac,CMake...............,pcre2.....,time......,default...,2021-10-31-05:08:15.log
tagsoutput: /dev/null
cmdline: + /home/jet/var/ctags-github/ctags --quiet --options=NONE --sort=no --options=profile.d/maps --options=profile.d/pcre2.ctags --totals=yes --languages=CMake -o - -R code/opencv
197 files, 23765 lines (922 kB) scanned in 0.1 seconds (12378 kB/s)
3160 tags added to tag file

real    0m0.080s
user    0m0.067s
sys     0m0.012s
+ set +x
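
The "about 1.5 times" figure can be checked against the two runs above. The snippet below is a small sketch that only recomputes ratios from the numbers printed in the logs. It uses a single run per engine, so it is a rough indication and not a proper benchmark.

```shell
# Timings and throughput copied from the two runs above.
real_default=0.123; real_pcre2=0.080   # "real" wall-clock seconds
kbps_default=7863;  kbps_pcre2=12378   # kB/s reported by --totals=yes

# awk does the floating-point division; plain sh arithmetic is integer-only.
speedup_time=$(awk -v a="$real_default" -v b="$real_pcre2" 'BEGIN { printf "%.2f", a / b }')
speedup_rate=$(awk -v a="$kbps_pcre2" -v b="$kbps_default" 'BEGIN { printf "%.2f", a / b }')

echo "wall-clock speedup: ${speedup_time}x"
echo "throughput ratio:   ${speedup_rate}x"
```

Both ratios come out slightly above 1.5 (roughly 1.54 and 1.57), which is consistent with the estimate given in the comment.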