Speed lost on /^(foo|bar|baz)$/ match

p5pRT commented 16 years ago

Migrated from rt.perl.org#58280 (status was 'open')

Searchable as RT58280$

p5pRT commented 16 years ago

From @schwern

Created by @schwern

I was testing out an efficiency claim and discovered that 5.10 and bleadperl have both lost a lot of performance on /^(foo|bar|baz)$/ type regexes.

#!/usr/bin/perl -w

use strict; use warnings;

use Benchmark qw(cmpthese);

my $token = "open";

cmpthese(shift || -3\, { regex => sub { $token =~ m/\A (?: open | close | read ) \z/xms; }\, regex_opt => sub { $token =~ m/^(?:open|close|read)$/; }\, eq => sub { $token eq 'open' || $token eq 'close' || $token eq 'read'; } });

$ bleadperl ~/tmp/bug.plx 5000000 Rate regex_opt regex eq regex_opt 1222494/s -- -5% -85% regex 1282051/s 5% -- -84% eq 8064516/s 560% 529% --

$ perl5.8.8 ~/tmp/bug.plx 5000000 Rate regex_opt regex eq regex_opt 3048780/s -- -1% -54% regex 3086420/s 1% -- -53% eq 6578947/s 116% 113% --

$ perl5.10.0 ~/tmp/bug.plx 5000000 Rate regex regex_opt eq regex 1492537/s -- -0% -89% regex_opt 1497006/s 0% -- -89% eq 13157895/s 782% 779% --

Can anyone duplicate?

Perl Info

``` Flags: category=core severity=low Site configuration information for perl 5.11.0: Configured by schwern at Fri Aug 22 15:15:47 PDT 2008. Summary of my perl5 (revision 5 version 11 subversion 0 patch 34218) configuration: Platform: osname=darwin, osvers=8.11.1, archname=darwin-thread-multi-2level uname='darwin windhund.schwern.org 8.11.1 darwin kernel version 8.11.1: wed oct 10 18:23:28 pdt 2007; root:xnu-792.25.20~1release_i386 i386 i386 macbook1,1 darwin ' config_args='-de -Dprefix=/usr/local/perl/blead -Dusedevel -Duseithreads -Dccflags=-I/sw/include -Dldflags=-L/sw/lib -Dperladmin=schwern@pobox.com -Dcf_email=schwern@pobox.com -Dmyhostname=windhund -Dmydomain=.schwern.org -Dlibpth=/usr/local/lib /sw/lib /opt/local/lib /usr/lib -Uversiononly -Uinstallusrbinperl' hint=recommended, useposix=true, d_sigaction=define useithreads=define, usemultiplicity=define useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef use64bitint=undef, use64bitall=undef, uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='cc', ccflags ='-I/sw/include -fno-common -DPERL_DARWIN -no-cpp-precomp -fno-strict-aliasing -pipe -I/usr/local/include -I/opt/local/include', optimize='-O3', cppflags='-no-cpp-precomp -I/sw/include -fno-common -DPERL_DARWIN -no-cpp-precomp -fno-strict-aliasing -pipe -I/usr/local/include -I/opt/local/include' ccversion='', gccversion='4.0.1 (Apple Computer, Inc. build 5370)', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16 ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=8, prototype=define Linker and Libraries: ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc', ldflags ='-L/sw/lib -L/usr/local/lib -L/opt/local/lib' libpth=/usr/local/lib /sw/lib /opt/local/lib /usr/lib libs=-lgdbm -ldbm -ldb -ldl -lm -lc perllibs=-ldl -lm -lc libc=/usr/lib/libc.dylib, so=dylib, useshrplib=false, libperl=libperl.a gnulibc_version='' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' ' cccdlflags=' ', lddlflags='-L/sw/lib -bundle -undefined dynamic_lookup -L/usr/local/lib -L/opt/local/lib' Locally applied patches: DEVEL @INC for perl 5.11.0: /sw/lib/perl5/darwin-thread-multi-2level /sw/lib/perl5 /sw/lib/perl5/darwin /usr/local/perl/blead/lib/5.11.0/darwin-thread-multi-2level /usr/local/perl/blead/lib/5.11.0 /usr/local/perl/blead/lib/site_perl/5.11.0/darwin-thread-multi-2level /usr/local/perl/blead/lib/site_perl/5.11.0 /usr/local/perl/blead/lib/site_perl/5.10.0 /usr/local/perl/blead/lib/site_perl . Environment for perl 5.11.0: DYLD_LIBRARY_PATH (unset) HOME=/Users/schwern LANG (unset) LANGUAGE (unset) LC_CTYPE=en_US.UTF-8 LD_LIBRARY_PATH (unset) LOGDIR (unset) PATH=/Users/schwern/bin:/usr/local/bin:/opt/local/bin:/opt/local/sbin:/sw/bin:/sw/sbin:/Users/schwern/bin:/usr/local/bin:/usr/local/bin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/X11R6/bin PERL5LIB=/sw/lib/perl5:/sw/lib/perl5/darwin PERL_BADLANG (unset) SHELL=/bin/bash ```

p5pRT commented 16 years ago

From @smpeters

On Fri Aug 22 15:28:29 2008\, schwern wrote:

This is a bug report for perl from schwern@pobox.com\, generated with the help of perlbug 1.39 running under perl 5.11.0.

----------------------------------------------------------------- [Please describe your issue here]

I was testing out an efficiency claim and discovered that 5.10 and bleadperl have both lost a lot of performance on /^(foo|bar|baz)$/ type regexes.

#!/usr/bin/perl -w

use strict; use warnings;

use Benchmark qw(cmpthese);

my $token = "open";

cmpthese(shift || -3\, { regex => sub { $token =~ m/\A (?: open | close | read ) \z/xms; }\, regex_opt => sub { $token =~ m/^(?:open|close|read)$/; }\, eq => sub { $token eq 'open' || $token eq 'close' || $token eq 'read'; } });

$ bleadperl ~/tmp/bug.plx 5000000 Rate regex_opt regex eq regex_opt 1222494/s -- -5% -85% regex 1282051/s 5% -- -84% eq 8064516/s 560% 529% --

$ perl5.8.8 ~/tmp/bug.plx 5000000 Rate regex_opt regex eq regex_opt 3048780/s -- -1% -54% regex 3086420/s 1% -- -53% eq 6578947/s 116% 113% --

$ perl5.10.0 ~/tmp/bug.plx 5000000 Rate regex regex_opt eq regex 1492537/s -- -0% -89% regex_opt 1497006/s 0% -- -89% eq 13157895/s 782% 779% --

Can anyone duplicate?

Yes\, I can duplicate. The first is 5.8.8. The second is bleadperl.

Steve Peters steve@fisharerojo.org

steve@picard:\~/perl-current$ perl bench.pl 5000000 Rate regex_opt regex eq regex_opt 1886792/s -- -0% -61% regex 1893939/s 0% -- -61% eq 4807692/s 155% 154% -- steve@picard:\~/perl-current$ ./perl -Ilib bench.pl Rate regex_opt regex eq regex_opt 1016789/s -- -2% -85% regex 1035642/s 2% -- -84% eq 6569214/s 546% 534% --

p5pRT commented 16 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 16 years ago

From nospam-abuse@bloodgate.com

Moin\,

On Wednesday 27 August 2008 05:16:12 Steve Peters via RT wrote:

On Fri Aug 22 15:28:29 2008\, schwern wrote:

Can anyone duplicate?

Yes\, I can duplicate. The first is 5.8.8. The second is bleadperl.

What I find interesting is that the "eq" case got faster :)

All the best\,

Tels

-- Signed on Wed Aug 27 19:13:34 2008 with key 0x93B84C15. Get one of my photo posters: http://bloodgate.com/posters PGP key on http://bloodgate.com/tels.asc or per email.

"Never offend people with style when you can offend them with substance."

-- Sam Brown

p5pRT commented 15 years ago

From @demerphq

Given the pattern involved at first look it seems this is related to the TRIE functionality.

With perl 5.8:

demerphq@dromedary:blead:\~/perl$ perl bug58280.pl Rate regex regex_opt eq regex 2254593/s -- -3% -60% regex_opt 2334045/s 4% -- -58% eq 5618599/s 149% 141% --

With blead:

demerphq@dromedary:blead:\~/perl$ ./perl -Ilib bug58280.pl Rate regex regex_opt eq regex 705456/s -- -1% -86% regex_opt 709993/s 1% -- -86% eq 5202675/s 637% 633% --

This is with the trie optimisation disabled: demerphq@dromedary:blead:\~/perl$ ./perl -Ilib bug58280.pl Rate regex regex_opt eq regex 1260232/s -- -1% -76% regex_opt 1275183/s 1% -- -76% eq 5272268/s 318% 313% --

From this I assume that the majority of the slowdown comes from the setup time for doing a match using the new non-recursive process and not from the TRIE.

So then what happens when we change the token being matched? After all the benchmark is:

'open'=~/^(open|close|read)$/

Which is a benchmark which is virtually designed to make the old alternation implementation look good. So what happens when we switch it to "read"?

With 5.8 we see the expected slowdown:

demerphq@dromedary:blead:\~/perl$ perl bug58280.pl Rate regex regex_opt eq regex 1846732/s -- -11% -23% regex_opt 2071288/s 12% -- -14% eq 2406042/s 30% 16% --

With blead we see the expected unchanged performance of a trie:

demerphq@dromedary:blead:\~/perl$ ./perl -Ilib bug58280.pl Rate regex regex_opt eq regex 723692/s -- -1% -62% regex_opt 730718/s 1% -- -62% eq 1904342/s 163% 161% --

So now\, if we add a bunch more terms to the test:

#!/usr/bin/perl -w

use strict; use warnings;

use Benchmark qw(cmpthese);

my $token = "read";

cmpthese(shift || -3\, { regex => sub { $token =~ m/\A (?: open | close | foo | bar | baz | bop | dizzy | blitzen | rocker | mod | punk | read ) \z/xms; }\, regex_opt => sub { $token =~ m/^(?:open|close|foo|bar|baz|bop|dizzy|blitzen|rocker|mod|punk|read)$/; }\, 'eq' => sub { $token eq 'open' || $token eq 'close' || $token eq 'foo' || $token eq 'bar' || $token eq 'baz' || $token eq 'bop' || $token eq 'dizzy' || $token eq 'blitzen' || $token eq 'rocker' || $token eq 'mod' || $token eq 'punk' || $token eq 'read'; } });

We see perl 5.8's performance continue to fall off:

demerphq@dromedary:blead:\~/perl$ perl bug58280.pl Rate eq regex regex_opt eq 874033/s -- -35% -35% regex 1348988/s 54% -- -0% regex_opt 1353136/s 55% 0% --

And we see perl 5.10's performance again stay more or less static:

demerphq@dromedary:blead:\~/perl$ ./perl -Ilib bug58280.pl Rate eq regex_opt regex eq 601754/s -- -14% -14% regex_opt 696555/s 16% -- -1% regex 703210/s 17% 1% --

Also\, the performance difference of the 'eq' cases suggests that perl as a whole is a bit slower in perl5.10\, this is nothing new afaik\, perl has been getting somewhat slower with each release.

Anyway\, my conclusion is that this isnt really a bug. Its a place where the changes in the regex engine result in a loss of speed\, but it is cherry picked to do so. Sure we could probably do some work to make this class of pattern not use the trie logic because the number of options are so low\, and because the pattern is fully anchored\, but that isn't going to save much. The end result will afaict still be half as fast simply because setting up the non-recursive engine is more expensive than setting up the recursive one. But of course this comes at a trade off\, the new engine wont SEGV\, and the new engine won't see a linear falloff in performance due to large alternations. So we trade speed for stability and predictable performance. Its hard to say that is a bug.

I view this more as a notice that we could spend some time making regex startup less expensive\, if we can do so without losing features. The TRIE aspect IMO in this case is a lesser concern. Although i admit it could be optimised. The compressed scheme we use is massive overkill for the type of pattern we have in this bug. It is designed with huge transition tables in mind\, not the more common smaller ones that would come from the pattern in this bug.

Yves

p5pRT commented 15 years ago

From @schwern

FWIW it got worse. perl5.10.1 here is the latest maint5.10.

$ time perl5.10.0 ~/tmp/bench.plx 5000000 Rate regex regex_opt eq regex 1366120/s -- -0% -86% regex_opt 1369863/s 0% -- -86% eq 9615385/s 604% 602% --

real 0m21.322s user 0m20.494s sys 0m0.176s

$ time perl5.10.1 ~/tmp/bench.plx 5000000 Rate regex_opt regex eq regex_opt 1075269/s -- -6% -85% regex 1146789/s 7% -- -84% eq 7246377/s 574% 532% --

real 0m23.734s user 0m23.284s sys 0m0.188s

Even so\, I don't think this should be holding up 5.10.1.

p5pRT commented 14 years ago

From @obra

On Tue Jul 07 19:34:55 2009\, schwern wrote:

FWIW it got worse. perl5.10.1 here is the latest maint5.10.

$ time perl5.10.0 ~/tmp/bench.plx 5000000 Rate regex regex_opt eq regex 1366120/s -- -0% -86% regex_opt 1369863/s 0% -- -86% eq 9615385/s 604% 602% --

real 0m21.322s user 0m20.494s sys 0m0.176s

$ time perl5.10.1 ~/tmp/bench.plx 5000000 Rate regex_opt regex eq regex_opt 1075269/s -- -6% -85% regex 1146789/s 7% -- -84% eq 7246377/s 574% 532% --

real 0m23.734s user 0m23.284s sys 0m0.188s

Even so\, I don't think this should be holding up 5.10.1.

I'm removing this performance issue as a 5.12 blocker as it's not a regression from 5.10 and yves makes a reasonable NOTABUG argument in the ticket history

p5pRT commented 12 years ago

From @jkeenan

On Tue Jul 07 19:34:55 2009\, schwern wrote:

FWIW it got worse. perl5.10.1 here is the latest maint5.10.

FWIW\, here are some comparisons:

Linux/i386:

$ /usr/bin/perl 58280.pl 5000000 Perl version: 5.010000 Rate regex regex_opt eq regex 1689189/s -- -7% -75% regex_opt 1818182/s 8% -- -73% eq 6666667/s 295% 267% --

$ /usr/local/bin/perl5.12.0 58280.pl 5000000 Perl version: 5.012000 Rate regex regex_opt eq regex 1824818/s -- -1% -85% regex_opt 1845018/s 1% -- -85% eq 12195122/s 568% 561% --

$ perl 58280.pl 5000000 Perl version: 5.014000 Rate regex_opt regex eq regex_opt 2793296/s -- -6% -77% regex 2976190/s 7% -- -75% eq 11904762/s 326% 300% --

Darwin/PPC:

$ /usr/bin/perl 58280.pl 5000000 Perl version: 5.008006 Rate regex regex_opt eq regex 1059322/s -- -15% -65% regex_opt 1250000/s 18% -- -59% eq 3067485/s 190% 145% --

$ /usr/local/bin/perl5.12.0 58280.pl 5000000 Perl version: 5.012000 Rate regex regex_opt eq regex 314663/s -- -4% -86% regex_opt 326797/s 4% -- -85% eq 2232143/s 609% 583% --

$ perl 58280.pl 5000000 Perl version: 5.014002 Rate regex_opt regex eq regex_opt 677507/s -- -4% -86% regex 706215/s 4% -- -85% eq 4672897/s 590% 562% --

It appears that things improved on both OSes between 5.12 and 5.14.

Does this ticket have to remain open?

Thank you very much. Jim Keenan

p5pRT commented 12 years ago

From @iabyn

On Tue\, May 01\, 2012 at 06:22:00PM -0700\, James E Keenan via RT wrote:

On Tue Jul 07 19:34:55 2009\, schwern wrote:

FWIW it got worse. perl5.10.1 here is the latest maint5.10. FWIW\, here are some comparisons: [snip] It appears that things improved on both OSes between 5.12 and 5.14.

Does this ticket have to remain open?

It shows that things have got better\, but still not as good as under 5.8.x. I think it would be worth doing some profiling to see if there's anything further that can be tweaked\, or whether its just the inevitable slowdown due to improved utf8 handling\, trie code etc etc.

In the meantime I suggest keeping the ticket open.

-- Music lesson: a symbiotic relationship whereby a pupil's embellishments concerning the amount of practice performed since the last lesson are rewarded with embellishments from the teacher concerning the pupil's progress over the corresponding period.

p5pRT commented 10 years ago

From @khwilliamson

Here are the 5.19.10+ blead numbers on a Linux box:

Rate regex regex_opt eq regex 7650757/s -- -1% -84% regex_opt 7717895/s 1% -- -84% eq 48383994/s 532% 527% --

And if we change $token to 'read'\, we get

Rate regex regex_opt eq regex 7748530/s -- -2% -62% regex_opt 7899967/s 2% -- -61% eq 20146179/s 160% 155% --

-- Karl Williamson

p5pRT commented 10 years ago

From [Unknown Contact. See original ticket]

Here are the 5.19.10+ blead numbers on a Linux box:

Rate regex regex_opt eq regex 7650757/s -- -1% -84% regex_opt 7717895/s 1% -- -84% eq 48383994/s 532% 527% --

And if we change $token to 'read'\, we get

Rate regex regex_opt eq regex 7748530/s -- -2% -62% regex_opt 7899967/s 2% -- -61% eq 20146179/s 160% 155% --

-- Karl Williamson

p5pRT commented 10 years ago

From @iabyn

On Sat\, Mar 29\, 2014 at 08:29:46PM -0700\, Karl Williamson via RT wrote:

Here are the 5.19.10+ blead numbers on a Linux box:
            Rate     regex regex\_opt        eq
regex 7650757/s -- -1% -84% regex_opt 7717895/s 1% -- -84% eq 48383994/s 532% 527% --

And if we change $token to 'read'\, we get
            Rate     regex regex\_opt        eq
regex 7748530/s -- -2% -62% regex_opt 7899967/s 2% -- -61% eq 20146179/s 160% 155% --

I think the important thing is how much slower the trie optimisation is for simple alternations that wouldn't benefit. Comparing a range of perl versions on the same hardware\, I get for the original bug report code (where $token matches the first alternation):

5.008009 regex 10526316/s 5.010001 regex 3676471/s 5.012005 regex 3367003/s 5.014004 regex 5665722/s 5.016003 regex 6756757/s 5.018002 regex 5602241/s 5.019010 regex 6211180/s

And here's the same with TRIEs disabled (${^RE_TRIE_MAXBUF} = -1):

5.008009 regex 10309278/s 5.010001 regex 7017544/s 5.012005 regex 6993007/s 5.014004 regex 6920415/s 5.016003 regex 8130081/s 5.018002 regex 6451613/s 5.019010 regex 6644518/s

and with TRIEs enabled and $token as 'read'\, i.e. $token matches the third alternation:

5.008009 regex 8438819/s 5.010001 regex 3809524/s 5.012005 regex 3344482/s 5.014004 regex 5917160/s 5.016003 regex 6711409/s 5.018002 regex 5524862/s 5.019010 regex 6097561/s

and with TRIEs enabled and with Yves modification\, i.e. $token matches the 12th alternation:

5.008009 regex 4807692/s 5.010001 regex 3724395/s 5.012005 regex 3367003/s 5.014004 regex 5952381/s 5.016003 regex 6688963/s 5.018002 regex 5555556/s 5.019010 regex 6042296/s

What I conclude from the above is that:

* as with all such micro benchmarks\, a certain amount of the variation is just due to the luck of the compiler\, e.g. instruction cache alignment; * simple alternations slow down in proportion to the number of alternatives that are tried (as you would expect); * TRIEs match in roughly constant time regardless of which alternation * matches (as you would expect); * even non-TRIE alternations are slower in later perls\, most likely due to all the extra overhead we've accumulated\, such as utf8 correctness\, not blowing the stack on recursion etc. * TRIEs have got faster in 5.14; * But TRIEs still have a bigger overhead than plain alts\, so it may be worth looking at * profiling to see if the overhead can be reduced; * seeing whether there's a cut-off point where it's not worth building a TRIE\, e.g. if there are less than N alternatives; and there might be other factors such as lengths of the alt strings\, whether its utf8 etc.

-- The warp engines start playing up a bit\, but seem to sort themselves out after a while without any intervention from boy genius Wesley Crusher. -- Things That Never Happen in "Star Trek" #17

p5pRT commented 10 years ago

From @demerphq

On 1 April 2014 17:40\, Dave Mitchell \davem@iabyn\.com wrote:

On Sat\, Mar 29\, 2014 at 08:29:46PM -0700\, Karl Williamson via RT wrote:
Here are the 5.19.10+ blead numbers on a Linux box:
            Rate     regex regex\_opt        eq
regex 7650757/s -- -1% -84% regex_opt 7717895/s 1% -- -84% eq 48383994/s 532% 527% --

And if we change $token to 'read'\, we get
            Rate     regex regex\_opt        eq
regex 7748530/s -- -2% -62% regex_opt 7899967/s 2% -- -61% eq 20146179/s 160% 155% --
I think the important thing is how much slower the trie optimisation is for simple alternations that wouldn't benefit. Comparing a range of perl versions on the same hardware\, I get for the original bug report code (where $token matches the first alternation):
5\.008009 regex 10526316/s
5\.010001 regex  3676471/s
5\.012005 regex  3367003/s
5\.014004 regex  5665722/s
5\.016003 regex  6756757/s
5\.018002 regex  5602241/s
5\.019010 regex  6211180/s
And here's the same with TRIEs disabled (${^RE_TRIE_MAXBUF} = -1):
5\.008009 regex 10309278/s
5\.010001 regex  7017544/s
5\.012005 regex  6993007/s
5\.014004 regex  6920415/s
5\.016003 regex  8130081/s
5\.018002 regex  6451613/s
5\.019010 regex  6644518/s
and with TRIEs enabled and $token as 'read'\, i.e. $token matches the third alternation:
5\.008009 regex  8438819/s
5\.010001 regex  3809524/s
5\.012005 regex  3344482/s
5\.014004 regex  5917160/s
5\.016003 regex  6711409/s
5\.018002 regex  5524862/s
5\.019010 regex  6097561/s
and with TRIEs enabled and with Yves modification\, i.e. $token matches the 12th alternation:
5\.008009 regex  4807692/s
5\.010001 regex  3724395/s
5\.012005 regex  3367003/s
5\.014004 regex  5952381/s
5\.016003 regex  6688963/s
5\.018002 regex  5555556/s
5\.019010 regex  6042296/s
What I conclude from the above is that:

Pretty much the same thing I concluded too. :-)

* as with all such micro benchmarks\, a certain amount of the variation is just due to the luck of the compiler\, e.g. instruction cache alignment; * simple alternations slow down in proportion to the number of alternatives that are tried (as you would expect); * TRIEs match in roughly constant time regardless of which alternation * matches (as you would expect); * even non-TRIE alternations are slower in later perls\, most likely due to all the extra overhead we've accumulated\, such as utf8 correctness\, not blowing the stack on recursion etc. * TRIEs have got faster in 5.14; * But TRIEs still have a bigger overhead than plain alts\, so it may be worth looking at * profiling to see if the overhead can be reduced;

It definitely can be. Unusually we trade speed for space in the trie algorithm. Something I now regret. Instead of supporting super large alphabets and running slow we should have just ignored large alphabets and run fast.

If I get time I will look at changing the trie/aho code to use the fast representation. I dont have many tuits however.

* seeing whether there's a cut-off point where it's not worth building a TRIE\, e.g. if there are less than N alternatives; and there might be other factors such as lengths of the alt strings\, whether its utf8 etc.

I also agree here.

Yves

khwilliamson commented 2 years ago

@demerphq, ping

demerphq commented 1 year ago

I am not sure what to do about this. We use a compressed state transition table that means each transition is more expensive than it could be, but which gracefully handles large alphabet patterns. We actually have two construction methods however, one using a fixed width uncompressed "standard" table, and one using a more complex list representation for large alphabets, at the end of construction we convert both of those formats into a final compressed format.

The idea behind the compressed format is from the "red dragon book". You can see it in action below.

   ./perl -Ilib -Mre=Debug,TRIE,TRIEC,TRIEM -e'qr/(these|are|different|words|that|I|know|and|write|knotted|sentences|from|for|serious|fun)/'
Compiling REx "(these|are|different|words|that|I|know|and|write|knotted|sen"...
  Looking for TRIE'able sequences. Tail node is  57:CLOSE1
  ...
    make_trie start==3, first==3, last==57, tail==57 depth=1
    TRIE(NATIVE): W:15 C:72 Uq:17 Min:1 Max:9
    Compiling trie using table compiler
      Char :    t   h   e   s   a   r   d   i   f   n   w   o   I   k   c   m   u
      State+---------------------------------------------------------------------
         1 :    2   .   .  29   7   .   A   .  32   .  13   .  1A  1B   .   .   . (   8)
         2 :    .   3   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
         3 :    .   .   4   .  18   .   .   .   .   .   .   .   .   .   .   .   . (   2)
         4 :    .   .   .   5   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
         5 :    .   .   6   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
         6 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   1
         7 :    .   .   .   .   .   8   .   .   .  1F   .   .   .   .   .   .   . (   2)
         8 :    .   .   9   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
         9 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   2
         A :    .   .   .   .   .   .   .   B   .   .   .   .   .   .   .   .   . (   1)
         B :    .   .   .   .   .   .   .   .   C   .   .   .   .   .   .   .   . (   1)
         C :    .   .   .   .   .   .   .   .   D   .   .   .   .   .   .   .   . (   1)
         D :    .   .   E   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
         E :    .   .   .   .   .   F   .   .   .   .   .   .   .   .   .   .   . (   1)
         F :    .   .  10   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        10 :    .   .   .   .   .   .   .   .   .  11   .   .   .   .   .   .   . (   1)
        11 :   12   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        12 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   3
        13 :    .   .   .   .   .  21   .   .   .   .   .  14   .   .   .   .   . (   2)
        14 :    .   .   .   .   .  15   .   .   .   .   .   .   .   .   .   .   . (   1)
        15 :    .   .   .   .   .   .  16   .   .   .   .   .   .   .   .   .   . (   1)
        16 :    .   .   .  17   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        17 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   4
        18 :   19   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        19 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   5
        1A :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   6
        1B :    .   .   .   .   .   .   .   .   .  1C   .   .   .   .   .   .   . (   1)
        1C :    .   .   .   .   .   .   .   .   .   .   .  1D   .   .   .   .   . (   1)
        1D :   25   .   .   .   .   .   .   .   .   .  1E   .   .   .   .   .   . (   2)
        1E :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   7
        1F :    .   .   .   .   .   .  20   .   .   .   .   .   .   .   .   .   . (   1)
        20 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   8
        21 :    .   .   .   .   .   .   .  22   .   .   .   .   .   .   .   .   . (   1)
        22 :   23   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        23 :    .   .  24   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        24 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   9
        25 :   26   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        26 :    .   .  27   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        27 :    .   .   .   .   .   .  28   .   .   .   .   .   .   .   .   .   . (   1)
        28 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   A
        29 :    .   .  2A   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        2A :    .   .   .   .   .  38   .   .   .  2B   .   .   .   .   .   .   . (   2)
        2B :   2C   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        2C :    .   .  2D   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        2D :    .   .   .   .   .   .   .   .   .  2E   .   .   .   .   .   .   . (   1)
        2E :    .   .   .   .   .   .   .   .   .   .   .   .   .   .  2F   .   . (   1)
        2F :    .   .  30   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        30 :    .   .   .  31   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        31 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   B
        32 :    .   .   .   .   .  33   .   .   .   .   .  36   .   .   .   .  3D (   3)
        33 :    .   .   .   .   .   .   .   .   .   .   .  34   .   .   .   .   . (   1)
        34 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .  35   . (   1)
        35 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   C
        36 :    .   .   .   .   .  37   .   .   .   .   .   .   .   .   .   .   . (   1)
        37 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   D
        38 :    .   .   .   .   .   .   .  39   .   .   .   .   .   .   .   .   . (   1)
        39 :    .   .   .   .   .   .   .   .   .   .   .  3A   .   .   .   .   . (   1)
        3A :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .  3B (   1)
        3B :    .   .   .  3C   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        3C :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   E
        3D :    .   .   .   .   .   .   .   .   .  3E   .   .   .   .   .   .   . (   1)
        3E :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   F
    Alloc: 1242 Orig: 1055 elements, Final:62. Savings of %94.12
    Statecount:3f Lasttrans:3f
      Char : Match Base  Ofs     t   h   e   s   a   r   d   i   f   n   w   o   I   k   c   m   u
      State|-------------------------------------------------------------------------------------------
      #   1|       @  11 + 0[    2   .   .  29   7   .   A   .  32   .  13   .  1A  1B   .   .   .]
      #   2|       @  11 + 1[    .   3   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #   3|       @  1D + 2[    .   .   4   .  18   .   .   .   .   .   .   .   .   .   .   .   .]
      #   4|       @  10 + 3[    .   .   .   5   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #   5|       @  14 + 2[    .   .   6   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #   6| W   1 @   0 
      #   7|       @  1D + 5[    .   .   .   .   .   8   .   .   .  1F   .   .   .   .   .   .   .]
      #   8|       @  16 + 2[    .   .   9   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #   9| W   2 @   0 
      #   A|       @  13 + 7[    .   .   .   .   .   .   .   B   .   .   .   .   .   .   .   .   .]
      #   B|       @  14 + 8[    .   .   .   .   .   .   .   .   C   .   .   .   .   .   .   .   .]
      #   C|       @  18 + 8[    .   .   .   .   .   .   .   .   D   .   .   .   .   .   .   .   .]
      #   D|       @  21 + 2[    .   .   E   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #   E|       @  1F + 5[    .   .   .   .   .   F   .   .   .   .   .   .   .   .   .   .   .]
      #   F|       @  23 + 2[    .   .  10   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  10|       @  1E + 9[    .   .   .   .   .   .   .   .   .  11   .   .   .   .   .   .   .]
      #  11|       @  28 + 0[   12   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  12| W   3 @   0 
      #  13|       @  24 + 5[    .   .   .   .   .  21   .   .   .   .   .  14   .   .   .   .   .]
      #  14|       @  25 + 5[    .   .   .   .   .  15   .   .   .   .   .   .   .   .   .   .   .]
      #  15|       @  25 + 6[    .   .   .   .   .   .  16   .   .   .   .   .   .   .   .   .   .]
      #  16|       @  29 + 3[    .   .   .  17   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  17| W   4 @   0 
      #  18|       @  2D + 0[   19   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  19| W   5 @   0 
      #  1A| W   6 @   0 
      #  1B|       @  25 + 9[    .   .   .   .   .   .   .   .   .  1C   .   .   .   .   .   .   .]
      #  1C|       @  25 + B[    .   .   .   .   .   .   .   .   .   .   .  1D   .   .   .   .   .]
      #  1D|       @  31 + 0[   25   .   .   .   .   .   .   .   .   .  1E   .   .   .   .   .   .]
      #  1E| W   7 @   0 
      #  1F|       @  2C + 6[    .   .   .   .   .   .  20   .   .   .   .   .   .   .   .   .   .]
      #  20| W   8 @   0 
      #  21|       @  2C + 7[    .   .   .   .   .   .   .  22   .   .   .   .   .   .   .   .   .]
      #  22|       @  34 + 0[   23   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  23|       @  33 + 2[    .   .  24   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  24| W   9 @   0 
      #  25|       @  36 + 0[   26   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  26|       @  35 + 2[    .   .  27   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  27|       @  32 + 6[    .   .   .   .   .   .  28   .   .   .   .   .   .   .   .   .   .]
      #  28| W   A @   0 
      #  29|       @  37 + 2[    .   .  2A   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  2A|       @  37 + 5[    .   .   .   .   .  38   .   .   .  2B   .   .   .   .   .   .   .]
      #  2B|       @  3A + 0[   2C   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  2C|       @  3B + 2[    .   .  2D   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  2D|       @  35 + 9[    .   .   .   .   .   .   .   .   .  2E   .   .   .   .   .   .   .]
      #  2E|       @  31 + E[    .   .   .   .   .   .   .   .   .   .   .   .   .   .  2F   .   .]
      #  2F|       @  3F + 2[    .   .  30   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  30|       @  3F + 3[    .   .   .  31   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  31| W   B @   0 
      #  32|       @  3E + 5[    .   .   .   .   .  33   .   .   .   .   .  36   .   .   .   0  3D]
      #  33|       @  39 + B[    .   .   .   .   .   .   .   .   .   .   .  34   .   .   .   .   .]
      #  34|       @  36 + F[    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .  35   .]
      #  35| W   C @   0 
      #  36|       @  41 + 5[    .   .   .   .   .  37   .   .   .   .   .   .   .   .   .   .   .]
      #  37| W   D @   0 
      #  38|       @  40 + 7[    .   .   .   .   .   .   .  39   .   .   .   .   .   .   .   .   .]
      #  39|       @  3D + B[    .   .   .   .   .   .   .   .   .   .   .  3A   .   .   .   .   .]
      #  3A|       @  3A +10[    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .  3B]
      #  3B|       @  48 + 3[    .   .   .  3C   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  3C| W   E @   0 
      #  3D|       @  43 + 9[    .   .   .   .   .   .   .   .   .  3E   .   .   .   .   .   .   .]
      #  3E| W   F @   0 
    word_info N:(prev,len)= 1:(0,5) 2:(0,3) 3:(0,9) 4:(0,5) 5:(0,4) 6:(0,1) 7:(0,4) 8:(0,3) 9:(0,5) 10:(0,7) 11:(0,9) 12:(0,4) 13:(0,3) 14:(0,7) 15:(0,3)
Stclass Failtable (63 states): 0, 0, 1, 1, 1, 41, 42, 1, 1, 1, 1, 1, 50, 50, 1, 1, 1, 1, 2, 1, 1, 1, 10, 41, 7, 2, 1, 1, 1, 1, 19, 1, 10, 1, 1, 2, 1, 2, 2, 1, 10, 1, 1, 1, 2, 1, 1, 1, 1, 41, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 41, 1, 1
Final program:
   1: OPEN1 (3)
   3:   TRIEC-EXACT<S:1/62 W:15 L:1/9 C:72/17>[Iadfkstw] (57)
        <these> 
        <are> 
        <different> 
        <words> 
        <that> 
        <I> 
        <know> 
        <and> 
        <write> 
        <knotted> 
        <sentences> 
        <from> 
        <for> 
        <serious> 
        <fun> 
  57: CLOSE1 (59)
  59: END (0)
stclass AHOCORASICKC-EXACT<S:1/62 W:15 L:1/9 C:72/17>[Iadfkstw] minlen 1 
Freeing REx: "(these|are|different|words|that|I|know|and|write|knotted|sen"...

We build a map table that converts the unique characters into ids, which are then represented by columns in the tables.

The first table shows the "native construction" (tabular), the second shows the same table in compressed form.

The idea behind the compression is to overlay states on top of each other so they can use each other "fail transitions" for their own "real transitions", by using a "check" value for each transition. When a state looks at a transition that it doesn't own, it knows it is actually a fail transition. This means each state has its own offset into the transition blob. Notice that in the typical state transition table there are many states that only have one transition, in fact besides the initial state most of the state transitions are fails.

This means that where the typical dfa loop would be something like:

while (state && str < str_end) {
   input = *str++;
   state = trans[state * num_states + input];
}

we have code more like this:

while (state && str < str_end) {
   input= chr_map[*str++];
   if (!input) break;
   else input--;
   ofs = state_offset[state];
   if (trans[ofs + input].check != state) break;
   state = trans[ofs+input].state;
}

I would guess this extra bookkeeping accounts for some of the slowdown, but it is also what makes it possible to have patterns with thousands of distinct characters in them without blowing out ram too much. The other thing is each trie opcode has a fairly high setup cost in the regex engine itself. We have to initialize various vars to have the right tables, etc. It is certainly possible we could rethink things so that we can use a more efficient loop, and reduce the setup costs. I am not sure I am up to the task in the short term. But i thought would dump this here for feedback.

Perl / perl5