regex very slow at CT - Githubissues

timotheecour commented 3 years ago

example 1

nim c -d:danger tests/tests.nim 51 seconds

tests/tests 0.04 seconds

notes

I'm not sure whether it's due to VM code executing tests at CT or whether it's the compilation time itself, but the slow compilation time is noticeable.

profiling

nim c --profileVM tests/tests.nim shows

prof:     µs    #instr  location
    20773272      4435  /Users/timothee/git_clone/nim/nim-regex/src/regex/compiler.nim(26, 6)
    20679694     23062  /Users/timothee/git_clone/nim/nim-regex/src/regex/compiler.nim(10, 6)
    18337658    468986  /Users/timothee/git_clone/nim/nim-regex/src/regex.nim(289, 8)
     6562522      9757  /Users/timothee/git_clone/nim/nim-regex/src/regex/exptransformation.nim(500, 6)
     6457045     12889  /Users/timothee/git_clone/nim/nim-regex/src/regex/litopt.nim(232, 6)
     5824885    171468  /Users/timothee/git_clone/nim/nim-regex/src/regex/common.nim(69, 6)
     5792159    192220  /Users/timothee/git_clone/nim/nim-regex/src/regex/common.nim(58, 6)
     5709299      7983  /Users/timothee/git_clone/nim/nim-regex/src/regex/nfa.nim(315, 6)
     5682670    326774  /Users/timothee/git_clone/nim/Nim_devel/lib/pure/strutils.nim(2679, 6)
     5496339  12001524  /Users/timothee/git_clone/nim/Nim_devel/lib/pure/strutils.nim(2616, 6)
     5082979     15966  /Users/timothee/git_clone/nim/nim-regex/src/regex/exptransformation.nim(487, 6)
     4869667    736222  /Users/timothee/git_clone/nim/nim-regex/src/regex/nfa.nim(50, 6)
     4473871    573475  /Users/timothee/git_clone/nim/nim-regex/src/regex/litopt.nim(140, 6)
     3463356       876  /Users/timothee/git_clone/nim/nim-regex/src/regex.nim(473, 9)
     3461240     15184  /Users/timothee/git_clone/nim/nim-regex/src/regex/nfamacro.nim(562, 6)
     3249920     46364  /Users/timothee/git_clone/nim/nim-regex/src/regex/nodematch.nim(110, 6)
     2801514   2670768  /Users/timothee/git_clone/nim/nim-unicodedb/src/unicodedb/types.nim(20, 6)
     2593027    966009  /Users/timothee/git_clone/nim/nim-regex/src/regex/nfa.nim(161, 6)
     2459262    238399  /Users/timothee/git_clone/nim/nim-regex/src/regex/exptransformation.nim(65, 6)
     2201504       553  /Users/timothee/git_clone/nim/nim-regex/src/regex/nodematch.nim(10, 6)
     2062921   1242678  /Users/timothee/git_clone/nim/nim-regex/src/regex/nfa.nim(196, 6)
     1742430    269189  /Users/timothee/git_clone/nim/nim-regex/src/regex/parser.nim(742, 6)
     1612111    188082  /Users/timothee/git_clone/nim/nim-regex/src/regex/nfa.nim(294, 6)
     1498531   3173322  /Users/timothee/git_clone/nim/nim-unicodedb/src/unicodedb/types.nim(38, 10)
     1376718     54650  /Users/timothee/git_clone/nim/Nim_devel/lib/system.nim(647, 6)
     1033561    112395  /Users/timothee/git_clone/nim/nim-regex/src/regex/parser.nim(648, 6)
      871969    281895  /Users/timothee/git_clone/nim/nim-regex/src/regex/exptransformation.nim(446, 6)
      858437    327774  /Users/timothee/git_clone/nim/nim-regex/src/regex/litopt.nim(92, 6)
      849408    237897  /Users/timothee/git_clone/nim/nim-regex/src/regex/exptransformation.nim(416, 6)
      791580    168000  /Users/timothee/git_clone/nim/nim-regex/src/regex/nfa.nim(185, 6)

but it's to be taken with a grain of salt due to implementation of profileVM

example 2: self contained benchmark

this test shows a 20_000 X slowdown in VM

when defined case2:
  #[
  nim r -d:case2 -d:danger --benchmarkVM $timn_D/tests/nim/all/t11918.nim
  nim r -d:case2 -d:danger --benchmarkVM --hints:off $timn_D/tests/nim/all/t11918.nim
(5.57392, 459)
(0.0003050000000000006, 459)

nim --eval:'echo 5.57392/0.0003050000000000006'
18275.14754098357
  ]#
  import pkg/regex
  import std/random
  import std/times
  proc main()=
    # let n = 1000
    let n = 2000
    let m = 1
    var r = initRand(123)
    var s = newString(n)
    for i in 0..<n:
      s[i] = cast[char](r.rand(128))
    var pat = r"\w+"
    let reg = re(pat)
    var c = 0
    let t = cpuTime()
    for j in 0..<m:
      for i in 0..<s.len:
        let a = startsWith(s, reg, i)
        c += a.ord
    let t2 = cpuTime() - t
    echo (t2, c)
  static: main()
  main()

proposal

JIT-compiling could be an option, enabled by https://github.com/timotheecour/Nim/issues/598; this is analog to how vmops.hashVmImpl speeds up in VM a key part of hashing in std/hashes, by running native code instead of VM code

question

is there 1 or a few procs that could be run natively, so that the rest of code at CT would execute fast? In other words, is there an equivalent to hashes.hashVmImpl, hashes.hashVmImplByte in nim-regex?

ideally a proc that's:

not generic (so that it's easy to dlopen)
has easy to convert params between their value and corresponding PNode (see vmconv)
is used by everything else so that performance at CT improves overall

nitely commented 3 years ago

IIRC, about 400ms is the compilation itself. The culprit is unicodedb, the unicode data tables take that long to compile. IC should fix that.

nim c -d:danger tests/tests.nim will run the 3k tests at compile time.

is there 1 or a few procs that could be run natively, so that the rest of code at CT would execute fast? In other words, is there an equivalent to hashes.hashVmImpl, hashes.hashVmImplByte in nim-regex?

No, I don't think there is. Params/result are too complex. I'll have this is mind when iterating on nregex, though. nregex improvements tend to be ported to nim-regex.

timotheecour commented 3 years ago

IC should fix that.

I'm not holding my breath ;-) ; maybe there's something that can be improved regardless of IC?

The culprit is unicodedb, the unicode data tables take that long to compile. IC should fix that.

that seems exactly like the kind of thing where a vmhook would help, assuming it indeed is a bottleneck, eg proc unicodeTypes*(cp: Rune): int = has simple input/output types

I'll have this is mind when iterating on nregex, though. nregex improvements tend to be ported to nim-regex.

TIL about https://github.com/nitely/nregex ; btw, do you have a page (github issue / doc whatever) containing a comparison of nregex vs nim-regex, when to use which one, etc?

nitely commented 3 years ago

that seems exactly like the kind of thing where a vmhook would help, assuming it indeed is a bottleneck, eg proc unicodeTypes*(cp: Rune): int = has simple input/output types

unicodedb has a few const fooTable = [big array], that's what takes 400ms to compile. To reproduce it, you just need to import pkg/unicodedb/properties, and pkg/unicodedb/types, there is no need to call any function. If I understand vmhook only works on functions, no?

TIL about https://github.com/nitely/nregex ; btw, do you have a page (github issue / doc whatever) containing a comparison of nregex vs nim-regex, when to use which one, etc?

nregex is just a playground for my regex experiments, don't use it. I'm working on a TDFA, and I'll likely break all APIs. I think it only makes sense to use it for very simple regex matching. Find/findAll are slow. Captures/submatches, and assertions (\b, $, ^) will use an hybrid DFA/NFA and it's fairly slow. The biggest difference will be once I improve the submatches space complexity, but that's just one part of the puzzle. So, ignore it for now.

timotheecour commented 3 years ago

that's what takes 400ms to compile.

ok, I was more concerned about the 51 seconds for nim c -d:danger tests/tests.nim, not those 400ms (1 time cost). And IC won't help with those 51 seconds, what's at play is slow VM execution of those tests at CT, for which vmhooks could help.

If I understand vmhook only works on functions, no?

you can also dlopen variables, so values would work too.

If there's no obvious internal proc that's called by all others, we can also wrap user facing API

example: `replace`

we have:

func replace(s: string; pattern: Regex; by: string; limit = 0): string

the problem is Regex object which is complex

option 1: easy

1 easy option is to add an overload:

func replace(s: string; pattern: string; by: string; limit = 0): string {.vhmhook.}

when called at CT (eg via const a = replace("foo", "fo\w+", ".", 1)), replace would then run with native code, calling pattern(s, re(pattern), by, limit)

this will definitely work and wouldn't be hard, but it adds an overload and won't work transparently for existing code

option 2

we could avoid need for adding an overload with pattern: string; the nim compiler already does something similar with {.magic.} types, which allows defining them specially at CT:

type
  UncheckedArray*[T]{.magic: "UncheckedArray".}
type
  Ordinal*[T] {.magic: Ordinal.} ## Generic ordinal type. Includes integer,
type
  NimNodeObj = object
  NimNode* {.magic: "PNimrodNode".} = ref NimNodeObj

I don't have a complete answer for option 2 but it should be possible

nitely commented 3 years ago

Are you concerned about the tests being slow, or about nim-regex being slow at CT? It's not the same, because the current APIs need to be tested at CT, so even if we implement option 1, it won't make the tests fast.

The issue I see is there is no way to compile the pattern at compile time, at least not with option 1.

timotheecour commented 3 years ago

Are you concerned about the tests being slow, or about nim-regex being slow at CT?

I'm not concerned about the tests's running time, since they don't affect user code; I'm concerned about nim-regex being slow at CT as it can affect user code.

the current APIs need to be tested at CT, so even if we implement option 1, it won't make the tests fast.

i agree with that.

The issue I see is there is no way to compile the pattern at compile time, at least not with option 1.

option 1 can work like this:

proc main =
  let a = replace(input, "\w+", "-")  
static: main()

replace then gets intercepted using similar registration mechanism as in vmops, which then calls:

# this will be compiled into a dll and dlopened (once) via vmhook mechanism:
proc replaceVmHook(input, pattern, by): string =
  result = replace(input, re(pattern), by)

which runs at CT (from the perspective of client code) but using native code.

Furthermore, replaceVmHook can cache re(pattern) so that subsequent invocations with the same pattern can be reused, but I'm not sure this optimiation is needed.

nitely commented 3 years ago

What I mean is that API will compile the regex at runtime every time it's called, when called at runtime.

nitely commented 3 years ago

What I mean is that API will compile the regex at runtime every time it's called, when called at runtime.

I think this can be workaround if we can do something like:

when nimvm:
  func replace(s: string; pattern: string; by: string; limit = 0): string {.vhmhook.}
else:
  func replace(s: string; pattern: static string; by: string; limit = 0): string
     replaceImpl  # compile the regex at CT?

or even without the when clause, if overloading works in this case (idk).

timotheecour commented 3 years ago

something like this could work, overloading on whether pattern is a static param:

func replace(s: string; pattern: string; by: string; limit = 0): string {.vhmhook.}
func replace(s: string; pattern: static string; by: string; limit = 0): string {.vhmhook.}

or simply require user to make this explicit, eg:

func replace(s: string; pattern: string; by: string; limit = 0): string {.vhmhook.}
func replace(s: string; pattern: CtReg; by: string; limit = 0): string {.vhmhook.}

const r: CtReg = re"\w+" # now pkg/regex doesn't need to deal with caching, it's user responsability
let a = replace("ab", r, "cd")

nitely commented 2 years ago

const r: CtReg = re"\w+" # now pkg/regex doesn't need to deal with caching, it's user responsability

Is CtReg a complex type? I thought vhmhook would only support primitive types. If it's not a complex type, then it's not really caching the compiled regex.

This is a valid use case that cannot be supported unless the API takes a compiled regex:

func findInFiles(pattern: string, files: seq[File]): string =
  let reg = re(pattern)  # compile pattern
  for file in files:
    if find(file.content, reg):
      return file.name

if we only have find(s: string, pattern: string): bool then there is no way of caching the compiled regex.

nitely / nim-regex

regex very slow at CT #104

example 1

notes

profiling

example 2: self contained benchmark

proposal

question

example: `replace`

option 1: easy

option 2

nitely / nim-regex

regex very slow at CT #104

example 1

notes

profiling

example 2: self contained benchmark

proposal

question

example: replace

option 1: easy

option 2

example: `replace`