zevv / npeg

PEGs for Nim, another take
MIT License
330 stars 22 forks source link

Use with threads and gcsafe code ? #28

Open mildred opened 3 years ago

mildred commented 3 years ago

Hi,

I am wondering how to use npeg with gcsafe code or threads. i declared my grammar constant, but I get errors like this when using the grammar match procedure:

Warning: 'decode_encoded_words' is not GC-safe as it calls 'match' [GcUnsafe2]

Is it possible to use npeg with gcsafe code ?

zevv commented 3 years ago

That's a good one, and I never gave it a thought. In theory I'd say that there is no reason why this should not work, as there is no global state involved. I'll try if I can spend some time to see why Nim thinks this is a problem.

zevv commented 3 years ago

Hm, I just tried the the brute-force approach of adding {.gcsafe.} pragma's deeper and deeper into the call stack to see the root cause, and now Nim tells me

npeg/src/npeg.nim(134, 4) Error: 'match' is not GC-safe as it performs an indirect call here

This is funny, because when your parser is const, the call is not really indirect. I'll see if I can find a way to tell the compiler "trust me I know what I'm doing", or discuss this with some wise people on the #nim channel.

mildred commented 3 years ago

When you make indirect calls, function pointers and closures, the closure must be annotated with {.gcsafe.} too, else nim static analysis can't tell if the closure is gcsafe or not. There is also a trick with forward declaration which must contain the annotation.

zevv commented 3 years ago

Sure, the closure also gets the pragma.

My current problem is that it is perfectly valid to access globals from parsers if you want to, but if I would mark the generated closure as {.gcsafe.}, all non-threaded code accessing globals from a parser will still generate warnings all the time.

mildred commented 3 years ago

Yes, that's a problem.

Perhaps the generated code can be templated on the closure type and allow two closures types, one with gcsafe and the other without. If the rest of the code is actually gcsafe, its status will be correctly inferred.

Araq pointed me to https://github.com/nim-lang/RFCs/issues/142 after https://forum.nim-lang.org/t/7110#44827

zevv commented 3 years ago

I think I can make a relatively simple fix for this.

Just to make sure I'm looking at the right thing, can you share a minimal complete example that shows your problem?

mildred commented 3 years ago

Minimal test case:

import npeg, strutils

const qdata = peg("data", res: string):
  hex        <- {'a'..'f'} | {'A'..'F'} | {'0'..'9'}
  enc_byte   <- '=' * >(hex * hex):
    res = res & parseHexStr($1)
  underscore <- >"_":
    res = res & " "
  char       <- >(1):
    res = res & $1
  data       <- *( enc_byte | underscore | char )

proc testnpeg() {.gcsafe.} =
  var result = ""
  if not qdata.match("Hello=20World", result).ok:
    echo "Failed to decode"
  else:
    echo result

testnpeg()

Compiler messages:

~/.nimble/pkgs/npeg-0.22.2/npeg/codegen.nim(420, 16) Hint: 'fail' is declared but not used [XDeclaredButNotUsed]
~/.nimble/pkgs/npeg-0.22.2/npeg/codegen.nim(423, 16) Hint: 'push' is declared but not used [XDeclaredButNotUsed]
~/.nimble/pkgs/npeg-0.22.2/npeg/codegen.nim(417, 16) Hint: 'validate' is declared but not used [XDeclaredButNotUsed]
~/.nimble/pkgs/npeg-0.22.2/npeg.nim(135, 4) Warning: 'match' is not GC-safe as it performs an indirect call here [GcUnsafe2]
testnpeg.nim(13, 6) Warning: 'testnpeg' is not GC-safe as it calls 'match' [GcUnsafe2]

With nim 1.4.0

zevv commented 3 years ago

Right; I made a gcsafe branch which can now be forced into gcsafe mode by passing the -d:npegGcsafecompilation flag. This then adds the proper pragma to the indirect call and makes your example work.

I wonder if we can make this automatically inferred somehow...

mildred commented 3 years ago

Thank you, I did not expect a quick response like that.

I don't know how it can be made easier (you may want to use gcsafe peg in some portion of your code and not gcsafe in another), but this is already helping. I believe Nim will have to think about what it will want to do with gcsafe.

zevv commented 3 years ago

Merged and released in 0.24.0

turbo commented 1 year ago

I still get an error after applying the compiler argument to force gcsafe. Before the argument, I get the expected failure:

fsquery.nim(173, 6) Error: 'parseSTS5' is not GC-safe as it calls 'match'

But after adding it:

fsquery.nim(119, 20) template/generic instantiation of `peg` from here
/Users/turbo/.nimble/pkgs2/npeg-1.2.1-c5451eea5d1f656700f04ad4b234d99ab0b78ed7/npeg/codegen.nim(443, 53) Error: type mismatch: got 'proc [*missing parameters*](ms_NP: var MatchState, s_NP: openArray[char], userdata: var STS5Config): MatchResult' for 'fn_run`gensym199' but expected 'proc (ms: var MatchState[system.char], s: openArray[char], u: var STS5Config): MatchResult[system.char]{.closure, gcsafe.}'
  Calling convention mismatch: got '{.nimcall.}', but expected '{.closure.}'.
  Pragma mismatch: got '{..}', but expected '{.gcsafe.}'.

It seems like the generated proc is not getting the annotation that is required. My code follows a similar pattern from the example above:

PEG is a const at module level:

type
  STS5Config = object
    filterId: Option[string]
    minConstExpr: int = 3

const SafeTS5 = peg("query", userdata: STS5Config):
  # ...

That is called from another proc (proc parseSTS5*(filterId: string, unsafe: JsonNode): string =), which is in turn evoked from a worker thread of a web server (so has to be gcsafe).

turbo commented 1 year ago

Here's a workaround which is a bit hacky. I needed to:

An alternative to the second point is to wrap calls in match in a gcsafe block:

{. gcsafe .}:
    var ms = p.fn_init()
    p.fn_run(ms, s, userData)

Weirdly, the actual flag provided here npegGcsafe never had any effect for me. Applying it to any scenario did not change the outcome (all fails still failed, all workarounds still worked in all permutations). So I'm not sure it achieves the intended purpose.

zevv commented 1 year ago

Hmm, npeg has seen some major refactorigns since the npegGcsafe flag was added, and I think it's not part of the unit tests; my guess is that indeed stuff is just broken in the current code. I'll reopen the ticket and see if I can get this fixed.

turbo commented 1 year ago

(FYI, I'm using Nim 2, where ORC, threads is default)