ontodev / valve.js

VALVE in JavaScript
BSD 3-Clause "New" or "Revised" License

Add regex to grammar #6

Closed: beckyjackson closed this 3 years ago

beckyjackson commented 3 years ago

This allows Perl-style regular expressions to be used in functions:

[operator]/regex/[flags]

... and

[operator]/regex/replacement/[flags]

The operator and flags are optional. The delimiter is always a forward slash.

I had to do some hacky things to allow escaped forward slashes to be used within the patterns and replacements, but it seems to work for all cases.

I wanted to just use negative lookbehind, but Nearley doesn't allow that.
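
For readers unfamiliar with the problem, here is a minimal sketch of one way to split on unescaped forward slashes without lookbehind. It is plain JavaScript, not the Nearley grammar from this PR, and `splitOnUnescapedSlash` is a hypothetical helper name used only for illustration.

```javascript
// Hypothetical helper, not part of valve.js: split a string on forward
// slashes that are not escaped by a preceding backslash. A single left-to-right
// scan replaces the negative lookbehind that a regex-based split would need.
function splitOnUnescapedSlash(input) {
  const parts = [];
  let current = "";
  for (let i = 0; i < input.length; i++) {
    const ch = input[i];
    if (ch === "\\" && i + 1 < input.length) {
      // Keep the backslash and the character it escapes together,
      // so an escaped slash never acts as a delimiter.
      current += ch + input[i + 1];
      i++;
    } else if (ch === "/") {
      // Unescaped slash: close the current part.
      parts.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  parts.push(current);
  return parts;
}

// The JavaScript literal below is the text s/pat\/ern/repl/g
console.log(splitOnUnescapedSlash("s/pat\\/ern/repl/g"));
```

The scan yields four parts here (operator, pattern, replacement, flags), with the escaped slash left inside the pattern.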

There are two problems with this currently. The first one I think we can work around:

  1. Escape characters get doubled in the output, e.g., pat\/ern becomes pat\\/ern. I tried doing a replace, but I think the doubling actually happens when the result is printed, not in the parsed value. We should test this in Python to see whether the value contains only the one escape character.
  2. You cannot use a pattern with an escaped slash at the very end for matching (it's OK in substitution). It looks like the grammar gets confused between a match and a substitution here and returns two results:
    $ nearley-test -q -i 'regex(s/pattern\//g)'  build/valve_grammar.js
    [ [ { type: 'function',
      name: 'regex',
      args:
       [ { type: 'regex',
           operator: 's',
           pattern: 'pattern\\',
           replace: '',
           flags: 'g' } ] } ],
    [ { type: 'function',
      name: 'regex',
      args:
       [ { type: 'regex',
           operator: 's',
           pattern: 'pattern\\/',
           flags: 'g' } ] } ] ]
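
A quick way to check the first point in plain JavaScript is to compare a string's actual contents with its quoted representation. This sketch assumes the doubling seen above comes from how the value is displayed; it does not exercise nearley-test itself, and `pattern` is just an illustrative variable.

```javascript
// The literal below holds 8 characters: p a t \ / e r n
// (the backslash is written twice only because of JavaScript string syntax).
const pattern = "pat\\/ern";

// Printing the raw string shows a single backslash.
console.log(pattern);

// The length confirms only one escape character is stored.
console.log(pattern.length); // 8

// A quoted representation escapes the backslash, so it appears doubled.
console.log(JSON.stringify(pattern));
```

If the parsed value behaves the same way, the doubled backslash is only a display artifact and no replace step is needed.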

@jamesaoverton could you let me know what you think? Currently we have some regex stuff in valve.py that I just parse in Python, but I think it would be cool to do this in the grammar.

jamesaoverton commented 3 years ago

Looks good.

I think we only want the s operator. If we require that, then that would distinguish the trailing slash case from an empty substitution. Do you want to do it that way?
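
As a rough illustration of that rule (a plain JavaScript sketch, not the Nearley grammar; `parseRegex` and `splitOnUnescapedSlash` are hypothetical names), letting the operator alone decide the form leaves only one possible reading for each input:

```javascript
// Hypothetical helper: split on forward slashes not escaped by a backslash.
function splitOnUnescapedSlash(input) {
  const parts = [];
  let current = "";
  for (let i = 0; i < input.length; i++) {
    if (input[i] === "\\" && i + 1 < input.length) {
      current += input[i] + input[i + 1]; // keep the escape pair intact
      i++;
    } else if (input[i] === "/") {
      parts.push(current);
      current = "";
    } else {
      current += input[i];
    }
  }
  parts.push(current);
  return parts;
}

// Sketch of the proposed rule: "s" always means substitution,
// and no operator always means match. Other operators are rejected.
function parseRegex(text) {
  const parts = splitOnUnescapedSlash(text);
  const operator = parts[0];
  if (operator === "s") {
    // Assumption of this sketch: a substitution must have exactly
    // pattern, replacement, and (possibly empty) flags.
    if (parts.length !== 4) {
      throw new Error("substitution needs s/pattern/replacement/[flags]");
    }
    return { type: "regex", operator: "s", pattern: parts[1], replace: parts[2], flags: parts[3] };
  }
  if (operator === "" && parts.length === 3) {
    return { type: "regex", pattern: parts[1], flags: parts[2] };
  }
  throw new Error("not a recognised regex: " + text);
}

// Match with an escaped slash at the end of the pattern: /pattern\//g
console.log(parseRegex("/pattern\\//g"));
// Substitution with the same pattern: s/pattern\//repl/g
console.log(parseRegex("s/pattern\\//repl/g"));
```

How the actual grammar treats an `s` form with a missing replacement is not covered by this sketch.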

beckyjackson commented 3 years ago

Yes! That perfectly solves the problem 😄