LINBIT / csync2

file synchronization tool using librsync and current state databases
GNU General Public License v2.0

ambiguity between strings and keywords #32

Open ggs67 opened 3 years ago

ggs67 commented 3 years ago

While analyzing the csync2 parser in order to extend the auto command, I stumbled over an ambiguity between unquoted strings and keywords. For example, if a host is called "auto", the statement 'host auto' fails and requires quotes, because auto is a keyword.
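
To make the ambiguity concrete, here is a minimal sketch in Python (not the actual flex rules of csync2). It assumes a lexer that checks every bare word against a keyword table first; the keyword subset, the token names and the function `naive_tokens` are hypothetical and only serve the illustration.

```python
import re

# Hypothetical keyword subset; the real csync2 lexer defines more keywords.
KEYWORDS = {"host", "auto", "group"}

# One token per match: a quoted string, a bare word, or a semicolon.
TOKEN_RE = re.compile(r'\s*(?:"([^"]*)"|([A-Za-z0-9_.\-]+)|(;))')

def naive_tokens(text):
    """Keyword-first lexing: a bare word equal to a keyword is never a string."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        quoted, word, _semi = m.groups()
        if quoted is not None:
            tokens.append(("STRING", quoted))      # quotes force a string
        elif word is not None:
            kind = "KEYWORD" if word in KEYWORDS else "STRING"
            tokens.append((kind, word))
        else:
            tokens.append(("SEMI", ";"))
        pos = m.end()
    return tokens

print(naive_tokens('host auto;'))    # 'auto' comes out as a KEYWORD
print(naive_tokens('host "auto";'))  # quoting turns it into a STRING
```

With such a lexer, a grammar rule that expects a string after `host` cannot accept the first input, which matches the failure described above.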

If you want to give it a try, I have modified the flex and bison files in the parser-patch branch of my fork. I can also file a pull request.

https://github.com/ggs67/csync2/tree/parser-patch

Commit comment of modifications:

1. Disambiguation of strings

  Modified the lexer with a non-exclusive start state STRTOKEN for strings without quotes.
  This resolves the conflict where strings matching keywords fail if not quoted
  (e.g. 'host auto;' had to be written as 'host "auto";').

2. Tokenized auto-methods

  Moved the auto-methods from separate C parsing code to separate tokens in the
  lexer.

  This change was done to clean up the parsing, making it more readable, and in preparation for extending the
  auto command.
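
The idea behind item 1 can be sketched in Python as follows. This is only an emulation of a start state, not the flex code from the branch: the state names `INITIAL` and `STRTOKEN` follow the commit comment, while the keyword sets, the token names and the function `stateful_tokens` are hypothetical. It also assumes that the string state is left again at the end of the statement.

```python
import re

# Hypothetical subset: keywords whose arguments are plain strings.
STRING_ARG_KEYWORDS = {"host", "group"}
KEYWORDS = STRING_ARG_KEYWORDS | {"auto"}

TOKEN_RE = re.compile(r'\s*(?:"([^"]*)"|([A-Za-z0-9_.\-]+)|(;))')

def stateful_tokens(text):
    """After a string-taking keyword, bare words are strings until ';'."""
    tokens, pos, state = [], 0, "INITIAL"
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        quoted, word, _semi = m.groups()
        if quoted is not None:
            tokens.append(("STRING", quoted))
        elif word is not None:
            if state == "INITIAL" and word in KEYWORDS:
                tokens.append(("KEYWORD", word))
                if word in STRING_ARG_KEYWORDS:
                    state = "STRTOKEN"   # arguments follow: lex them as strings
            else:
                tokens.append(("STRING", word))
        else:
            tokens.append(("SEMI", ";"))
            state = "INITIAL"            # statement finished: keywords again
        pos = m.end()
    return tokens

print(stateful_tokens('host auto;'))  # 'auto' is now a STRING
```

The keyword table is consulted only in the `INITIAL` state, so `host auto;` no longer needs quotes, while `auto` at the start of a statement is still recognized as a keyword.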

ggs67 commented 3 years ago

FYI: I identified some problems with my modifications. First, I added an unnecessary AUTO start state, which I have already removed; this is secondary (mostly cosmetic). I introduced this state because I thought it would be necessary for the eauto-feature, but I changed the syntax so that it is no longer required.

The other problem is the string handling, which does not allow any unquoted string after a quoted string has been found. This is because "quote removal" is performed by the lexer, together with the related state changes. I will solve this by introducing a state stack, but the fix will probably only make it into the auto-feature branch.
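
A state stack for this case could look roughly like the Python sketch below. It rests on one plausible reading of the problem, namely that leaving a quoted string resets the lexer to its initial state instead of the state that was active before the opening quote. The `QUOTED` state, the keyword sets and the function `lex` are hypothetical. In flex itself, `%option stack` provides `yy_push_state()` and `yy_pop_state()` for the same purpose.

```python
KEYWORDS = {"host", "auto"}      # hypothetical subset
STRING_ARG_KEYWORDS = {"host"}   # keywords followed by string arguments

def lex(text, use_stack):
    """Tiny lexer; use_stack selects how the state is restored after a quote."""
    tokens, state, stack = [], "INITIAL", []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if state == "QUOTED":
            end = text.find('"', i)
            if end == -1:
                raise SyntaxError("unterminated quoted string")
            tokens.append(("STRING", text[i:end]))  # quotes are removed here
            # Restore the saved state, or (buggy variant) always go to INITIAL.
            state = stack.pop() if use_stack else "INITIAL"
            i = end + 1
        elif ch.isspace():
            i += 1
        elif ch == '"':
            if use_stack:
                stack.append(state)  # remember where we came from
            state = "QUOTED"
            i += 1
        elif ch == ";":
            tokens.append(("SEMI", ";"))
            state = "INITIAL"
            i += 1
        else:
            j = i
            while j < n and not text[j].isspace() and text[j] not in '";':
                j += 1
            word = text[i:j]
            if state == "INITIAL" and word in KEYWORDS:
                tokens.append(("KEYWORD", word))
                if word in STRING_ARG_KEYWORDS:
                    state = "STRTOKEN"
            else:
                tokens.append(("STRING", word))
            i = j
    if state == "QUOTED":
        raise SyntaxError("unterminated quoted string")
    return tokens

print(lex('host "a b" auto;', use_stack=False))  # 'auto' lexed as KEYWORD
print(lex('host "a b" auto;', use_stack=True))   # 'auto' lexed as STRING
```

With the stack, the closing quote returns to whatever state was active before the opening quote, so unquoted strings after a quoted one are handled like those before it.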