DmitrySoshnikov / syntax

Syntactic analysis toolkit, language-agnostic parser generator.
MIT License
605 stars 85 forks source link

First set computation - algorithm correct? #58

Open hediet opened 6 years ago

hediet commented 6 years ago

While studying your algorithm for computing the First set (source), I wondered whether it is correct.

Consider a grammar like this:

S -> A | x
A -> S | y

From my understanding of your algorithm, you initially set _firstSets[S] to the empty set. Thus, when doing the recursion on A, the First set of A is computed to be the union of { } (as _firstSets[S] is still empty) and { y }. However, the First set of A is { x, y }. What am I missing?

DmitrySoshnikov commented 6 years ago

Hm.. yeah, good catch. Even though the grammar itself seems is not parsable, the first set for both non-terminals should be {x, y}. I'll take a look into it, though it's an edge case, and such a grammar seems mostly is of theoretical interested -- doesn't mean we don't have to fix it though.

For the test grammar:

%%

S
  : A
  | 'x'
  ;

A
  : S
  | 'y'
  ;

I get the:

syntax-cli -g ~/g.g -m lalr1 -t -s first

First set:

┌─────────┬───────────┐
│ Symbol  │ First set │
├─────────┼───────────┤
│ $accept │ 'y', 'x'  │
├─────────┼───────────┤
│ S       │ 'y', 'x'  │
├─────────┼───────────┤
│ A       │ 'y'       │
├─────────┼───────────┤
│ 'y'     │ 'y'       │
├─────────┼───────────┤
│ 'x'     │ 'x'       │
└─────────┴───────────┘

Parsing mode: LALR1_BY_SLR(1).

Grammar:

    0. $accept -> S
    ----------------
    1. S -> A
    2.    | 'x'
    3. A -> S
    4.    | 'y'

LALR1_BY_SLR(1) parsing table:

┌───┬─────┬─────┬────────┬───┬───┐
│   │ 'x' │ 'y' │ $      │ S │ A │
├───┼─────┼─────┼────────┼───┼───┤
│ 0 │ s3  │ s4  │        │ 1 │ 2 │
├───┼─────┼─────┼────────┼───┼───┤
│ 1 │     │     │ acc/r3 │   │   │
├───┼─────┼─────┼────────┼───┼───┤
│ 2 │     │     │ r1     │   │   │
├───┼─────┼─────┼────────┼───┼───┤
│ 3 │     │     │ r2     │   │   │
├───┼─────┼─────┼────────┼───┼───┤
│ 4 │     │     │ r4     │   │   │
└───┴─────┴─────┴────────┴───┴───┘

Feel free to send a PR as well, in case you'll reach it faster than me.