Cannot finish simple task

Serhioromano commented 3 years ago

I am trying to build a Markdown compiler: not a compiler to machine code, but to HTML, XML, JSON and other formats. It would be better to call it a Markdown processor. I want it to be a CLI tool that works on any platform. When I read about gocc, I thought it would be the ideal tool for this. I want to make a processor with a lot of new syntax goodies.

Anyway, I want to do a simple task. Here is my BNF. All I want is to find titles.

!whitespace : ' ' | '\t' | '\n' | '\r' ;
_nl : '\r\n' | '\n' ;

title6 : {_nl} '#' '#' '#' '#' '#' '#' {.} {_nl} ;
title5 : {_nl} '#' '#' '#' '#' '#' {.} {_nl} ;
title4 : {_nl} '#' '#' '#' '#' {.} {_nl} ;
title3 : {_nl} '#' '#' '#' {.} {_nl} ;
title2 : {_nl} '#' '#' {.} {_nl} ;
title1 : {_nl} '#' {.} {_nl} ;

Content :  title6                   << nil >>
    | title5                        << nil >>
    | title4                        << nil >>
    | title3                        << nil >>
    | title2                        << nil >>
    | title1                        << nil >>
;

And here is my Go file:

package main

import (
    "fmt"
    "io/ioutil"
    "github.com/serhioromano/go-markdown/lexer"
    "github.com/serhioromano/go-markdown/token"
)

func main() {
    dat, err := ioutil.ReadFile("./example.md")
    check(err)

    l := lexer.NewLexer([]byte(dat))
    for tok := l.Scan(); tok.Type == token.TokMap.Type("title2"); tok = l.Scan() {
        fmt.Printf("1 %v", tok)
    }
}

func check(e error) {
    if e != nil {
        panic(e)
    }
}

And my Markdown:

# This is title

This is paragraph

## This is another title

- List 1
- List 2
- List 3

But this only prints data for the first title, and only if the file begins with it. If I place a few lines before it, it fails to find any title. What did I do wrong here?

awalterschulze commented 3 years ago

Thank you for using gocc. Sorry about the frustration caused. Parser generators have a learning curve, just know you are not alone.

Here are some oddities I spotted that might or might not be worth investigating:

In the SDT rules << nil >> you are not capturing anything
You also do not seem to be parsing any text that is not in a title.
I am also not sure about the use of {_nl}, zero or more new lines, maybe you would prefer one or more _nl {_nl}

Serhioromano commented 3 years ago

In the SDT rules << nil >> you are not capturing anything

I know that. I thought my errors were because I did not have those, which is why I added them.

You also do not seem to be parsing any text that is not in a title.

For now, I do not want to. All I want is to find all titles inside the text; then I'll add more elements. I need to get at least one working.

I am also not sure about the use of {_nl}, zero or more new lines, maybe you would prefer one or more _nl {_nl}

Good point, thank you.

So do you know how to find title2 in my example?

awalterschulze commented 3 years ago

Sorry my experience isn't fresh enough to quickly spot the problem here. Also, I can't remember any time that I didn't parse into an AST.

kfsone commented 3 years ago

Hi, serhi; there are a couple of problems:

Your actions need to return a value and an error: << nil, nil >>
Your call to Parse(...) is going to receive the left value (<< nil, ... >>) of the outermost match.
Order matters, very much; because you have !whitespace before _nl, you create the possibility the parser will ignore the newline first.
You're not creating a regex here, you're creating a parser that will try to consume the entire document, so you need to tell it about things it can ignore between matches -- see !whitespace (the ! means 'ignore').
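
To illustrate the first point, here is a minimal sketch of a helper that an action could call. Everything in it is an assumption for illustration: `Attrib` is redeclared locally so the example compiles on its own, and `Title` and `NewTitle` are hypothetical names, not part of gocc. The only thing it is meant to show is the two-result shape (a value and an error) that an action has to produce.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Attrib stands in for the empty-interface type that generated parsers pass
// between actions; it is redeclared here so the sketch is self-contained.
type Attrib interface{}

// Title is a hypothetical AST node for a heading.
type Title struct {
	Level int
	Text  string
}

// NewTitle is a hypothetical action helper. It takes a raw token literal and
// returns a value plus an error, the two results an action must supply.
func NewTitle(lit []byte) (Attrib, error) {
	s := strings.TrimSpace(string(lit))
	// Count the leading '#' characters to get the heading level.
	level := 0
	for level < len(s) && s[level] == '#' {
		level++
	}
	if level == 0 || level > 6 {
		return nil, errors.New("not a title: " + s)
	}
	return &Title{Level: level, Text: strings.TrimSpace(s[level:])}, nil
}

func main() {
	node, err := NewTitle([]byte("\n## This is another title\n"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", *node.(*Title)) // prints {Level:2 Text:This is another title}
}
```

In a grammar, a helper like this would probably be wired up with an action along the lines of `<< ast.NewTitle($0.(*token.Token).Lit) >>`, where `ast` is a hypothetical package of your own; check the generated code for the exact types before relying on that.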

I would suggest you start with just matching h1 and proceed from there.

Go Code:

package main

import (
        "fmt"

        "github.com/kfsone/scratch/lexer" // << replace with YOUR path
        "github.com/kfsone/scratch/parser"
        //"github.com/kfsone/scratch/token"
)

func main() {
        sample := "This\ncan be\r\nignored!\n# This is my header\r\nhello\n"
        l := lexer.NewLexer([]byte(sample))
        p := parser.NewParser()
        ast, err := p.Parse(l)
        if err != nil {
                panic(err)
        }
        fmt.Printf("ast value: %#+v\n", ast)
}
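
Separately from the grammar, the scan loop in the original question stops as soon as it sees a token that is not title2, because the `for` condition is the type check itself. A rough sketch of the alternative is to scan until end of input and filter inside the loop. The `Token`, `TokenType` and `sliceLexer` types below are toy stand-ins written for this example, not the generated lexer and token packages; I believe the generated token package exposes an EOF type you can compare against in the same way, but verify that in your own generated code.

```go
package main

import "fmt"

// TokenType and Token are toy stand-ins for the generated token package.
type TokenType int

const (
	EOF TokenType = iota
	Other
	Title1
	Title2
)

type Token struct {
	Type TokenType
	Lit  string
}

// sliceLexer is a hypothetical lexer that replays a fixed list of tokens and
// then returns EOF, which is what tells the scan loop to stop.
type sliceLexer struct {
	toks []Token
	pos  int
}

func (l *sliceLexer) Scan() Token {
	if l.pos >= len(l.toks) {
		return Token{Type: EOF}
	}
	t := l.toks[l.pos]
	l.pos++
	return t
}

// collect scans all the way to EOF and keeps only tokens of the wanted type,
// instead of ending the loop at the first token of any other type.
func collect(l *sliceLexer, want TokenType) []string {
	var out []string
	for tok := l.Scan(); tok.Type != EOF; tok = l.Scan() {
		if tok.Type == want {
			out = append(out, tok.Lit)
		}
	}
	return out
}

func main() {
	l := &sliceLexer{toks: []Token{
		{Title1, "# This is title"},
		{Other, "This is paragraph"},
		{Title2, "## This is another title"},
		{Other, "- List 1"},
	}}
	fmt.Println(collect(l, Title2)) // prints [## This is another title]
}
```

This only addresses the loop; the lexer still has to produce a title2 token for that line in the first place, which is what the grammar points above are about.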

goccmack / gocc

Cannot finish simple task #126