antedeguemon / earleyparser

a simple earley parser

How can I add my own grammar to Grammar? #3

Open phucthuan1st opened 2 years ago

phucthuan1st commented 2 years ago

I have a grammar like this:

grammar = {
    'S':   [['NP', 'VP'], ['NP', 'VP', 'PP']],
    'NP':  [['d', 'NP3']],
    'NP3': [['a', 'NP3'], ['n'], ['n', 'PP']],
    'PP':  [['p', 'NP2']],
    'NP2': [['d', 'NP3']],
    'VP':  [['v']],
    'd':   ["the", "a", "an", "this", "that", "these", "those", "my", "your", "his",
            "her", "its", "our", "their", "a few", "a little", "much", "many", "a lot of",
            "most", "some", "any", "enough",
            "all", "both", "half", "either", "neither", "each", "every",
            "other", "another", "such", "what", "rather", "quite"],
    'n':   nouns,
    'v':   verbs,
    'aux': ['do', 'does', 'did', 'had', 'have', 'has'],
    'p':   preps,
    'a':   adjs
}

How can I add it to the code?

antedeguemon commented 2 years ago

Hey @phucthuan1st, thank you for your interest in this old God-forgotten library!

Can you please post your grammar using the Backus-Naur Form representation? It's been a long time since I last touched this project or played with formal languages/automata theory, but I've added an example to the README.

Copy-pasting it here:

S ::= A N
A ::= happy | yellow | red
N ::= dog | cat

Is described by:

grammar = earleyparser.Grammar('S')

# Sentence is composed of `<adjective> <noun>`
grammar.add('S', ['A', ' ', 'N'])

# Adjective is `happy`, `yellow` or `red`
grammar.add('A', ['h', 'a', 'p', 'p', 'y'])
grammar.add('A', ['y', 'e', 'l', 'l', 'o', 'w'])
grammar.add('A', ['r', 'e', 'd'])

# Noun is `dog` or `cat`
grammar.add('N', ['d', 'o', 'g'])
grammar.add('N', ['c', 'a', 't'])

I hope it helps you!

phucthuan1st commented 2 years ago

Hi Merlo, as I described above, each key-value pair in the grammar dictionary corresponds to a BNF rule. For example:

'S':           [['NP', 'VP'], ['NP', 'VP', 'PP']]

is equivalent to

S ::= NP VP
S ::= NP VP PP

Each of verbs, nouns, adjs, and preps is a list containing many words read from a data file, used the same way as A or N in your example (just with many more words). Is that clear?
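To make the correspondence concrete, here is a minimal sketch that prints a dictionary of this shape as BNF lines. `to_bnf` is a hypothetical helper name, not part of earleyparser, and the sample dictionary is a shortened stand-in for the real grammar.

```python
def to_bnf(grammar_dict):
    """Render a {head: [alternative, ...]} dictionary as BNF text.

    An alternative is either a list of symbols (e.g. ['NP', 'VP'])
    or a plain string holding one word (e.g. 'do').
    """
    lines = []
    for head, alternatives in grammar_dict.items():
        for alt in alternatives:
            # A list of symbols is joined with spaces; a word is used as-is.
            body = alt if isinstance(alt, str) else ' '.join(alt)
            lines.append(f"{head} ::= {body}")
    return '\n'.join(lines)


sample = {
    'S': [['NP', 'VP'], ['NP', 'VP', 'PP']],
    'VP': [['v']],
    'aux': ['do', 'does'],
}
print(to_bnf(sample))
```

Entries that contain spaces, such as "a few", come out as a single word here, so they would need separate handling.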

Thanks for responding ^^

antedeguemon commented 2 years ago

Hey @phucthuan1st, no problem!

My apologies that this library is so raw. It was something I wrote back in college a long time ago. If you need it for any real-life scenario, you should really consider using another library or rewriting this one.

Here is a kickstart for your grammar:

import earleyparser

grammar = earleyparser.Grammar('S')

grammar.add('S', ['NP', 'VP'])       # S ::= NP VP
grammar.add('S', ['NP', 'VP', 'PP']) # S ::= NP VP PP

grammar.add('NP', ['d', 'NP3'])  # NP ::= d NP3
grammar.add('NP3', ['a', 'NP3']) # NP3 ::= a NP3
grammar.add('NP3', ['n'])        # NP3 ::= n
grammar.add('NP3', ['n', 'PP'])  # NP3 ::= n PP

# ... other rules

Since whole words are not accepted as terminals, you will need to manually break each word into a list of single characters. It should go as follows:

grammar.add('AUX', ['d', 'o'])           # AUX ::= do
grammar.add('AUX', ['d', 'o', 'e', 's']) # AUX ::= does
grammar.add('AUX', ['d', 'i', 'd'])      # AUX ::= did

After that, you can get the derivation tree:

parser = earleyparser.Parser(grammar)
parser.print_derivation_tree("some sentence")
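If splitting every word by hand gets tedious, the conversion from the dictionary form can be scripted. What follows is a minimal sketch, not part of the library: `expand_rules` and `load_into` are hypothetical helper names, and the only library calls assumed are `earleyparser.Grammar(start)` and `grammar.add(head, body)` as used above. The optional `sep` argument inserts a separator terminal between the symbols of a rule, mirroring the `' '` in the README example. Whether your sentences need it is something to verify.

```python
def expand_rules(grammar_dict, sep=None):
    """Yield (head, body) pairs, one per alternative in the dictionary.

    - A list alternative is a sequence of symbols, e.g. ['NP', 'VP'].
    - A str alternative is a word and is split into single characters.
    - If sep is given, it is placed between the symbols of a sequence.
    """
    for head, alternatives in grammar_dict.items():
        for alt in alternatives:
            if isinstance(alt, str):
                # 'did' -> ['d', 'i', 'd']
                yield head, list(alt)
            else:
                body = []
                for i, symbol in enumerate(alt):
                    if sep is not None and i > 0:
                        body.append(sep)
                    body.append(symbol)
                yield head, body


def load_into(grammar, grammar_dict, sep=None):
    """Call grammar.add(head, body) for every expanded rule."""
    for head, body in expand_rules(grammar_dict, sep):
        grammar.add(head, body)
    return grammar


# Hypothetical wiring, assuming the calls shown in this thread:
#   import earleyparser
#   grammar = load_into(earleyparser.Grammar('S'), rules, sep=' ')
rules = {
    'S': [['NP', 'VP']],
    'AUX': ['do', 'does', 'did'],
}
for head, body in expand_rules(rules, sep=' '):
    print(head, body)
```

One caveat I have not verified: single-letter rule names such as 'd', 'n', or 'a' are also characters inside words like "do", so renaming them (for example to 'DET') may avoid ambiguity.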

I hope it helps you!