Closed BrazilForever11 closed 5 years ago
Hi there, @BrazilForever11,
The tutorial there only covers the initial setup of OpenCCG; I'm sorry I haven't yet added anything about using the broad-coverage grammar.
Once you've downloaded OpenCCG, the instructions you need are in $OPENCCG_HOME/docs/ccgbank-README, which you can also view on GitHub. There it explains:
Since the pre-built English models and CCGbank data for training represent much larger downloads than the OpenCCG core files, they are available as separate downloads (where YYYY-MM-DD represents the date of creation):
english-models.YYYY-MM-DD.tgz
ccgbank-data.YYYY-MM-DD.tgz
So the relevant files on SourceForge are indeed:
english-models.2013-03-15.tgz
ccgbank-data.2011-11-09.tgz
available from https://sourceforge.net/projects/openccg/files/data/
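If you want to see what one of these archives contains before unpacking it, something like the following would work. This is only a minimal sketch: `list_archive` is a hypothetical helper (not part of OpenCCG), and it assumes you have already downloaded the .tgz file to a local path.

```python
import tarfile


def list_archive(path, limit=20):
    """Return up to `limit` member names from a gzipped tar archive,
    without extracting anything to disk."""
    names = []
    with tarfile.open(path, "r:gz") as archive:
        for member in archive:
            names.append(member.name)
            if len(names) >= limit:
                break
    return names


# Example call (assumes the file sits in the current directory):
# for name in list_archive("english-models.2013-03-15.tgz"):
#     print(name)
```

Reading the member names this way is cheap even for large archives, since iteration stops after `limit` entries.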
As for how to set everything up, for now I recommend checking that ccgbank-README.
The language model is a statistical model that represents the probability of a particular sequence of words. Language models are not usually aware of syntax, but they provide a convenient way to score the possible outputs of a rule-based generation system, so that the word sequences you produce are reasonably likely to occur.
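To make the scoring idea concrete, here is a toy illustration. It is not how OpenCCG or KenLM is implemented: it is a tiny add-one-smoothed bigram model trained on a made-up three-sentence corpus, and the names `CORPUS`, `tokenize` and `log_prob` are just for this sketch. It ranks two candidate word orders and picks the more probable one.

```python
import math
from collections import Counter

# Made-up training text; a real model (e.g. a KenLM binary) is trained
# on vastly more data.
CORPUS = [
    "the dog chased the cat",
    "the cat chased the mouse",
    "the mouse ate the cheese",
]


def tokenize(sentence):
    """Split on whitespace and add sentence boundary markers."""
    return ["<s>"] + sentence.split() + ["</s>"]


context_counts = Counter()  # how often each word appears as a left context
bigram_counts = Counter()   # how often each adjacent word pair appears
vocab = set()
for line in CORPUS:
    tokens = tokenize(line)
    vocab.update(tokens)
    context_counts.update(tokens[:-1])
    bigram_counts.update(zip(tokens, tokens[1:]))


def log_prob(sentence):
    """Add-one smoothed bigram log-probability of a sentence."""
    tokens = tokenize(sentence)
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        numerator = bigram_counts[(prev, word)] + 1
        denominator = context_counts[prev] + len(vocab)
        total += math.log(numerator / denominator)
    return total


# Two candidate outputs containing the same words in different orders.
candidates = ["the dog chased the cat", "cat the chased dog the"]
best = max(candidates, key=log_prob)
print(best)
```

The model knows nothing about grammar; it simply prefers the candidate whose adjacent word pairs it has seen before, which is the role the language model plays when ranking realizer outputs.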
Hope this helps!
I am using the following installation tutorial: https://davehowcroft.com/post/installing-openccg/. I decided to download the whole of OpenCCG from SourceForge. The tutorial links to https://sourceforge.net/projects/openccg/files/openccg/openccg%20v0.9.5%20-%20deplen%2C%20kenlm%2C%20disjunctivizer/ for an off-the-shelf English model. However, I cannot find it anywhere, either on SourceForge or on GitHub.
Overall, 2013-03-11.tgz is the English model and gigaword4.5g.kenlm.bin is the language model.
What is the difference between the English model and the language model? Which one of them is the lexicon?