OpenCCG / openccg

OpenCCG library for parsing and realization with CCG
http://openccg.sourceforge.net/

finding off-the-shelf English model 2013-03-11.tgz #26

Closed BrazilForever11 closed 5 years ago

BrazilForever11 commented 5 years ago

I am following this installation tutorial: https://davehowcroft.com/post/installing-openccg/. I decided to download all of OpenCCG from SourceForge. The download page at https://sourceforge.net/projects/openccg/files/openccg/openccg%20v0.9.5%20-%20deplen%2C%20kenlm%2C%20disjunctivizer/ references an off-the-shelf English model. However, I cannot find it anywhere, on either SourceForge or GitHub.

As I understand it, 2013-03-11.tgz is the English model and gigaword4.5g.kenlm.bin is the language model.

What is the difference between the English model and the language model? Which one of them is the lexicon?

dmhowcroft commented 5 years ago

Hi there, @BrazilForever11,

The tutorial there only helps with the initial setup of OpenCCG; I'm sorry I haven't yet added anything about using the broad-coverage grammar.

Once you've downloaded OpenCCG, the instructions you need are in $OPENCCG_HOME/docs/ccgbank-README, which you can also view on GitHub here. There it explains:

Since the pre-built English models and CCGbank data for training represent much larger downloads than the OpenCCG core files, they are available as separate downloads (where YYYY-MM-DD represents the date of creation):

english-models.YYYY-MM-DD.tgz
ccgbank-data.YYYY-MM-DD.tgz

So the relevant files from SourceForge are indeed:

available from https://sourceforge.net/projects/openccg/files/data/

As for how to set everything up, for now I recommend checking that ccgbank-README.

The language model is a statistical language model which represents the probability of a particular sequence of words. Language models are not usually aware of syntax, but provide a convenient way to score the possible outputs of a rule-based generation system to ensure that the sequences of words you produce are reasonably likely to occur.
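
To make the idea of scoring candidate outputs concrete, here is a minimal sketch of a toy bigram language model in Python. It is only an illustration: the tiny training corpus, the candidate sentences, and the function names train_bigram_model and log_prob are all made up for this example, and none of it is OpenCCG's or KenLM's actual API. A real model such as gigaword4.5g.kenlm.bin is trained on far more text and uses better smoothing than the add-one smoothing assumed here.

```python
import math
from collections import Counter


def train_bigram_model(sentences):
    """Count contexts and bigrams over a (hypothetical) toy corpus."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = {"</s>"}
    for sentence in sentences:
        words = sentence.lower().split()
        vocab.update(words)
        tokens = ["<s>"] + words + ["</s>"]
        # Every token except the last one serves as a context.
        context_counts.update(tokens[:-1])
        bigram_counts.update(zip(tokens, tokens[1:]))
    return context_counts, bigram_counts, len(vocab)


def log_prob(sentence, model):
    """Log-probability of a word sequence under add-one smoothing."""
    context_counts, bigram_counts, vocab_size = model
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        numerator = bigram_counts[(prev, word)] + 1
        denominator = context_counts[prev] + vocab_size
        total += math.log(numerator / denominator)
    return total


# Toy training data, invented for illustration only.
corpus = [
    "the dog chased the cat",
    "the cat sat on the mat",
    "a dog sat on the mat",
]
model = train_bigram_model(corpus)

# Pretend these are alternative outputs from a rule-based generator.
candidates = [
    "mat the on sat dog the",
    "the dog sat on the mat",
]

# Pick the candidate the language model considers most likely.
best = max(candidates, key=lambda s: log_prob(s, model))
print(best)
```

The model knows nothing about syntax; it only prefers word sequences whose adjacent word pairs resemble the text it was trained on, which is why it is useful for ranking a generator's outputs rather than producing them.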

Hope this helps!