percyliang / sempre

Semantic Parser with Execution

Overnight parsing - question about input files #109

Open codeashcode opened 8 years ago

codeashcode commented 8 years ago

I understand that @mode "genovernight" can be used to dump a set of (z, c) pairs (logical form, canonical utterance), and that paraphrasing then yields (z, c, x) triples (logical form, canonical utterance, paraphrase utterance). This set of (z, c, x) triples can be divided into two files: .paraphrases.train.examples and .paraphrases.test.examples.
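As a rough illustration of that split, here is a minimal Python sketch. It assumes the (z, c, x) triples are already in memory as tuples of strings, and it assumes an s-expression layout for each example (utterance / original / targetFormula) that should be checked against the files under lib/data/overnight/ before relying on it. The helper names split_examples, to_example and write_examples are hypothetical and not part of SEMPRE.

```python
import random


def split_examples(triples, test_fraction=0.2, seed=0):
    """Shuffle (z, c, x) triples and split them into (train, test) lists."""
    shuffled = list(triples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def to_example(z, c, x):
    """Serialize one triple as a single-line example.

    ASSUMPTION: the field names and layout below are a guess at the
    .examples format; quotes inside c or x are not escaped here.
    """
    return '(example (utterance "%s") (original "%s") (targetFormula %s))' % (x, c, z)


def write_examples(path, triples):
    """Write one serialized example per line to the given path."""
    with open(path, "w", encoding="utf-8") as f:
        for z, c, x in triples:
            f.write(to_example(z, c, x) + "\n")


# Toy data standing in for real (logical form, canonical, paraphrase) triples.
triples = [
    ("(formula %d)" % i, "canonical %d" % i, "paraphrase %d" % i)
    for i in range(10)
]
train, test = split_examples(triples)
```

Calling write_examples once with train and once with test, using file names that end in .paraphrases.train.examples and .paraphrases.test.examples, would then produce the two files. A plain random split like this can put paraphrases of the same canonical utterance in both files; whether that matters depends on how you want to evaluate.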

But more inputs are needed to train the semantic parser, namely the following:

a. .train.superlatives.example - superlative training (and test file too)
b. .phrase_alignments - phrase alignment file
c. .word_alignments.berkeley - word alignment file
d. -ppdb.txt - ppdb model

How can I generate these files for my domain? Details about the steps to produce these files would be really helpful.

Zhenshan-Jin commented 6 years ago

I have the same problem with it. Thanks!

BrijeshKaria commented 6 years ago

I am also stuck with the same question.

mmarinated commented 5 years ago

This year I also have this issue. Time didn't help resolve it :) Do you have any suggestions?

ppasupat commented 5 years ago

Hmm.. I'm not familiar with the overnight package, but there are a few classes in edu.stanford.nlp.sempre.overnight that can be directly invoked (i.e., they have a public static void main method).

Here are some guesses based on reading the code + looking at the files in lib/data/overnight/ (retrieved by calling ./pull-dependencies overnight):