dlwh / puck

Puck is a lightning-fast parser for natural languages using GPUs
www.scalanlp.org
Apache License 2.0
248 stars, 29 forks

Training/Building grammar for a language different than English #5

Open kk00ss opened 9 years ago

kk00ss commented 9 years ago

Regarding the statement "We have provided the cascade of grammars used in the Berkeley Parser for English": is there a way to obtain grammars for other languages for which a grammar has already been created? I've downloaded the Berkeley Parser grammars; is there a list of grammars available for Puck? Thanks.

dlwh commented 9 years ago

There's only English right now.

JimSEOW commented 7 years ago

Can we use, for example, the German grammar (ger_sm5.gr) provided by the Berkeley Parser with Puck, after converting it to text format?

It seems to work, except that these files are missing:

num.binary, num.unary, numstates, unary

Could you provide instructions on how to create these files from the converted text files?

dlwh commented 7 years ago

I think it won't work all that well on unknown/rare words because of the way I did the lexicon, but otherwise it should. If it basically works, I can help with getting the lexicon patched in.

JimSEOW commented 7 years ago

Hi David, I have compared the text files extracted from the German grammar with Puck's files:

(a) ger_sm5.grammar: same format as Puck's wsj_2.gr.binary
(b) ger_sm5.lexicon: same format as Puck's wsj_2.gr.lexicon
(c) ger_sm5.splits: same format as Puck's wsj_2.gr.hierarchy
(d) ger_sm5.words: same format as Puck's wsj_2.gr.words

If you have time, please consider creating the missing files: num.binary, num.unary, numstates, and unary.

If this process works, it will open up new possibilities for the Berkeley Parser community through the GPU parsing that you pioneered.

philippzentner commented 6 years ago

Did anything happen here? Does it work for German now?