dlwh / puck

Puck is a lightning-fast parser for natural languages using GPUs
www.scalanlp.org
Apache License 2.0

Why should the sentence length be <= 40 characters? #1

Closed markfarrell closed 10 years ago

markfarrell commented 10 years ago

Hey,

Could you clarify (on the README) why the sentence length must be <= 40 characters right now?

What happens when sentences are ~100 characters in length?

dlwh commented 10 years ago

40 words, not 40 characters.

Currently, sentences longer than that are skipped and (()) is printed in their place. You can set maxParseLength to whatever you want and it will work.
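To make the skip behaviour concrete, here is a minimal sketch in Python. It is not Puck's actual code or API: `parse_or_skip`, `toy_parse`, and the whitespace tokenization are hypothetical stand-ins, and only the name maxParseLength and the `(())` placeholder come from this thread.

```python
# Hypothetical sketch only: this is not Puck's implementation or API.
MAX_PARSE_LENGTH = 40  # mirrors the maxParseLength setting; counted in words


def parse_or_skip(sentence, parse, max_parse_length=MAX_PARSE_LENGTH):
    """Parse `sentence`, or return the placeholder "(())" if it is too long."""
    words = sentence.split()  # naive whitespace tokenization, for illustration
    if len(words) > max_parse_length:
        return "(())"  # the sentence is skipped, not parsed
    return parse(words)


def toy_parse(words):
    """Stand-in for a real parser: wraps the words in one flat bracket."""
    return "(S " + " ".join(words) + ")"


print(parse_or_skip("the cat sat", toy_parse))
print(parse_or_skip(" ".join(["word"] * 41), toy_parse))
```

Raising `max_parse_length` in this sketch lets longer sentences through, at the cost of longer parse times.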

Parsing is cubic in the length of the sentence, so long sentences take a very long time.
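As a rough illustration of what cubic scaling means for the 100-word case asked about above, the sketch below compares parse cost against a 40-word baseline. It assumes cost grows exactly as the cube of the word count, ignoring constant factors and lower-order terms, so the numbers are an estimate rather than a measurement of Puck. The function name `relative_parse_cost` is made up for this example.

```python
def relative_parse_cost(length, baseline=40):
    """Estimated cost of parsing `length` words relative to `baseline` words.

    Assumes cost is proportional to length ** 3 and nothing else.
    """
    return (length / baseline) ** 3


for n in (40, 80, 100):
    print(n, relative_parse_cost(n))
```

Under that assumption, an 80-word sentence costs about 8 times as much as a 40-word one, and a 100-word sentence about 15.6 times as much.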

markfarrell commented 10 years ago

I see, thanks for the clarification.

dlwh commented 10 years ago

One more thing: the vast majority of sentences in most text are at most 40 words long. That is something like 95% for newspapers, more for general internet text, and perhaps somewhat less for scientific papers.
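If you want to check how much of your own text falls under the limit, a small sketch like the following would do it. The function name `fraction_within_limit` is hypothetical, sentences are assumed to be already split one per string, and words are counted by naive whitespace splitting, which may differ from how Puck tokenizes.

```python
def fraction_within_limit(sentences, max_words=40):
    """Fraction of sentences whose whitespace-separated word count is <= max_words."""
    if not sentences:
        return 0.0
    within = sum(1 for s in sentences if len(s.split()) <= max_words)
    return within / len(sentences)


# Toy input, not real corpus data: one short sentence and one 41-word sentence.
toy_sentences = ["the cat sat on the mat", " ".join(["word"] * 41)]
print(fraction_within_limit(toy_sentences))
```

Running this over a sample of your corpus gives a quick estimate of how many sentences would be skipped at the default limit.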

On Fri, Jul 11, 2014 at 9:41 AM, Mark Farrell notifications@github.com wrote:

I see, thanks for the clarification.
