Closed by sbhaktha 8 years ago
Comment by chrisc36 Fri Jun 19 20:55:42 2015
This now also allows you to generalize capitalized words
Comment by dirkgr Mon Jun 22 22:44:04 2015
I broke this. I ran
((?:NP PP|ADJP)* NN+) $eats ((?:NP PP|ADJP)* NN+)
over all seven corpora, while on the "eating" table. It gives me an ArrayIndexOutOfBoundsException.
Comment by chrisc36 Tue Jun 23 18:26:53 2015
The bug is fixed, but that query revealed a sneakier bug in how samples are gathered, which will take a bit more effort to work around.
Comment by chrisc36 Tue Jun 23 20:51:10 2015
Should be good to go. We don't get broadening suggestions for "((?:NP PP|ADJP)* NN+) $eats ((?:NP PP|ADJP)* NN+)", but we at least get correct statistics and no errors.
Comment by dirkgr Tue Jun 23 21:53:47 2015
It duplicates the (?: ... ) groups every time I ask for a suggestion. Is that harmful?
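For reference on the "is that harmful?" question: in ordinary regular expressions, redundantly nested non-capturing groups do not change what a pattern matches. The sketch below checks this with Python's `re` module. It assumes the query language treats (?: ... ) the way Python does, which nothing in this thread confirms, and the patterns and sample strings are made up for illustration.

```python
import re

# Hypothetical patterns: the same group, once plain and once wrapped
# in two extra, redundant non-capturing groups.
single = re.compile(r"(?:ab|cd)*ef")
nested = re.compile(r"(?:(?:(?:ab|cd)))*ef")

samples = ["ef", "abef", "cdabef", "abcd", "xef"]

# Compare match / no-match for every sample under both patterns.
same = all(
    bool(single.fullmatch(s)) == bool(nested.fullmatch(s)) for s in samples
)
print(same)
```

If the same holds for the query syntax, the duplication would be cosmetic rather than semantic, although the query string would still grow with every suggestion.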
Comment by chrisc36 Wed Jun 24 17:22:20 2015
Fixed the (?: ... ) groups being replicated. I will go ahead and merge.
Comment by dirkgr Wed Jun 24 17:26:05 2015
Yay!
Let me know when I should deploy.
Issue by chrisc36 Fri Jun 19 20:18:28 2015 Originally opened as https://github.com/allenai/okcorpus/pull/152
This adds two important optimizations:
1: Parallelized fetching/processing of examples
2: Optimized query strategy for getting labelled samples for one-column tables
Also contains a couple of minor bugfixes and improvements to the default settings.
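To illustrate the general idea behind the first optimization, here is a minimal sketch of fetching and processing examples in parallel. It is not the code from this pull request: it is written in Python for brevity, and `fetch_examples`, `process`, and `fetch_and_process` are hypothetical stand-ins, not names from okcorpus.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor


def fetch_examples(corpus: str, query: str) -> list[str]:
    """Hypothetical stand-in for a per-corpus search; returns fake hits."""
    return [f"{corpus}:{query}:{i}" for i in range(3)]


def process(example: str) -> str:
    """Hypothetical post-processing step (here, just upper-casing)."""
    return example.upper()


def fetch_and_process(corpora: list[str], query: str, workers: int = 4) -> list[str]:
    """Fetch from each corpus on its own worker, then process every hit."""

    def work(corpus: str) -> list[str]:
        return [process(e) for e in fetch_examples(corpus, query)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as `corpora`.
        per_corpus = list(pool.map(work, corpora))

    # Flatten the per-corpus lists into one list of processed hits.
    return [hit for hits in per_corpus for hit in hits]


results = fetch_and_process(["corpus_a", "corpus_b"], "q")
```

Threads suit this kind of work when the fetch step is I/O-bound, for example when it waits on an index or a network call. For CPU-heavy processing, a process pool would likely be the better fit.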
chrisc36 included the following code: https://github.com/allenai/okcorpus/pull/152/commits