linkTDP / boa

Automatically exported from code.google.com/p/boa

Fix segmentation in typicity feature #4

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Wiki corpora: strings are segmented while indexing.
2. News corpora: strings are not segmented while indexing.
3. The typicity feature segments all sentences again, which is unnecessary in 
the wiki case.

Ideally, fix this by always segmenting while indexing. For the current 
version, determine which corpus is being processed and then decide whether to 
segment.
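
A minimal sketch of the interim approach, assuming the corpus type is known when the typicity feature runs. The names `CorpusType`, `split_sentences` and `sentences_for_typicity` are hypothetical and not part of BOA, and the regex splitter is only a stand-in for whatever segmenter the project actually uses:

```python
import re
from enum import Enum


class CorpusType(Enum):
    WIKI = "wiki"  # assumed: strings were already segmented at index time
    NEWS = "news"  # assumed: strings were indexed unsegmented


# Naive stand-in splitter: break on whitespace that follows ., ! or ?
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text):
    """Split a string into sentences with a simple punctuation heuristic."""
    return [s for s in _SENTENCE_END.split(text.strip()) if s]


def sentences_for_typicity(indexed_strings, corpus_type):
    """Return sentences, re-segmenting only when the index did not do it."""
    if corpus_type is CorpusType.WIKI:
        # Already one sentence per string, so skip the second segmentation.
        return list(indexed_strings)
    sentences = []
    for text in indexed_strings:
        sentences.extend(split_sentences(text))
    return sentences
```

With this shape, the wiki case passes its strings through unchanged, while the news case is segmented exactly once.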

Original issue reported on code.google.com by gerb...@googlemail.com on 30 Nov 2011 at 11:49

GoogleCodeExporter commented 9 years ago
We would need to create a new module that can index a file line by line.
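
A rough sketch of what such a module's core loop could look like, assuming the input file already holds one sentence per line. The function name `iter_line_documents` is hypothetical, and the step that writes each pair into the actual index is left out because it depends on the indexing backend:

```python
import io
from typing import Iterator, TextIO, Tuple


def iter_line_documents(handle: TextIO) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, sentence) pairs, one per non-empty line."""
    for line_number, line in enumerate(handle, start=1):
        sentence = line.strip()
        if sentence:  # skip blank lines instead of indexing empty documents
            yield line_number, sentence


# Usage with an in-memory file; a real corpus would be opened with
# open(path, encoding="utf-8") and passed in the same way.
sample = io.StringIO("First sentence.\n\nSecond sentence.\n")
documents = list(iter_line_documents(sample))
```

Reading lazily keeps memory use flat for large corpora, since only one line is held at a time.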

Original comment by gerb...@googlemail.com on 5 Jul 2012 at 3:32

GoogleCodeExporter commented 9 years ago

Original comment by gerb...@googlemail.com on 15 Mar 2013 at 10:17