Closed barzerman closed 11 years ago
...which perfectly makes sense: LAND LIFE has 7 ngrams, LANDLIFE has 6 ngrams, of which only 4 correspond to the original LAND LIFE grams (LAN, AND, LIF, IFE). 4/7 is 0.57.
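To make the arithmetic above easy to check, here is a minimal sketch in Python. It assumes plain character trigrams where the space counts as a character, and it assumes coverage is the number of shared grams divided by the gram count of the original string. The names `char_ngrams` and `coverage` are hypothetical and are not taken from the BENI code.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams of text (a space counts as a character)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def coverage(original, candidate, n=3):
    """Share of the original's n-grams that also occur in the candidate.

    Assumption: this mirrors the 4/7 computation in the comment above;
    the real BENI scoring may differ.
    """
    original_grams = char_ngrams(original, n)
    if not original_grams:
        # Strings shorter than n characters have no n-grams.
        return 0.0
    shared = original_grams & char_ngrams(candidate, n)
    return len(shared) / len(original_grams)


spaced = char_ngrams("LAND LIFE")   # 7 trigrams, 3 of them contain the space
joined = char_ngrams("LANDLIFE")    # 6 trigrams
common = sorted(spaced & joined)    # the 4 grams both strings share

print(len(spaced), len(joined), common)
print(round(coverage("LAND LIFE", "LANDLIFE"), 2))
```

The three trigrams that contain the space (`ND `, `D L`, ` LI`) have no counterpart in LANDLIFE, which is why a single extra space costs 3 of the 7 grams.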
It's too low in this particular case.
Right now I only perform one sort of normalization (soft normalization). What we can actually do is this: if there are no high coverages in BENI, remove all spaces and reapply BENI ...
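The fallback proposed above could look roughly like the following sketch. It is an illustration under stated assumptions, not the actual BENI implementation: the trigram coverage formula, the `threshold` value of 0.8 for "high coverage", and the names `best_match` and `coverage` are all hypothetical, and the candidate list is assumed to be non-empty.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams of text (a space counts as a character)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def coverage(original, candidate, n=3):
    """Share of the original's n-grams that also occur in the candidate."""
    original_grams = char_ngrams(original, n)
    if not original_grams:
        return 0.0
    return len(original_grams & char_ngrams(candidate, n)) / len(original_grams)


def best_match(query, candidates, threshold=0.8):
    """Return (candidate, coverage) for the best-scoring candidate.

    First pass: score the query as typed. If nothing reaches the
    (hypothetical) threshold, strip all spaces from the query and
    score again, keeping whichever pass scored higher.
    """
    def best(q):
        return max(((c, coverage(q, c)) for c in candidates), key=lambda pair: pair[1])

    first = best(query)
    if first[1] >= threshold:
        return first
    second = best(query.replace(" ", ""))
    return max(first, second, key=lambda pair: pair[1])


# "LANDLINE" is a made-up distractor entry for illustration.
print(best_match("LAND LIFE", ["LANDLIFE", "LANDLINE"]))
```

With the space removed, the query and LANDLIFE share all 6 trigrams, so the second pass lifts the coverage from 4/7 to 1.0.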
What's the status of this?
The query still returns 0.57 as the coverage, and the translator drops the BENI results.
I've pushed some fixes today, and it works just fine. Not in master.
Closing as fixes were pushed and no objections/comments yet.
The difference is one extra space, which produces an extremely low coverage of 0.57.
http://eu.barzer.net/translate?&key=BHjFDiC0QdoyDF7DBVn1rLWu0LaKRi8QeKiVSSSW&query=land%20life