LanguageMachines / ticcltools

Tools for TICCL
GNU General Public License v3.0

TICCL-chain should NOT process .ranked files with clip > 1 #38

Closed kosloot closed 5 years ago

kosloot commented 5 years ago

TICCL-chain is designed to work on best-first ranked output files, that is, with clip = 1. You can provide other input and chaining will still proceed, but with an unclear (and dead-wrong) outcome.

It would be best to actively check this and forbid multiple Correction Candidates (CCs) per word.

OR: Have TICCL-chain use only the highest-ranked CC.

kosloot commented 5 years ago

Ok, we now just use the first (highest-ranked) entry for each word.
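
For readers who want to see the idea, here is a minimal Python sketch of "keep only the first entry per word". It is not the ticcltools code (which is C++), and it rests on assumptions that are not confirmed in this thread: that each line of a .ranked file starts with the word, that fields are separated by `#`, and that the entries for a word are already ordered best-first. The function name `keep_first_per_word` and the sample rows are hypothetical.

```python
from typing import Iterable, Iterator


def keep_first_per_word(lines: Iterable[str], sep: str = "#") -> Iterator[str]:
    """Yield only the first line seen for each word.

    Assumption: the word is the first `sep`-separated field, and the
    input lists the highest-ranked CC for a word before any others.
    """
    seen = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        word = line.split(sep, 1)[0]
        if word in seen:
            continue  # a lower-ranked CC for a word we already kept
        seen.add(word)
        yield line


# Hypothetical input with several CCs per word (clip > 1).
rows = [
    "teh#the#0.9",
    "teh#ten#0.5",
    "wrod#word#0.8",
    "wrod#wood#0.3",
]
for row in keep_first_per_word(rows):
    print(row)
```

With this approach, input produced with clip > 1 behaves as if it had been produced with clip = 1, provided the best-first ordering assumption holds.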

martinreynaert commented 5 years ago

So, does this mean that even if the user inadvertently gives TICCL-chain a ranked list with, say, the five best-ranked CCs, only the first will be used in the chaining?

This is a nice solution! Thanks!

TICCL-chain is still fast, btw. It just did about 8M pairs in 2 minutes. Great!

kosloot commented 5 years ago

indeed!