rufuspollock-okfn / reconcile-csv

A simple OpenRefine reconciliation service that runs on top of a CSV file
BSD 2-Clause "Simplified" License

Sort-map #19

Closed: mihi-tr closed this issue 9 years ago

mihi-tr commented 9 years ago

As discussed in #14: right now the data structure built during scoring becomes really large, only to be sorted and truncated to the result limit again.

If we can replace one potentially O(n^2) (or O(n log n)) sort over all scored rows with a series of smaller sorts, this could make reconciling a bit faster.
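To make the idea concrete, here is a minimal sketch in Python. It is not the project's actual code: the names `top_k_full_sort`, `top_k_chunked`, and `chunk_size` are hypothetical, and the scored rows are assumed to be `(row_id, score)` pairs where a higher score is a better match. It compares the "sort everything, then limit" baseline with a series of smaller sorts that only ever keeps the best `k` candidates.

```python
from itertools import islice


def top_k_full_sort(scores, k):
    """Baseline: sort every (row_id, score) pair, then keep the first k."""
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]


def top_k_chunked(scores, k, chunk_size=256):
    """Series of smaller sorts: never hold more than k + chunk_size pairs.

    chunk_size is an illustrative tuning knob, not a value from the project.
    """
    best = []
    it = iter(scores)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            break
        # Merge the current best candidates with the new chunk,
        # then truncate back to k before reading the next chunk.
        best = sorted(best + chunk, key=lambda pair: pair[1], reverse=True)[:k]
    return best


if __name__ == "__main__":
    rows = [("a", 0.2), ("b", 0.9), ("c", 0.5)]
    print(top_k_chunked(rows, 2, chunk_size=2))
```

Both functions return the same rows in the same order, so the chunked variant can replace the baseline without changing results. Whether it is actually faster depends on the dataset size and on how expensive the scoring itself is, so it needs to be measured rather than assumed. In Python, `heapq.nlargest` from the standard library is another way to select the best `k` items without a full sort.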

mihi-tr commented 9 years ago

OK, this assumption was false. It does not speed up lookups in a dataset of about 4000 rows; it actually slows them down (644 ms vs. 488 ms per term) :/

mihi-tr commented 9 years ago

Switching to version 0.1.3 of fuzzy-string actually made lookups much faster (around 100 ms per term for 4000 rows).