ajenhl / tacl

Tool for performing basic text analysis on the CBETA corpus
GNU General Public License v3.0

"Supplied" queries don't work correctly #22

Closed by ajenhl 9 years ago

ajenhl commented 9 years ago

A tacl diff with supplied results can return incorrect results: it returns only those rows of the supplied results whose n-grams do not occur in any of the texts labelled in the catalogue. That is, no diff between the labelled texts themselves is performed.

The whole way supplied queries are handled is bad and wrong. This functionality should be replaced with the ability to run a diff or intersect query over two supplied sets of results (each of which has its labels replaced with a single label supplied by the user).
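To make the proposal concrete, here is a minimal sketch of what a diff or intersect over two supplied result sets might look like. It is not tacl's implementation or API: the function names (`relabel`, `diff_supplied`, `intersect_supplied`) are hypothetical, and the row layout (dicts with `ngram`, `work`, `count` and `label` keys) is an assumption made for illustration. A diff is taken here to mean "n-grams that occur in only one of the two sets", and an intersect "n-grams that occur in both".

```python
# Hypothetical sketch; names and row layout are assumptions, not tacl's API.

def relabel(rows, label):
    """Return copies of the rows with every label replaced by `label`."""
    return [{**row, "label": label} for row in rows]


def _ngrams(rows):
    """Collect the distinct n-grams present in a set of result rows."""
    return {row["ngram"] for row in rows}


def diff_supplied(rows_a, rows_b, label_a="A", label_b="B"):
    """Keep rows whose n-gram occurs in only one of the two result sets."""
    a, b = relabel(rows_a, label_a), relabel(rows_b, label_b)
    shared = _ngrams(a) & _ngrams(b)
    return [row for row in a + b if row["ngram"] not in shared]


def intersect_supplied(rows_a, rows_b, label_a="A", label_b="B"):
    """Keep rows whose n-gram occurs in both of the two result sets."""
    a, b = relabel(rows_a, label_a), relabel(rows_b, label_b)
    shared = _ngrams(a) & _ngrams(b)
    return [row for row in a + b if row["ngram"] in shared]


# Made-up example data: two supplied result sets with arbitrary labels.
first = [
    {"ngram": "如是", "work": "T0001", "count": 3, "label": "x"},
    {"ngram": "我聞", "work": "T0001", "count": 1, "label": "x"},
]
second = [
    {"ngram": "我聞", "work": "T0002", "count": 2, "label": "y"},
    {"ngram": "一時", "work": "T0002", "count": 4, "label": "z"},
]

diff_rows = diff_supplied(first, second)
intersect_rows = intersect_supplied(first, second)
```

Because each supplied set is collapsed to a single user-chosen label before the comparison, the comparison really is between the two sets, which is the step the current behaviour skips.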