ajenhl / tacl

Tool for performing basic text analysis on the CBETA corpus
GNU General Public License v3.0

extend does not cope with the same text existing under multiple labels #40

Closed ajenhl closed 9 years ago

ajenhl commented 9 years ago

When creating extended n-grams, the process uses all of the matches for a given witness, without taking into account that, with supplied queries, the same witness may exist under more than one label.
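As a rough illustration of the problem, the sketch below contrasts grouping matches by witness alone with grouping by witness and label. It is not tacl's code: the row layout `(work, siglum, label, ngram)`, the sample data, and both function names are hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Hypothetical match rows: (work, siglum, label, ngram).
# These field names and values are assumptions for illustration only;
# they are not taken from tacl's actual results format.
matches = [
    ("T0001", "base", "A", "abc"),
    ("T0001", "base", "A", "bcd"),
    ("T0001", "base", "B", "xyz"),
]


def group_by_witness(rows):
    """Group n-grams by witness only, ignoring the label."""
    groups = defaultdict(list)
    for work, siglum, _label, ngram in rows:
        groups[(work, siglum)].append(ngram)
    return dict(groups)


def group_by_witness_and_label(rows):
    """Group n-grams by witness and label, keeping labels separate."""
    groups = defaultdict(list)
    for work, siglum, label, ngram in rows:
        groups[(work, siglum, label)].append(ngram)
    return dict(groups)
```

With the sample rows, `group_by_witness` puts all three n-grams into a single group, whereas `group_by_witness_and_label` keeps the matches found under label `A` apart from those found under label `B`.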