dachev / node-cld

Language detection for JavaScript (Node). Based on the CLD2 (Compact Language Detector) library from Google.
Apache License 2.0

Duplicate Chunks #37

Closed kouga00 closed 5 years ago

kouga00 commented 7 years ago

Hello, in some cases, depending on the input string, the result contains duplicate chunks:

{
  reliable: true,
  textBytes: 938,
  languages: [
    { name: 'ITALIAN', code: 'it', percent: 99, score: 420 }
  ],
  chunks: [
    { name: 'ITALIAN', code: 'it', offset: 0, bytes: 170 },
    { name: 'ITALIAN', code: 'it', offset: 310, bytes: 236 },
    { name: 'ITALIAN', code: 'it', offset: 679, bytes: 257 }
  ]
}

dachev commented 5 years ago

I see 3 non-overlapping chunks in this result so I am not sure what makes them "duplicate".
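
The non-overlap claim can be checked mechanically. Below is a minimal sketch, assuming each chunk's `offset` and `bytes` describe a half-open byte range `[offset, offset + bytes)` in the input text. `findOverlappingChunks` is a hypothetical helper written for illustration, not part of node-cld.

```javascript
// Result object copied from the report above.
const result = {
  reliable: true,
  textBytes: 938,
  languages: [{ name: 'ITALIAN', code: 'it', percent: 99, score: 420 }],
  chunks: [
    { name: 'ITALIAN', code: 'it', offset: 0, bytes: 170 },
    { name: 'ITALIAN', code: 'it', offset: 310, bytes: 236 },
    { name: 'ITALIAN', code: 'it', offset: 679, bytes: 257 }
  ]
};

// Hypothetical helper (not a node-cld API): returns every pair of chunks
// whose byte ranges intersect. Assumes the range of a chunk is
// [offset, offset + bytes).
function findOverlappingChunks(chunks) {
  const overlaps = [];
  for (let i = 0; i < chunks.length; i++) {
    for (let j = i + 1; j < chunks.length; j++) {
      const a = chunks[i];
      const b = chunks[j];
      // Two half-open ranges intersect when each starts before the other ends.
      if (a.offset < b.offset + b.bytes && b.offset < a.offset + a.bytes) {
        overlaps.push([a, b]);
      }
    }
  }
  return overlaps;
}

console.log(findOverlappingChunks(result.chunks).length); // 0
```

For the reported result the ranges are 0 to 170, 310 to 546, and 679 to 936, so no pair intersects. The chunks share a language but cover different parts of the text.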