wooorm / franc

Natural language detection
https://wooorm.com/franc/
MIT License

Accuracy #23

Closed sospedra closed 8 years ago

sospedra commented 8 years ago

I'm not reporting a bug as such, but I want to know whether I'm missing something.

Check this out:

franc.all('drink some coffee', { whitelist: ['eng', 'spa'] });

// outputs
[ [ 'spa', 1 ], [ 'eng', 0.949748743718593 ] ]

Whereas the main competitor, cld from Google (the one you mention in the README.md), outputs the following:

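cld.detect('drink some coffee', function (err, data) {
  if (err) throw err;
  // Log the result; a value returned from the callback would be discarded.
  console.log(data);
});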

// outputs
{ reliable: true,
  textBytes: 19,
  languages: [ { name: 'ENGLISH', code: 'en', percent: 94, score: 1194 } ],
  chunks: [] }

Is this franc's expected accuracy? Because the result is far from correct.

wooorm commented 8 years ago

Yes, that’s its accuracy. See #8 and #16 for more info. I think Google’s is made especially for short text. franc focusses on more languages and longer input.
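
For short inputs like the one in this issue, one possible workaround is to treat a near-tie between the top two candidates as inconclusive instead of trusting the first entry. The snippet below is only a sketch under stated assumptions: `pickConfident` and its `minMargin` threshold are hypothetical names, not part of franc, and the threshold of 0.1 is an arbitrary illustration, not a recommended value. It works on a ranked list in the shape that `franc.all` printed above, and falls back to `'und'` (the code for an undetermined language) when the margin is too small.

```javascript
// Hypothetical helper, not part of franc: decide whether a ranked result
// list ([[code, score], ...], best candidate first) is decisive enough.
function pickConfident(results, minMargin) {
  // No candidates at all: report "undetermined".
  if (!results || results.length === 0) return 'und';
  // A single candidate has nothing to be compared against.
  if (results.length === 1) return results[0][0];
  // Gap between the best and the second-best score.
  const margin = results[0][1] - results[1][1];
  return margin >= minMargin ? results[0][0] : 'und';
}

// The output reported in this issue for 'drink some coffee'.
const reported = [['spa', 1], ['eng', 0.949748743718593]];

// The gap is roughly 0.05, below the assumed threshold of 0.1,
// so the result is treated as inconclusive.
console.log(pickConfident(reported, 0.1));
```

With the scores reported here, the two candidates are too close for the assumed threshold, so the caller could ask for more text or fall back to another detector rather than accept `spa`.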