Consistency on ISO standards for easier integration.

wooorm / franc

Natural language detection

https://wooorm.com/franc/

MIT License


Consistency on ISO standards for easier integration. #54

Closed RafaPolit closed 6 years ago

RafaPolit commented 6 years ago

Revisiting #10, I think it's great that you support other languages not found in any of the ISO standards.

But for those that can be found, the fact that Franc sometimes returns the ISO 639-2/T code and other times the 639-2/B code makes it really hard to map without huge lists.

For instance:

arm matches the 2B code for Armenian, but not the 2T or 639-3 codes, which are both 'hye'.
ces, on the other hand, matches the 2T and 639-3 codes, while the 2B code is 'cze'.

So returning one or the other without consistency makes integration with the standards difficult.

I agree that for languages not found in the standards we must find some solution, and it is great that you do! But for those that match, adhering to one or the other would be very helpful.

Thanks, best regards, Rafa.

wooorm commented 6 years ago

The returned values are from ISO 639-3. If anything is inconsistent, it’s ISO 639-2 with its double codes!
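For anyone who receives ISO 639-2/B codes from another source and wants to compare them with 639-3 values, here is a minimal sketch of a normalizer. It assumes the only mismatches are the 20 languages for which ISO 639-2 defines separate bibliographic (B) and terminologic (T) codes, the T code being the same as the 639-3 code. The names `bibliographicToIso6393` and `toIso6393` are hypothetical and not part of franc.

```typescript
// Hypothetical helper, not part of franc.
// Maps each ISO 639-2/B code that differs from its 639-2/T counterpart
// to that T code, which is also the ISO 639-3 code.
const bibliographicToIso6393 = new Map<string, string>([
  ["alb", "sqi"], ["arm", "hye"], ["baq", "eus"], ["bur", "mya"],
  ["chi", "zho"], ["cze", "ces"], ["dut", "nld"], ["fre", "fra"],
  ["geo", "kat"], ["ger", "deu"], ["gre", "ell"], ["ice", "isl"],
  ["mac", "mkd"], ["mao", "mri"], ["may", "msa"], ["per", "fas"],
  ["rum", "ron"], ["slo", "slk"], ["tib", "bod"], ["wel", "cym"],
]);

// Returns the ISO 639-3 form of a three-letter code.
// Codes that are not in the table are returned unchanged (lowercased).
function toIso6393(code: string): string {
  const key = code.toLowerCase();
  const mapped = bibliographicToIso6393.get(key);
  return mapped === undefined ? key : mapped;
}

console.log(toIso6393("arm")); // hye
console.log(toIso6393("cze")); // ces
```

Because the table only covers the B/T pairs, any other three-letter code passes through untouched, so the function can be applied to every code before comparison.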

wooorm commented 6 years ago

@RafaPolit I think this isn’t an issue, so I’m closing it, but feel free to reply below if you disagree!

RafaPolit commented 6 years ago

@wooorm, Thanks for the answer.

It is not that I disagree; it is simply not so! You are not being consistent with 639-3, as noted above: for example, arm, which you are using for Armenian, is not an accepted 639-3 code. hye is the accepted 639-3 code.

wooorm commented 6 years ago

Where do you see arm being used, instead of hye?

On the website, when pasting in some Armenian, I get hye back.

RafaPolit commented 6 years ago

I apologize, we must have taken the info from an outdated or simply wrong list.

You are correct, hye is being returned for Armenian.

Sorry for the inconvenience, I'll double-check my data in future issues!

wooorm commented 6 years ago

No problem, maybe you’re using an old version of franc? Or something else? Let me know if this still is a problem!