crodas / LanguageDetector

PHP Class to detect languages from any free text
320 stars 67 forks

Commit data file from translatewiki.net #15

Open nemobis opened 10 years ago

nemobis commented 10 years ago

Made by us with translatewiki.net data. Gave great results, hope it's useful.

Copyright: http://creativecommons.org/publicdomain/mark/1.0/

nemobis commented 10 years ago

Where else to upload data files?

crodas commented 10 years ago

@nemobis can you email the raw texts to crodas@php.net? I don't know if we can attach a zip here. If we have the raw texts (used for training), then other people can use/improve it.

nemobis commented 10 years ago

@crodas, there is no such thing as raw text here; the data was extracted from the database of the wiki. We do have a dump, though: https://archive.org/details/wiki-translatewikinet_w

crodas commented 10 years ago

Thanks @nemobis. I'll take a look and merge your pull request tonight (I'm in GMT-4).