Characters not in the language appear in profile

What steps will reproduce the problem?
1. Use the default profiles
2. Open a profile with Notepad
3. あ (a Japanese hiragana character) appears in the Chinese profile

What is the expected output? What do you see instead?

What version of the product are you using? On what operating system?
The 2014/3/3 version

Please provide any additional information below.
Could a mechanism be added that excludes characters not belonging to the 
language when a profile is created? This might reduce noise and wrong detections.
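
To illustrate the kind of filter I mean, here is a minimal sketch. It is not the library's code: the names ZH_RANGES, in_ranges and count_unigrams are made up for this example, the allow-list covers only two CJK ideograph blocks, and it counts single characters only, whereas a real profile would also count longer n-grams.

```python
from collections import Counter

# Hypothetical allow-list of code point ranges accepted for a Chinese profile.
# Only CJK Unified Ideographs and Extension A are listed; a real list would
# need more blocks and punctuation handling.
ZH_RANGES = [(0x4E00, 0x9FFF), (0x3400, 0x4DBF)]


def in_ranges(ch, ranges):
    """Return True if the character's code point falls in any allowed range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


def count_unigrams(text, ranges):
    """Count single characters, skipping any outside the allowed ranges."""
    return Counter(ch for ch in text if in_ranges(ch, ranges))


# The hiragana あ (U+3042) is outside both ranges, so it is not counted.
counts = count_unigrams("中文あ文", ZH_RANGES)
```

With a filter like this applied at profile creation time, stray kana in the Chinese training text would never reach the profile.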

Add a function to create a profile from multiple sources/files, since everyday 
phrases may not appear in Wikipedia.

Could short-message profiles and normal profiles be combined?

Could dictionary-based or script-based detection be used as a fallback when no 
features are found in the text? With the default profiles, I often get 
"no feature found in text" for short texts.

Could the probable languages be returned as tuples or a 2D array?
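
A rough sketch of what a script-based fallback with tuple output could look like. Everything here is an assumption for illustration: SCRIPT_HINTS and guess_by_script are hypothetical names, the table lists only a few blocks, and mapping a script to a single language is deliberately crude (CJK ideographs are also used in Japanese, Cyrillic by many languages).

```python
from collections import Counter

# Hypothetical table: code point range -> language guess.
# Earlier entries win, so kana is checked before CJK ideographs.
SCRIPT_HINTS = [
    ((0x3040, 0x30FF), "ja"),  # Hiragana and Katakana
    ((0xAC00, 0xD7AF), "ko"),  # Hangul syllables
    ((0x4E00, 0x9FFF), "zh"),  # CJK ideographs (also used in Japanese)
    ((0x0400, 0x04FF), "ru"),  # Cyrillic (shared by many languages)
]


def guess_by_script(text):
    """Return (language, share) tuples, most likely first, or [] if no hint matches."""
    votes = Counter()
    for ch in text:
        cp = ord(ch)
        for (lo, hi), lang in SCRIPT_HINTS:
            if lo <= cp <= hi:
                votes[lang] += 1
                break
    total = sum(votes.values())
    if total == 0:
        return []
    return [(lang, n / total) for lang, n in votes.most_common()]


result = guess_by_script("こんにちは")
```

A fallback like this would only run when the n-gram detector finds no features, and an empty list would still signal "unknown".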

Original issue reported on code.google.com by dennis97...@gmail.com on 11 Aug 2014 at 5:01
