can't search with utf-8 characters

Meteor-Community-Packages / meteor-autocomplete

Client/server autocompletion designed for Meteor's collections and reactivity.

https://atmospherejs.com/mizzao/autocomplete

MIT License

can't search with utf-8 characters #32

Open ppillip opened 10 years ago

ppillip commented 10 years ago

plz

mizzao commented 10 years ago

Please provide some more information. What exactly goes wrong when you try to use UTF-8 characters?

ppillip commented 10 years ago

There are some sports names, for example 'baseball', 'football', '태권도', '권투'.

'태권도' and '권투' do not work. (They are Korean.)

mizzao commented 10 years ago

I understand that, but "do not work" is not really going to help solve the problem. Do you see any errors in the console, etc.?

ppillip commented 10 years ago

Unfortunately, there are no errors. I guess it is a "caretposition" problem, so I decided to use typeahead.

mizzao commented 10 years ago

Okay then.

Just so you are aware, typeahead is not backed by Meteor Collections. So you will find it quite a bit harder to use if you are pulling fetch() arrays out of Meteor.

If you decide to dig into this deeper, feel free to revisit this issue.

nicejwjin commented 10 years ago

I think I've hit the same problem, and typeahead simply seems to work well so far. (Meteor Collections and MongoDB have no issues with character encoding either.)

Btw, what I want to know is: does the autocomplete module already handle UTF-8 encoding? In other words, when the module searches for characters, is there any problem with languages other than English?

Thanks for providing this module; hopefully it can support all character sets soon.

mizzao commented 10 years ago

Hi @nicejwjin, I'd be happy to help if you guys would create a demo app with some Korean characters to test with. I have no idea what the issue is right now and if you could narrow it down, I would be able to help you fix it.

We're using Meteor Collections to do a search with $regex directly so I don't see any reason why it shouldn't work if it is fine with Mongo/Minimongo.

LeePower commented 9 years ago

I've also run into this problem; it is caused by the regex, whose \w only matches ASCII word characters, so Unicode text after the token never matches.

I changed this line

new RegExp('(^|\\b|\\s)' + rule.token + '([\\w.]*)$')

to

return new RegExp('(^|\\b|\\s)' + rule.token + '([\\u2E80-\\uFB00\\w.]*)$');

Now it works perfectly with Korean, Chinese and Japanese text. Hope this information helps!

Cheers!
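
For anyone who wants to reproduce the difference outside Meteor, here is a minimal sketch in plain JavaScript. The two patterns are the ones quoted above; the helper names (originalRegex, patchedRegex, extractFilter), the '@' token, and the sample strings are assumptions made up for illustration and are not part of the package.

```javascript
// Hypothetical helpers: only the two regex patterns come from the thread.
function originalRegex(token) {
  // \w covers only [A-Za-z0-9_], so Hangul after the token cannot match.
  return new RegExp('(^|\\b|\\s)' + token + '([\\w.]*)$');
}

function patchedRegex(token) {
  // Adds the range U+2E80..U+FB00, which includes CJK ideographs,
  // kana and Hangul syllables, but not Cyrillic (U+0400..U+04FF).
  return new RegExp('(^|\\b|\\s)' + token + '([\\u2E80-\\uFB00\\w.]*)$');
}

// Returns the text typed after the token, or null when the rule does not match.
function extractFilter(regex, text) {
  const match = text.match(regex);
  return match ? match[2] : null;
}

const results = {
  latinOriginal: extractFilter(originalRegex('@'), 'sport @base'),
  koreanOriginal: extractFilter(originalRegex('@'), 'sport @태권'),
  koreanPatched: extractFilter(patchedRegex('@'), 'sport @태권'),
  cyrillicPatched: extractFilter(patchedRegex('@'), 'sport @бокс'),
};

console.log(results);
```

With the original pattern the Korean input yields no match at all, which would explain why there are no console errors: the rule simply never fires. The patched range still leaves Cyrillic unmatched.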

mizzao commented 9 years ago

May be related to #85; will look into how to generalize.

superaleh commented 9 years ago

All right, that is related to this issue. Currently the search works only with the Latin alphabet; with the Cyrillic alphabet it does not work.

mizzao commented 9 years ago

What is the regex that we should use to allow for Unicode with other characters? Obviously we have suggestions for Cyrillic and CJK already, but it would be nice to support whatever language.
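
One possible answer, offered as a sketch rather than a tested patch for this package: JavaScript engines that support ES2018 accept Unicode property escapes when the regex has the `u` flag, so the character class can name "any letter, mark, or number" instead of listing script ranges. The helper name unicodeAwareRegex, the '@' token, and the sample inputs below are hypothetical.

```javascript
// Hypothetical helper: a language-agnostic variant of the pattern from the thread.
function unicodeAwareRegex(token) {
  // \p{L} = any letter, \p{M} = combining marks, \p{N} = any number.
  // Property escapes require the `u` flag (ES2018 and later).
  // Assumption: `token` is already a valid pattern under the `u` flag,
  // which is stricter about stray backslash escapes.
  return new RegExp(
    '(^|\\b|\\s)' + token + '([\\p{L}\\p{M}\\p{N}_.]*)$',
    'u'
  );
}

// Returns the text typed after the token, or null when the rule does not match.
function extractUnicodeFilter(text) {
  const match = text.match(unicodeAwareRegex('@'));
  return match ? match[2] : null;
}

const samples = {
  latin: extractUnicodeFilter('sport @base'),
  korean: extractUnicodeFilter('sport @태권'),
  cyrillic: extractUnicodeFilter('sport @бокс'),
  japanese: extractUnicodeFilter('sport @日本語'),
  noToken: extractUnicodeFilter('no token here'),
};

console.log(samples);
```

The trade-off is browser support: engines older than ES2018 throw a SyntaxError on \p{...}, so a fallback to explicit ranges may still be needed. Note also that \b stays ASCII-based even with the `u` flag, so only the captured filter part becomes Unicode-aware here.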