medic / lucene-query-generator

Generate escaped lucene query strings
Apache License 2.0

Always quote strings. #3

Closed (dynajoe closed 8 years ago)

dynajoe commented 9 years ago

Prevents tokenization on non-word characters such as @ in an email address. I also ensured that calling str.replace doesn't end up calling an undefined function when the value is not a string.
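For illustration, here is a minimal sketch of the behaviour this description implies: always wrap string values in quotes, and skip str.replace when the value is not a string. The helper name `quoteValue` is hypothetical and this is not the code from the pull request.

```javascript
// Hypothetical helper, not the library's actual implementation.
function quoteValue(value) {
  // Only strings have .replace, so convert anything else to text and
  // return it unquoted instead of calling an undefined function.
  if (typeof value !== 'string') {
    return String(value);
  }
  // Escape backslashes and embedded double quotes, then wrap in quotes
  // so the analyzer treats the value as a single phrase.
  const escaped = value.replace(/(["\\])/g, '\\$1');
  return '"' + escaped + '"';
}

console.log('email:' + quoteValue('test@test.com')); // email:"test@test.com"
console.log('age:' + quoteValue(42)); // age:42
```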

garethbowen commented 9 years ago

Hi @joeandaverde. Initially I quoted strings too, but this means you can't use Lucene wildcards, e.g. name:joe*. The quoting was removed in commit: https://github.com/medic/lucene-query-generator/commit/063bdd01b903a4344273be471aabc8f4ba3fca34

dynajoe commented 9 years ago

Another option might be to have a separate format for quoted strings, or vice versa.

dynajoe commented 9 years ago

I'd like to be able to specify a query like this:

email:test@test.com <-- the query analyzer likely breaks this up into a couple of tokens, which leads to overmatching.

If you have another idea that'd be great.

garethbowen commented 9 years ago

I think the cleanest way to do this would be to define another schema type but I'm having trouble coming up with a name. Something like "exactString" or "literalString" or "uninterpretedString" might work?

Another way would be to allow the schema entry to be an object so you could say: email: { type: 'string', allowSpecialCharacters: true }. This is probably easier to read and easier to extend in future if required.
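A rough sketch of how the object form could be handled follows. The `allowSpecialCharacters` option comes from the suggestion above; the `schema` object, `normalizeEntry` and `formatValue` are hypothetical names used only for illustration, and none of this is confirmed library API.

```javascript
// Hypothetical schema: a plain type name or an object with options.
const schema = {
  name: 'string',
  email: { type: 'string', allowSpecialCharacters: true }
};

// Normalise both forms so later code only deals with the object shape.
function normalizeEntry(entry) {
  return typeof entry === 'string'
    ? { type: entry, allowSpecialCharacters: false }
    : { allowSpecialCharacters: false, ...entry };
}

function formatValue(field, value) {
  const entry = normalizeEntry(schema[field]);
  if (entry.type === 'string' && entry.allowSpecialCharacters) {
    // Quote the value so characters such as @ are not tokenized.
    return '"' + String(value).replace(/(["\\])/g, '\\$1') + '"';
  }
  // Default: leave the value unquoted so wildcards keep working.
  return String(value);
}

console.log('email:' + formatValue('email', 'test@test.com')); // email:"test@test.com"
console.log('name:' + formatValue('name', 'joe*')); // name:joe*
```

Accepting both forms would keep existing schemas working while leaving room for further options later.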

dynajoe commented 9 years ago

I like both of your ideas. I can see the second idea, using an object, having the ability to specify a formatter function too. However, that could allow for malformed queries.

garethbowen commented 9 years ago

Actually, maybe the correct way is simply to add @ to the list of characters that get escaped? If it's ending up tokenized, then it's something we want to escape. This would mean your email address would get wrapped in quotes, which is what you were after, but other strings would remain unescaped.
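As a sketch of this idea, assuming the library keeps a list of characters that cause a value to be wrapped in quotes: the list contents and the names `QUOTE_TRIGGERS` and `formatString` below are assumptions for illustration, and the real list may differ.

```javascript
// Assumed list of characters that trigger quoting; illustrative only.
const QUOTE_TRIGGERS = [' ', ':', '@'];

function formatString(value) {
  const needsQuotes = QUOTE_TRIGGERS.some((ch) => value.includes(ch));
  if (!needsQuotes) {
    // No trigger character: leave the value alone so wildcards still work.
    return value;
  }
  // Escape backslashes and embedded quotes before wrapping in quotes.
  return '"' + value.replace(/(["\\])/g, '\\$1') + '"';
}

console.log('email:' + formatString('test@test.com')); // email:"test@test.com"
console.log('name:' + formatString('joe*')); // name:joe*
```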

garethbowen commented 8 years ago

Merged in https://github.com/medic/lucene-query-generator/pull/7