Open GoogleCodeExporter opened 9 years ago
The approach you took for case-insensitive queries on an indexed string is
actually the way to get the best performance.
There are actually two ways to do it; the alternative uses less memory but isn't
quite as fast in all cases.
The first is the one you described: you add an extra field to the POJO to store
a lowercase version of the string, and then define an attribute on that
lowercase field:
public static final Attribute<Car, String> NAME = new SimpleAttribute<Car, String>("name") {
    public String getValue(Car car) { return car.nameInLowercase; }
};
Alternatively, you could define the attribute as a function on the mixed-case
string:
public static final Attribute<Car, String> NAME = new SimpleAttribute<Car, String>("name") {
    public String getValue(Car car) { return car.name.toLowerCase(); }
};
There are slight differences in performance between the two. The first one will
be fastest in all cases, but will use more memory (it stores two versions of
the string). The second one will be fast if you build an index on the
attribute AND the index gets used to answer your queries. If the index doesn't
get used for some query (e.g. it isn't suitable for that query, or CQEngine
thinks another index will be faster), then once CQEngine has built a candidate
set from other indexes, it will use this attribute to filter the results and
will end up converting the name of each candidate to lowercase at query time.
If memory isn't an issue then the first option is fastest.
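To make the trade-off concrete, here is a minimal plain-Java sketch. It does not use CQEngine at all: `LowercaseTradeOff`, `STORED`, `COMPUTED` and `countMatches` are hypothetical names, and the counter is there only to show how often the conversion runs when a candidate set is filtered without the index.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class LowercaseTradeOff {
    // Counts how many times a name is converted to lowercase at query time.
    static final AtomicInteger conversions = new AtomicInteger();

    static final class Car {
        final String name;
        final String nameInLowercase; // option 1: extra memory, converted once

        Car(String name) {
            this.name = name;
            // Locale.ROOT is an assumption here, chosen to avoid locale-specific case rules.
            this.nameInLowercase = name.toLowerCase(Locale.ROOT);
        }
    }

    // Option 1: read the precomputed field.
    static final Function<Car, String> STORED = car -> car.nameInLowercase;

    // Option 2: convert the mixed-case name on every call.
    static final Function<Car, String> COMPUTED = car -> {
        conversions.incrementAndGet();
        return car.name.toLowerCase(Locale.ROOT);
    };

    // Filters candidates one by one, as happens when no index on the name is used.
    static long countMatches(List<Car> candidates, Function<Car, String> attribute,
                             String lowercaseQuery) {
        return candidates.stream()
                .filter(car -> attribute.apply(car).equals(lowercaseQuery))
                .count();
    }

    public static void main(String[] args) {
        List<Car> cars = Arrays.asList(
                new Car("Ford Focus"), new Car("FORD FOCUS"), new Car("Honda Civic"));
        System.out.println(countMatches(cars, STORED, "ford focus"));   // 2
        System.out.println(conversions.get());                          // 0
        System.out.println(countMatches(cars, COMPUTED, "ford focus")); // 2
        System.out.println(conversions.get());                          // 3
    }
}
```

Both options return the same matches; the second one pays one conversion per candidate each time it is used as a filter.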
It's not realistic for indexes themselves to support case-insensitive queries.
The letters 'A' and 'a' are represented by different bytes, so navigating
indexes in a case-insensitive manner would degrade performance. The easiest
solution is usually to just build the index on either lowercase or uppercase
versions of strings, and then convert the string in the query to lowercase or
uppercase accordingly. As you have done :)
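The idea of indexing one case and normalising the query to match can be sketched with a plain hash map. This is not how CQEngine's indexes are implemented; `LowercaseIndex`, `add` and `lookup` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class LowercaseIndex {
    // Keys are always lowercase, so the map itself stays an ordinary exact-match index.
    private final Map<String, List<String>> namesByLowercaseKey = new HashMap<>();

    // Locale.ROOT is an assumption, used so the result does not depend on the default locale.
    private static String normalize(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    // Stores the original name under its lowercase key.
    void add(String name) {
        namesByLowercaseKey.computeIfAbsent(normalize(name), k -> new ArrayList<>()).add(name);
    }

    // Normalises the query the same way before the exact-match lookup.
    List<String> lookup(String query) {
        return namesByLowercaseKey.getOrDefault(normalize(query), Collections.emptyList());
    }
}
```

The index never needs to know about case: all of the case handling happens once on insert and once per query.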
However, it might be possible to enhance attributes to flag them as being
case-insensitive. That way, if CQEngine encountered a query on a
case-insensitive attribute, it could automatically convert the query string to
lowercase. Then you wouldn't need to think about it when writing queries. I'll
think about this and probably bundle it in with the changes per the
null-handling discussion. Thanks!
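As a rough illustration of what such a flag could mean, the sketch below wraps a value function so that both the stored value and the query string are lowercased in one place. It is purely hypothetical: `CaseInsensitiveAttributeSketch` and its methods are not part of CQEngine, and a real implementation would have to fit into the existing attribute and query classes.

```java
import java.util.Locale;
import java.util.function.Function;

final class CaseInsensitiveAttributeSketch<O> {
    private final Function<O, String> valueFunction;

    CaseInsensitiveAttributeSketch(Function<O, String> valueFunction) {
        this.valueFunction = valueFunction;
    }

    // Locale.ROOT is an assumption, used to keep the conversion locale-independent.
    private static String normalize(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    // The value an index would be built on: always lowercase.
    String getValue(O object) {
        return normalize(valueFunction.apply(object));
    }

    // The conversion the engine would apply to the query string on the caller's behalf.
    String normalizeQueryValue(String queryValue) {
        return normalize(queryValue);
    }

    // Equality test in which neither side needs to be lowercased by the caller.
    boolean matches(O object, String queryValue) {
        return getValue(object).equals(normalizeQueryValue(queryValue));
    }
}
```

With something along these lines, a caller could write the query string in any case and leave the conversion to the attribute.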
Original comment by ni...@npgall.com
on 25 Oct 2012 at 11:36
Original comment by ni...@npgall.com
on 29 Oct 2012 at 9:55
I'm shelving this idea for the time being.
This would be a useful feature, but I'm not sure about the cost-benefit of
implementing it. I'm about 40% in favour and 60% against this idea right now,
so if others would like it added, please add an "I want this" comment to this
issue to vote for it.
It's currently fairly easy to have case-insensitive retrieval, using the
approach above, so this really is a nice-to-have feature.
Implementing this feature would probably require adding two new types of
attribute: SimpleCaseInsensitiveAttribute and
MultiValueCaseInsensitiveAttribute. I'd consider any patches to add the feature
and I'm happy to answer questions from anyone who really wants to implement it.
"If in doubt, leave it out" is the motto for the time being.
Original comment by ni...@npgall.com
on 18 Nov 2012 at 10:35
Original issue reported on code.google.com by
SylentPr...@gmail.com
on 25 Oct 2012 at 3:14