my2iu / Jinq

LINQ-style queries for Java 8
Other
660 stars 71 forks source link

Where clause ignores Brackets, ends up generating complicated WHERE condition #60

Open abibell opened 7 years ago

abibell commented 7 years ago

Usecase Users like keyword search so they can find information quickly. When they type John, phone number, city, country or partial they would like to pull the customer record. They don't like to be told they should type the first name in first name textbox, last name in last name textbox, phone is phone textbox and so on. As it lacks the simplicity and usability.

This is functionally correct. The SQL produced can become an exponential explosion. Given the limited schema available in Jinq library I used this Customer to demonstrate the issue. Here is the test case:

@Test
public void testCountWhereHonourBrackets()
{
  String keyword = "U";
  int minDebt = 10000;
   long count = streams.streamAll(em, Customer.class)
         .where(i -> 
         (
                (i.getCountry() == null || i.getCountry().contains(keyword))
                || (i.getName() == null || i.getName().contains(keyword))
                || (i.getDebt() == 0 || i.getDebt()>minDebt) //replace this with another String field
                )
         && i.getSalary()>1000 )
         .count();
   assertEquals("SELECT COUNT(A) FROM Customer A WHERE ((A.country IS NULL OR LOCATE(:param0, A.country) > 0) OR (A.name IS NULL OR LOCATE(:param0, A.name) > 0) OR (A.debt = 0 OR A.debt > :param1)) AND A.salary > 1000", query);
   assertEquals(1, count);
}

Expected WHERE is ((A.country IS NULL OR LOCATE(:param0, A.country) > 0) OR (A.name IS NULL OR LOCATE(:param0, A.name) > 0) OR (A.debt = 0 OR A.debt > :param1)) AND A.salary > 1000

Actual WHERE is A.country IS NULL AND A.salary > 1000 OR A.country IS NOT NULL AND LOCATE(:param0, A.country) > 0 AND A.salary > 1000 OR A.country IS NOT NULL AND NOT LOCATE(:param1, A.country) > 0 AND A.name IS NULL AND A.salary > 1000 OR A.country IS NOT NULL AND NOT LOCATE(:param2, A.country) > 0 AND A.name IS NOT NULL AND LOCATE(:param3, A.name) > 0 AND A.salary > 1000 OR A.country IS NOT NULL AND NOT LOCATE(:param4, A.country) > 0 AND A.name IS NOT NULL AND NOT LOCATE(:param5, A.name) > 0 AND A.debt = 0 AND A.salary > 1000 OR A.country IS NOT NULL AND NOT LOCATE(:param6, A.country) > 0 AND A.name IS NOT NULL AND NOT LOCATE(:param7, A.name) > 0 AND A.debt <> 0 AND A.debt > :param8 AND A.salary > 1000

Case for Exponential Explosion If you need to do a keyword search across 3 nullable fields with a single AND condition, Jinq would produce 22 conditions as opposed to original 7.

There is another issue regarding exception getting thrown if contains() are not first in where clause when chained. I will post the test case later. In short categoryList.contains(i.getCategoryId()) && i.getSalary()>100 works but i.getSalary()>100 && categoryList.contains(i.getCategoryId()) throws an exception. Few team members noticed.

my2iu commented 7 years ago

Yes, certain query patterns can cause an exponential explosion in clauses. I think it's fixable, but I estimate it'll take 3-6 months of research and intensive coding to figure it out, so the Jinq project would require funding to justify that sort of development.

Jinq does contain various optimizations that will allow you to control the expansion of your queries so that you don't get into trouble. In your case, you can use the fact that Jinq can recognize chains of ORs to ensure that your query doesn't explode:

.where(i -> i.getCountry() == null || i.getCountry().contains(keyword) || 
   || i.getName() == null || i.i.getName().contains(keyword)
   || ... )
.where( i -> i.getSalary()>1000 )   // note that the AND is put in a separate lambda

In any case, this many keyword searches will likely be inherently slow unless your database has some sort of very fancy index for doing the keyword searches.

I'll dig deeper into the contains() issue when you post the details on that later.

my2iu commented 7 years ago

I'll label this as a dupe of #17

my2iu commented 7 years ago

Here are some tips for optimizing queries: https://groups.google.com/forum/#!topic/jinq-forum/24UprPciFY4

And there's also the "orUnion() and andIntersect()" section of the Jinq query guide: http://www.jinq.org/docs/queries.html#N67285

abibell commented 7 years ago

That's fair enough @my2iu. I really like Jinq library.

One of my team made a convincing case to approve funding for Jinq, next day I saw a comment.