Chaining FilterExpressions

developer992 commented 3 weeks ago

Hello,

Trying to follow this PR: https://github.com/SAP/python-pyodata/pull/113

love this change, very good implementation but am having some problems chaining AND and OR operators together.

item = {
    "id": 1,
    "name": 'item name',
    "description": "foo-bar",
    "city": "1000",
}

# Query should look like this:
term="foo"      # search by string term on two fields - OR on name and description
city=1000       # AND search only those with city = 1000

get_entities().filter(Q(name__contains="foo") | Q(description__contains="foo"), city="1000")

# It does this:
{'$filter': ["city eq '1000' and (substringof('foo', name) eq true) or (substringof('foo', description) eq true)"]}

But it needs another ( ) around substring filters to fetch correct items.

Is this possible using these filters?

Many thanks!

developer992 commented 3 weeks ago

wait, I GOT IT

this works:

get_entities().filter(Q(name__contains="foo") | Q(description__contains="foo"), Q(city="1000"))

Not sure if it's a bug or just documentation wrong.

Anyway, thanks!

Edit: No, problem persist

developer992 commented 3 weeks ago

Actually, no

I think this is a bug:

The problem is in this function:

def _combine_expressions(self, expressions):
    return ' and '.join(expressions)

This function gets called several times during the process and the problem is only in the last call when it calls it with all expressions.

I've improved it like so:

Basically it wraps expressions in "( )"

    def _combine_expressions(self, expressions):
        if len(expressions) > 1:
            combined = ""
            for counter, expr in enumerate(expressions):
                combined += f"{' and ' if counter > 0 else ''}({expr})"

            return combined

        return expressions[0]

it then calls the service with such url:

/v2/service/odata-service/odata-table?%24top=100&%24skip=0&%24filter=%28plant+eq+%271000%27%29+and+%28h5+eq+%279%27+or+h5+eq+%27VSA113211%27+or+h5+eq+%27EVL110022%27%29+and+%28mr+eq+%27N40%27%29&%24inlinecount=allpages

which looks like so when parsed:

(plant eq '1000') and (h5 eq '9' or h5 eq 'VSA113211' or h5 eq 'EVL110022') and (mr eq 'N40')

which seems to return the correct items.

I hope i haven't broke anything else though.

phanak-sap commented 2 weeks ago

Hi @developer992 - seems you put lot of time and effort into this.

I must say I am not that much familiar with the part of the code from the PR #113. Could you modify the comment to a pull request with new tests, reproducing your problem? Then the fix from part "I've improved it like so:" would be in the same PR and either all passes or maybe "you broke anything else" :)

Is the problem ONLY about "chaining AND and OR operators " or it would be reproducible by chaining any combinations of filter expressions in this ORM style?

Also, I see it is practically all there in the comment, but I would like you to take credit for the bugfix in git history :)

developer992 commented 2 weeks ago

ok cool :) i will try to run test and create a proper PR, thanks!

yes i think the problem is only when combining AND and OR filters, consider this simple example, if you wanted to chain 3 or 4 filters

filter1 eq 'foo' and filter2 eq true or filter3 eq 'bar' filter1 eq 'foo' and ( filter2 eq true or filter3 eq 'bar' )

The first one will take items with filter3 eq 'bar' OR items with filter1 and filter2 which is different set of items than 2nd example, because it will take items with filter1 AND one of filter2 or filter3

but we'll know more after running the tests :)

phanak-sap commented 2 weeks ago

you can refer to the https://github.com/SAP/python-pyodata/blob/master/CONTRIBUTING.md document

developer992 commented 2 days ago

Hej thanks .. i haven't forgotten, was just busy with project, am still planing on preparing the PR :)

Have some trouble understanding a few things tho, not sure where to ask, perhaps just here?

i wanted to write a test on master branch code which would prove the existence of said bug. Then, after fixing the code, i'd write another test or improve existing one which would show the fix actually works.

But i don't quite understand these existing tests. For instance, responses library records/caches responses so the subsequent test runs would run against cached responses, correct? Am still navigating through their documentation to clear things up a bit, how to register a new remote url and re-cache the responses...

and 2nd thing, take this test for instance, tests/test_service_v2.py::test_count_with_chainable_filter_multiple

it filters some Employees and verifies the count of returned items. The metadata that the tests use, is this tests/metadata.xml but how does it know there are in fact 3 employees that fit the criteria ? In fact, pretty much all tests verify that the count == 3 but i can't find if this is true or just hardcoded value?

i figured, if i wanted to write a test that would prove filtering is wrong, then i need to know for certain how many items there are and that way, write filters that would return the wrong count, but if its just hardcoded value, then this becomes hard to prove on these test services.

phanak-sap commented 2 days ago

Hi @developer992

understandable, no rush. I would probably somewhere in the future dig into that myself, but also did not have enough time yet :)
responses library provide IMHO just mock HTTP response per test, as used in the reference module by @responses.activate. You can just copy-paste the test_count_with_chainable_filter_multiple test with new name and modify the responses.add section. In your case is the problem how the URL should look like, correct?
not sure about the caching between subsequent test runs, you got me here, my understanding is that they are idempotent both each test and each test run. But it should not be anyway a problem if there is stored the cached mock.. since the mocked HTTP response (URL, payload) should be always same in multiple test runs, when you modify the code of the service. And if you modify code of the mock, it should invalidate the cache anyway, if there is any usage of such caching mechanism.
- the question about count of employees - yes it is hardcoded. The fixture with metadata provide the structure of the odata service. But there is no actual real odata service with database that would provide data for specific request. So this is part of the responses mock, in your example test in line https://github.com/SAP/python-pyodata/blob/b6223d1a88b85918fb0dbc9757767c880c84bcfe/tests/test_service_v2.py#L2545. The focus is no on the data itself (it could be anything, we do not have database), but if the URL is correct for the pythonic employees.filter(ID=23, NickName="Steve").count() and if relevant pyodata fields are filled from the mock odata/http response as expected.
- in your bug of wrong filter chaining, the point is primary that the URL should be as you expected, not as (currently) pyodata generates. So if you copy and modify the test for the example you provided here in comment, it should fail because nothing will try reach the expected mock URL. "how many items there are" is not the point here. It would be basically problem of database content, which we ignore in this case entirely. We do not actually care if there are 3 or 5 or 100 "Steves". That would be actually test that would be covering the specific Odata service backend (are for correct URL returned expected data). We are focusing with pyodata solely on the client side (is for this python code generated by the library expected HTTP request and parsed correctly mocked response).

phanak-sap commented 2 days ago

different point so different comment "i wanted to write a test on master branch code which would prove the existence of said bug. Then, after fixing the code, i'd write another test or improve existing one which would show the fix actually works."

Not sure if I understand this. But you should create a branch in your fork, create a new test(s) that will cover the scenario you described in the comment. Existing tests are not covering this obviously.

When you run pytest, it will fail, since there is your bug in pyodata. You fix the bug in the service.py module, so your test(s) for filterExpression chaining will start passing. You commit the tests and fix together in same branch and create one PR.

SAP / python-pyodata

Chaining FilterExpressions #280