redis / redis-om-python

Object mapping, and more, for Redis and Python
MIT License
1.06k stars 108 forks source link

Bug: Use of Field(index=True) on `bool` type field breaks Model.find() #618

Closed AzorianMatt closed 1 month ago

AzorianMatt commented 1 month ago

I have found what seems to be a bug that is likely the underlying cause of issue #299.

When using this model:

class User(JsonModel): okta_id: Optional[str] = Field(index=True) username: str = Field(index=True) password: str = Field(index=True) token: Optional[str] = Field(index=True) email: Optional[str] = Field(index=True) full_name: Optional[str] = None enabled: bool = Field(index=True, default=True)

The .find() method does not return any results both with no filters applied and with any combination of valid filters applied. As soon as I change the enabled model property to not be a Field(index=True) but instead just a static True, the .find() method begins behaving as expected.

I have reproduced this issue in both versions 0.2.2 and 0.3.1 from PyPi.

slorello89 commented 1 month ago

Hi @AzorianMatt - so the underlying issue with booleans was (pre 0.3.1) that a boolean would be interpreted in the index creation as a NUMERIC, but serialized to a JSON boolean (which RediSearch is equipped to query as a tag field). This caused the objects to actually break on insertion and not be queryable.

However there's another issue in the way the ExpressionProxy (the lambda source for redis om python) interprets the queries. I can't override the dunder __bool__ method of the expression proxy without shattering the library into a million pieces, consequentially, we unfortunately cannot use unary boolean operations (so you have to do truth evaluations like find(Model.enabled == True) rather than find(Model.enabled) :

let's take a look at this from a test:

@py_test_mark_asyncio
async def test_boolean():
    class Example(JsonModel):
        b: bool = Field(index=True)
        d: datetime.date = Field(index=True)
        name: str = Field(index=True)

    await Migrator().run()

    ex = Example(b=True, name="steve", d=datetime.date.today())
    exFalse = Example(b=False, name="foo", d=datetime.date.today())
    await ex.save()
    await exFalse.save()
    res = await Example.find(Example.b == True).first()
    assert res.name == "steve"

    res = await Example.find(Example.b == False).first()
    assert res.name == "foo"

    res = await Example.find(Example.d == ex.d and Example.b == True).first()
    assert res.name == ex.name

Does this help at all?

AzorianMatt commented 1 month ago

@slorello89 this information is absolutely helpful and concise to boot! So this makes very quick sense of the scenario to which I essentially just stumbled upon. I am a first-time user of this library as of right now, so I'm just trying to make quick use of the obvious basics for prototyping.

So as far as the index creation issue goes, I'm unclear as to whether that is still an issue to date? I assume it is based on the behaviors I described which would seemingly be indicative of records missing from an index (as I can always retrieve by pk). I am currently stuck on 0.2.2 due to a dependency issue, but I read it as the index issue is fixed in 0.3.1?

slorello89 commented 1 month ago

Hi @AzorianMatt - the issue is resolved in 0.3.1 (the version I just released yesterday), if you were having indexing failures you would be able to see the failures in the FT.INFO of the index - steps to check:

  1. Create your index with Migrator().run(), insert your docs
  2. check your index name (FT._LIST will print out all your indexes)
  3. call FT.INFO index-name on the index
  4. Check the hash_indexing_failures field in there - that's what will tell you whether something failed to insert
  5. You can always check to see what kind of documents you have in the index by querying it with FT.SEARCH index-name *
AzorianMatt commented 1 month ago

@slorello89 thank you! This is very helpful information. I will see what I can do in order to get updated to the latest version. For now, I can work around the issue by avoiding indexing of the bool field as I don't actually need it quite yet.

slorello89 commented 1 month ago

Sounds good @AzorianMatt - if this issue does recur in the latest version of Redis OM Python, please by all means reopen.