redis / redis-om-python

Object mapping, and more, for Redis and Python
MIT License
1.11k stars 112 forks source link

Indexing boolean fields breaks something #510

Closed ilyatovbin-pp closed 5 months ago

ilyatovbin-pp commented 1 year ago
class Something(JsonModel):    
    name: str
    should_do_something: bool = Field(index=True, default=True)

Migrator().run()

Something(name="test").save()

result:

Something(pk='01H0HYP3C4YFF1MR6JV9MEZSF4', name='test', should_do_something=True)

executing a basic get by pk works fine.

Something.get('01H0HYP3C4YFF1MR6JV9MEZSF4') 

result:

Something(pk='01H0HYP3C4YFF1MR6JV9MEZSF4', name='test', should_do_something=True)

when running find().all(), the result is empty:

In [22]: Something.find().all()
Out [22]: []

If I set the bool value to Index=False, the indexing starts working fine:

In [24]: class Something(JsonModel):
    ...:     class Meta:
    ...:         database = get_redis_instance()
    ...: 
    ...:     name: str
    ...:     should_do_something: bool = Field(index=False, default=True)
    ...: 

In [25]: Migrator().run()

In [26]: Something.find().all()
Out[26]: 
[Something(pk='01H0HYP3C4YFF1MR6JV9MEZSF4', name='test', should_do_something=True),
 Something(pk='01H0HYGDBJF50GFRYK9MSPA8DQ', name='test 2', should_do_something=True),
 Something(pk='01H0HYB0MKSMV4QWJ7BM5G2DGH', name='test', should_do_something=True)]
jrwagz commented 1 year ago

I can confirm this as well, that the JSONModel doesn't support indexing boolean fields. Here is my testcase that shows the same thing:

"""This file demonstrates that redis_om_python cannot index/search boolean fields"""
from redis_om import Field, JsonModel, Migrator

class MyObjectBroken(JsonModel):
    my_string: str = Field(index=True)
    my_int: int = Field(index=True)
    my_bool: bool = Field(index=True)

class MyObjectWorks(JsonModel):
    my_string: str = Field(index=True)
    my_int: int = Field(index=True)

Migrator().run()

# Add some objects that are broken
object1 = MyObjectBroken(my_string="", my_int=1, my_bool=False)
object1.save()
object2 = MyObjectBroken(my_string="", my_int=1, my_bool=False)
object2.save()

# Add some objects that work
object3 = MyObjectWorks(my_string="", my_int=1)
object3.save()
object4 = MyObjectWorks(my_string="", my_int=1)
object4.save()

objects = MyObjectBroken.find().all()
print(f"Found {len(objects)} MyObjectBroken expected 2")

object_get = MyObjectBroken.get(object1.pk)
print(f"Found {object_get=} MyObjectBroken")

objects = MyObjectWorks.find().all()
print(f"Found {len(objects)} MyObjectWorks expected 2")

object_get = MyObjectWorks.get(object3.pk)
print(f"Found {object_get=} MyObjectWorks")

If you run this, you'll get the following output showing the object with a boolean field doesn't work:

$ python boolean_field_search_suppport.py
Found 0 MyObjectBroken expected 2
Found object_get=MyObjectBroken(pk='01H5K85HDADT687Y25Y7BAEWKK', my_string='', my_int=1, my_bool=False) MyObjectBroken
Found 2 MyObjectWorks expected 2
Found object_get=MyObjectWorks(pk='01H5K85HDCCAS49NADBDY39CXQ', my_string='', my_int=1) MyObjectWorks
jrwagz commented 1 year ago

See https://github.com/redis/redis-om-python/issues/193 for where it's defined that booleans aren't supported yet, but seem to be supported at some point?

jrwagz commented 1 year ago

From my sleuthing, since the bool class is implemented as a subclass of the integer class, we end up getting a NUMERIC index created:

"ft.create" ":main.MyObjectBroken:index" "ON" "JSON" "PREFIX" "1" ":main.MyObjectBroken:" "SCHEMA" "$.pk" "AS" "pk" "TAG" "SEPARATOR" "|" "$.my_string" "AS" "my_string" "TAG" "SEPARATOR" "|" "$.my_int" "AS" "my_int" "NUMERIC" "$.my_bool" "AS" "my_bool" "NUMERIC"

This is coming from this line: https://github.com/redis/redis-om-python/blob/d1f6959be3e7c19c544c769acca29312cb977504/aredis_om/model/model.py#L1890

From the docs of FT.CREATE there isn't native support for boolean types, so it likely has to be mapped to NUMERIC for this to work, however it's unclear what's going on under the hood here, but I'm guessing that at least a part of it is related to saving actual RedisJSON boolean fields, but telling RediSearch that it's a numeric field.

jrwagz commented 1 year ago

I found a workaround for this issue. Since the bool class is a subclass of the int class, you can declare the model field as int and then convert between bool/int as necessary.

Here is the code that I tested on redis-om 0.1.3:

"""This file demonstrates that redis_om_python cannot index/search boolean fields"""
from redis_om import Field, JsonModel, Migrator

class MyObjectBroken(JsonModel):
    my_string: str = Field(index=True)
    my_int: int = Field(index=True)
    my_bool: int = Field(index=True) # declare as int, but pass in boolean values works

class MyObjectWorks(JsonModel):
    my_string: str = Field(index=True)
    my_int: int = Field(index=True)

Migrator().run()

# Add some objects that are broken
object1 = MyObjectBroken(my_string="", my_int=1, my_bool=False)
object1.save()
object2 = MyObjectBroken(my_string="", my_int=1, my_bool=True)
object2.save()

# Add some objects that work
object3 = MyObjectWorks(my_string="", my_int=1)
object3.save()
object4 = MyObjectWorks(my_string="", my_int=1)
object4.save()

objects = MyObjectBroken.find().all()
print(f"Found {len(objects)} MyObjectBroken expected 2")

objects = MyObjectBroken.find(MyObjectBroken.my_bool == int(False)).all()
print(f"Found {len(objects)} MyObjectBroken with bool as False expected 1")

object_get = MyObjectBroken.get(object1.pk)
print(f"Found {object_get=} MyObjectBroken")

objects = MyObjectWorks.find().all()
print(f"Found {len(objects)} MyObjectWorks expected 2")

object_get = MyObjectWorks.get(object3.pk)
print(f"Found {object_get=} MyObjectWorks")

Here is the output from it:

$ python boolean_field_search_suppport.py
Found 2 MyObjectBroken expected 2
Found 1 MyObjectBroken with bool as False expected 1
Found object_get=MyObjectBroken(pk='01H5KBCJMMCY9Z9SY74C87B20N', my_string='', my_int=1, my_bool=0) MyObjectBroken
Found 2 MyObjectWorks expected 2
Found object_get=MyObjectWorks(pk='01H5KBCJMP4RM26QSSVB77YKTQ', my_string='', my_int=1) MyObjectWorks