redis / redis-om-python

Object mapping, and more, for Redis and Python
MIT License

redisearch on indexed JSON object yields inconsistent results #425

Open llastowski opened 1 year ago

llastowski commented 1 year ago

For a dataset of approximately 15,000 keys, used as a network data cache (string), the .find() method returns an empty list:

NetworkCache.find(NetworkCache.network=="1.1.1.0/24").all()
Out[29]: [] 

Yet the record exists: it can be retrieved with the .get() method, and it also appears when all records are returned:

NetworkCache(pk='01GK9KWM3G2M2V0C47FR1VBEZJ', network='1.1.1.0/24', uid='ipv4_network/Mjo6Mo6ODU2OQ2NDQyOjoxLjEuMS4wLzI0:1.1.1.0_24', ref='85694124')

mpmX commented 1 year ago

Can you post the Model definition? Maybe it's related to #299

sav-norem commented 1 year ago

Can you find the record by the other fields? Just to determine whether this is only a weird thing with the '/'. Aside from that, yes, please share the whole model so we can see if it's potentially related to other issues.

llastowski commented 1 year ago

The model has a few indexed fields:

class NetworkCache(JsonModel):
    """Network caching class."""

    network: str = RedisField(index=True)

    uid: str = RedisField(index=True)
    ref: str = RedisField(index=True)
    ddi_platform: PlatformR = RedisField(index=True)
    parent_block_ref: str
    network_view: NetworkViewR = RedisField(index=True)

I am able to retrieve the records by primary key. I also tried searching by another attribute, which worked properly:

NetworkCache.find(NetworkCache.ref=='802420516').all()
Out[6]: [NetworkCache(pk='01GK9S1BSSQJ1TCWNQ3VKQ7VQG', network='10.77.120.128/26', uid='ipv4_network/Mjo6Mzo6ODAyNDIwNTE2OjoxMC43Ny4xMjAuMTI4LzI2:10.77.120.128_26/abc', ref='802420516')(...)]

Running a subsequent query on the 'network' string yields an empty result:

NetworkCache.find(NetworkCache.network=='10.77.120.128/26').all()
Out[7]: []

When I search another object type by its name (in the same database), the search by indexed string works fine:

NetworkViewR.find(NetworkViewR.name=='external').all()
Out[10]: [NetworkViewR(pk='01GGWDJ4ET67P9HC0H1ZD2HX68', id=2, name='external', uid='uid_string_123', ref='123345', 

tstoco commented 1 year ago

Hi guys,

I am having exactly the same issue.

I have run the same tests that @llastowski performed.

# Python Libraries
from datetime import datetime
from typing import List, Union, Optional

# Third-Party Modules
from redis_om import (
    RedisModel,
    Field,
    Migrator,
)

# Project Modules

from Datamodels.redisom_base_model import BaseModelJson

class ResultItem(BaseModelJson):
    """Describe an item returned by the search scraper.

    This data model is an extension of the JsonModel from the redis_om package.
    It uses pydantic data models and the object supports all pydantic features.
    """

    item_id: Optional[int] = Field(index=True)
    title: str
    price: int = Field(index=True)
    description: Optional[str]
    shipping: Optional[float]
    listing_date: datetime = Field(index=True)
    accepts_offer: Optional[bool]
    search_position: Optional[int] = Field(index=True)
    bid_count: Optional[int]
    reviews_count: Optional[int]
    image_url: str
    url: str = Field(index=True)

$ pip list
Package            Version
------------------ -----------
aioredis           2.0.1
async-timeout      4.0.2
beautifulsoup4     4.11.1
certifi            2022.12.7
charset-normalizer 2.1.1
click              8.1.3
hiredis            2.0.0
idna               3.4
more-itertools     8.14.0
pip                22.3.1
pptree             3.1
pydantic           1.10.2
python-ulid        1.1.0
redis              4.4.0
redis-om           0.1.1
requests           2.28.1
schedule           1.1.0
setuptools         65.5.0
slack-sdk          3.19.5
soupsieve          2.3.2.post1
types-redis        4.3.21.6
typing_extensions  4.4.0
urllib3            1.26.13
wheel              0.38.4

@llastowski have you found a fix for the issue?

Many Thanks.

mpmX commented 1 year ago

What happens when you set index to False for the listing_date and maybe search_position?

llastowski commented 1 year ago

@mpmX It seems I missed the release notes mentioning that the forward-slash search issue had been addressed, and I was still on the previous release. I have updated to version 0.1.1 and it works properly now.
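
To illustrate the forward-slash issue mentioned above: assuming the empty results came from an unescaped '/' in the generated search query, a value like 10.77.120.128/26 would need its punctuation backslash-escaped before being used in a tag query. The following is a minimal sketch of such escaping, not the library's actual implementation; the helper name `escape_tag_value` and the exact character set are assumptions for illustration.

```python
import re

# Assumption: an illustrative set of characters to escape, loosely based on the
# punctuation RediSearch documents as token separators, plus "/" and space.
_SPECIAL_CHARS = re.compile(r"[,.<>{}\[\]\\\"':;!@#$%^&*()\-+=~/ ]")


def escape_tag_value(value: str) -> str:
    """Backslash-escape every special character in a tag value."""
    return _SPECIAL_CHARS.sub(lambda m: "\\" + m.group(0), value)


# The network value from the failing query above.
escaped = escape_tag_value("10.77.120.128/26")
print(escaped)
```

A purely alphanumeric value such as the `ref` string 802420516 passes through unchanged, which would be consistent with the `ref` search working while the `network` search did not.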

tstoco commented 1 year ago

> What happens when you set index to False for the listing_date and maybe search_position?

It seems that empty results are less frequent. However, the query still sometimes returns 0 items, even though the items are there when I check Redis directly.

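# Python Libraries
from datetime import datetime
from typing import Optional

# Third-Party Modules
from redis_om import EmbeddedJsonModel, Field, JsonModel, Migrator, get_redis_connection

# Assumption: the connection setup was omitted from the post. With no arguments,
# get_redis_connection() uses the REDIS_OM_URL environment variable if it is set.
redis = get_redis_connection()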

class BaseModelJson(JsonModel):
    """Redis OM base configuration class. Inherited from the Redis OM JsonModel class.

    Class that defines the Redis connection database URL and an empty global_key_prefix.
    """

    class Meta:
        """Redis OM Meta options."""

        database = redis
        global_key_prefix = ""

class ResultItem(BaseModelJson):
    """Describe an item returned by the search scraper.

    This data model is an extension of the JsonModel from the redis_om package.
    It uses pydantic data models and the object supports all pydantic features.
    """

    item_id: Optional[str] = Field(index=True)
    title: str
    price: int 
    description: Optional[str]
    shipping: Optional[float]
    listing_date: datetime 
    accepts_offer: Optional[bool]
    search_position: Optional[int] 
    bid_count: Optional[int]
    reviews_count: Optional[int]
    image_url: str
    url: str

Migrator().run()
print("Items in Redis: ", ResultItem.find().count())

I will try to debug further when I have some spare time.

XChikuX commented 1 year ago

Is this code running against a single instance of Redis or a cluster?

tstoco commented 1 year ago

> Is this code running against a single instance of Redis or a cluster?

It is a single instance running as a docker container.