BeanieODM / beanie

Asynchronous Python ODM for MongoDB
http://beanie-odm.dev/
Apache License 2.0
1.99k stars 211 forks source link

[BUG] Array order is not consistent #414

Closed faruqsandi closed 1 year ago

faruqsandi commented 1 year ago

First of all, thank you for this great library! Describe the bug I have a hierarchy of documents such as a course that will have multiple linked topics in an array. I put functionality in my FastAPI backend app, to change the order of topics. I see that in MongoDB Compass that the array is in the right order. But every time I call the endpoint, the Beanie Document object will return the array, not in the right order.

To Reproduce Here is what the document looks like: image

{
  "_id": {
    "$oid": "637011ebe4a0b602dfa188a7"
  },
  "revision_id": null,
  "author_id": 19,
  "name": "Introduction to programming",
  "description": "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse malesuada lacus ex, sit amet blandit leo lobortis eget Suspendisse malesuada lacus ex, sit amet blandit leo lobortis eget.",
  "image_url": "https://i.imgur.com/ifQ7Lxo.jpeg",
  "topics": [
    {
      "$ref": "Topic",
      "$id": {
        "$oid": "63701201e4a0b602dfa188a8"
      }
    },
    {
      "$ref": "Topic",
      "$id": {
        "$oid": "637011fbe3ae5cc480291a65"
      }
    }
  ]
}

Here is the things needed to get the doc:


class Course(Document):
    author_id: int
    name: str
    description: Optional[str]
    image_url: Optional[str]
    topics: List[Link[Topic]] = []

    class Settings:
        use_cache = False
        cache_expiration_time = datetime.timedelta(seconds=10)
        cache_capacity = 5

def topic_helper(topic: dict) -> dict:
    id = topic.pop('_id', None)
    if id:
        topic['id'] = str(id)
    return topic

async def get_topics(course_id: str) -> list:
    course = await Course.get(course_id, fetch_links=True)  # type: ignore
    topics = []
    if course:
        # await course.fetch_link(Course.topics)
        for topic in course.topics:
            await topic.fetch_all_links()  # type: ignore
            topic = topic.dict()  # type: ignore
            # _ = topic.pop('contents', None)
            topics.append(topic_helper(topic))
    return topics

And here is how I check the result:

import time
import requests
url = 'http://127.0.0.1:8000/course/637011ebe4a0b602dfa188a7/topic'

for i in range(20):
    response = requests.get(url)
    print([x['id'] for x in response.json()['data'][0]])
    time.sleep(0.2)

output:

['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['63701201e4a0b602dfa188a8', '637011fbe3ae5cc480291a65']
['63701201e4a0b602dfa188a8', '637011fbe3ae5cc480291a65']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['63701201e4a0b602dfa188a8', '637011fbe3ae5cc480291a65']
['63701201e4a0b602dfa188a8', '637011fbe3ae5cc480291a65']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']
['637011fbe3ae5cc480291a65', '63701201e4a0b602dfa188a8']

Expected behavior It should return an array in the right order.

Additional context

faruqsandi commented 1 year ago

It seems to_list() won't return a list of objects with an order like ids. https://github.com/roman-right/beanie/blob/2aca1994c3b6c762b26101650fb5879a4907c90e/beanie/odm/fields.py#L179

This query in MongoDB compass:

{ '_id': { $in: [ObjectId('63701201e4a0b602dfa188a8'), ObjectId('637011fbe3ae5cc480291a65')] } }

return things in order. But:

  topics1 = await Topic.find(In(Topic.id, topic_id_list)).to_list()
  topics2 = await Topic.find({'_id': {'$in': topic_id_list}}).to_list()

is not. Sorry I am not good enough to debug things.

roman-right commented 1 year ago

Hi, I think this is how links are resolved. I use aggregations inside (bc normal find can not join collections) with lookup. I bet, it doesn't sort. I'll check if sort by id will not have performance issues there. And if will not - I'll make it default. If it will - I'll think about solutions. Probably optional sort parameters or smth like this.

Thank you for the issue. It is very reasonable.

faruqsandi commented 1 year ago

Thank you for your kind response.

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 30 days with no activity.

github-actions[bot] commented 1 year ago

This issue was closed because it has been stalled for 14 days with no activity.

moonrail commented 1 year ago

@roman-right I also have use-cases that depend on the order of links and am unable to use fetch_links/fetch_all_links because of its reordering, which is a shame due to the usefulness of this feature.

Other than OP, the order on my setups is always the same when using fetch_*, but not as defined in DB or in Document List[Link[...]] when not resolving links.

As this issue was closed by stale bot, I'd suggest reopening it.

moonrail commented 1 year ago

@roman-right This seems to fix the order in my basic tests: https://github.com/roman-right/beanie/blob/1.18.0b1/beanie/odm/fields.py#L181-L182

The above lines replaced with:

            ids.append(PydanticObjectId(link.ref.id))  # casting to be sure to have a comparable ObjectId, as type-hinting says "Any" for `DBRef.id`
        docs = await model_class.find(In("_id", ids), with_children=True, fetch_links=fetch_links).to_list()  # type: ignore
        docs.sort(key=lambda d: ids.index(d.id))
        return docs

The overhead should be low, but of course, there is more overhead than not sorting anything.