Open MrEarle opened 1 week ago
I would technically not characterize this as a bug as this was simply never implemented. My guess is that DBRef comes from PyMongo right?
https://github.com/mongodb/mongo-python-driver/blob/master/bson/dbref.py
The thing is, Link
objects are essentially Pydantic models. And when you try to serialize attributes that are private in the model, they are simply omitted
Ref: https://docs.pydantic.dev/latest/concepts/models/#private-model-attributes
So to support your use case, some 'hacking' would be necessary, as you've demonstrated.
Well, Links
are not pydantic models, they are classes that can be serialized by pydantic. The serialization very explicitly only encodes the inner self.ref
's id
and collection
, while ignoring any other attribute it has.
Also, when creating a document with a link, you need to either call to_ref
or build the DBRef yourself. Maybe this is not a bug, but I don't think it is desirable behavior to omit the fields that the DBRef contains.
To further drive de discussion, Beanie showcases Links as a way to filter a document based on the linked document field. In my example, it would be:
Todo.find(Todo.user.email == email, fetch_links=True)
But this performs a lookup before filtering, and cannot take advantages of indices, which requires expensive colscan operations. This get harder for more complex real-world examples.
A way to improve on this, is to have extra fields, like in the embedded document pattern, that can be indexed (IndexModel(user.email)
). That way, we can perform the above query with an index and without the need to fetch links. We can then fetch the links after performing the query, which would not involve a colscan just to filter the Todo
collection.
But the current implementation does not support this.
I see. That may be very useful but would probably complicate the design or overall logic. If I'm understanding this correctly, we can have any additional fields under the $ref field, which then somehow need to stay in sync from wherever they are originally defined, is this the expectation? (I'm thinking about when this embedded "Link" data needs to be updated) If so, this seems like much work.
But I do agree that at least this possibility to add extra fields is currently missing, and your proposed code in Additional context
seems like a way to go and probably wouldn't break anything if extra fields are added.
Well, it should be the responsibility of the developer to determine when to update the extra fields in the DBRef, as that is part of the business logic of their use cases. The advantages of embedded document pattern are to improve read performance, at the expense of write performance, which is usually rarer. This is a common (and recommended) pattern for MongoDB.
Maybe the solution is not to add this functionality to Links, but have a new type Link-ish feature that represents an embedded document.
Describe the bug When dumping a model with links, the corresponding DBRef dicts don't keep extra attributes.
To Reproduce
Expected behavior A clear and concise description of what you expected to happen.
I would expect that the serialization of the DBRef contains the extra fields it contains. This helps in having an extended reference pattern.
Additional context
Playing around with beanie's implementation, I got it working by modifying the
to_dict
method ofLink
:Edit: This would probably change how to do serialization with Pydantic. For example, in
build_validation
for Pydantic V2,if isinstance(v, dict) and v.keys() == {"id", "collection"}
is used to detect if the dict is a DBRef. Given thatkeys
can now contain more items, the strict equality would no longer hold.