Open hstadler opened 4 years ago
@hstadler, I'm not able to reproduce the race-condition. Could you please provide a more complete example?
Nevermind. @hstadler, race-condition in your case if easily fixable by changing nested = fields.Nested(TestNestSchema())
to nested = fields.Nested(TestNestSchema)
.
Indeed, using a class rather than an instance should be safe. But it has the limitation that it doesn't allow the user to parametrize the nested schema with modifiers by instantiating it in the schema definition.
When using an instance, it seems the context is not replaced but updated.
def test_context_thread_safe(self):
class A(Schema):
a = fields.Integer()
class B(Schema):
n = fields.Nested(A()) # <-- instance
sb = B()
sb.context = {"test_1": 1}
sb.dump({"n": {"a": 1}})
sb2 = B()
sb2.context = {"test_2": 1}
sb2.dump({"n": {"a": 1}})
In the latter case, the context is {"test_1": 1, "test_2": 1}
.
I don't see the reason for the update, here. I'd change this to
if isinstance(nested, SchemaABC):
self._schema = copy.copy(nested)
self._schema.context = context
I think this qualifies as a bug and could be fixed by the change above. IIUC, it only affects people using nested schema instances and contexts with different keys, such that the update differs from the assignment. (E.g. passing, {"user_id": xxx}
should be safe because the next context overrides the old one.)
@sloria @deckar01 WDYT? Bug? CVE?
I'm proposing a change for marshmallow 4 that would solve this in a better way : #1826.
@lafrech Thanks for investigating. I think your proposed fix is correct. Do you have time to run with the fix/release?
I think the update in the original code is meant to allow a parent to update the nested schema context (as in #1791).
The fix I propose breaks this. I don't see any safe way to allow updating while being thread-safe but this might be a concern.
I'm developing a web application with Flask, having many marhsmallow schema classes with nested schemes. I also make use of marshmallows context. Sometimes (e.g. first run of the application after a reboot) I noticed unexpected serialization results. It turned out that the context in one of the nested schemas differed from it's parent.
A simplified example, which shows the race condition for concurrent requests with a nested scheme:
When troubleshooting, I also took a brief look at the marhsmallow code and got the feeling that it wasn't just the context that might cause such problems. Is there a way to use marhsmallow thread-safe?