Closed jpbruinsslot closed 2 months ago
Further investigation revealed the following, pydantic can serialize enums to either their value property (in this case strings), or their raw enums.
Initially judging by the received exceptions Expected `enum` but got `str` - serialized value may not be as expected
and (after setting the argument item
of the fastapi handler patch_task
to a Task
model instead of a dict)
Expected `enum` but got `TaskStatus` - serialized value may not be as expected
. I suspected that fastapi, and how it deserializes its objects, was the culprit. This suspicion was enforced by the lack of context of the exception despite the multiple try/catch and logging statements.
A suggestion was made to remove the str
subclassing of the TaskStatus
enum, by @ammar92. However this led to the following problem with sqlalchemy. (usage of str
subclassing is suggested by the pydantic docs)
StatementError("(builtins.LookupError) 'TaskStatus.COMPLETED' is not among the defined enum values. Enum name: taskstatus. Possible values: PENDING, QUEUED, DISPATCHED, ..., CANCELLED")
In particular when updating a row with sqlalchemy.
@retry()
def update_task(self, task: models.Task) -> None:
with self.dbconn.session.begin() as session:
(
session.query(models.TaskDB)
.filter(models.TaskDB.id == task.id)
.update(task.model_dump())
In an UPDATE statement SQLAlchemy expects a dictionary, but with the value of the enum ({"status": "completed"}
) not the raw value ({"status": 'status': <TaskStatus.COMPLETED: 'completed'>}
).
We can achieve that setting the following configuration for the pydantic model:
class Task(BaseModel):
# Whether to populate models with the value property of enums, rather than
# the raw enum. In this case the value property of the enum is a string.
model_config = ConfigDict(from_attributes=True, use_enum_values=True)
id: uuid.UUID | None = None
scheduler_id: str | None = None
type: str | None = None
p_item: PrioritizedItem | None = None
status: TaskStatus | None = None
This will correctly deserialize the Task
and the enum when calling the model_dump()
method. However this will conflict with the following SQLAlchemy INSERT statement
@retry()
def create_task(self, task: models.Task) -> models.Task | None:
with self.dbconn.session.begin() as session:
task_orm = models.TaskDB(**task.model_dump())
session.add(task_orm)
created_task = models.Task.model_validate(task_orm)
return created_task
That expects the enum to be a raw enum. And will result in the following error:
sqlalchemy.exc.DataError: (psycopg2.errors.InvalidTextRepresentation) invalid input value for enum taskstatus: "queued"
This then forces us to implement a different model_dump()
, specifically for the sqlalchemy update_task()
query that makes sure that the enum values are used. However, this feels somewhat counterintuitive, so I'm also looking into implementing the suggestion made here by @dekkers, in addition to updating fastapi, pydantic.
What eventually solved this problem was to make sure to use model_config = ConfigDict(from_attributes=True, use_enum_values=True)
on both models.Task
and serializers.Task
were set. This populates the models with the value property of enums, rather than the raw enum.
However sqlalchemy didn't like this change since, insert
expects raw enums, and update
expects values of enums. Updating the pydantic model_dump()
to model_dump(mode="json")
specifically for the update()
statement for sqlalchemy solved that.
The following exception is observed on main:
It seems to be happening around the
PATCH
of the status of a task. And I suspect it to be thrown around here:https://github.com/minvws/nl-kat-coordination/blob/5af317b3bb6c7c878a433f1c204f8342ec5f1fa7/mula/scheduler/server/server.py#L440-L473
I reckoned the status could indeed be a string, since a patch payload could be
{"status": "completed"}
. So I implemented the following, to explicitly set the payload to be of the typerequest.Task
in order to correctly deserialize the json payload.https://github.com/minvws/nl-kat-coordination/blob/e93f6f3a7cb2ba77786db23a150104b1286f3f53/mula/scheduler/server/server.py#L449-L490
However on feature branch, the error remains:
This leads me to suspect that the error is thrown at the fastapi level, and a problem of pydantic not being able to deserialize it correctly. Important to note is that the change to the
request.Task
model as the argument to thepatch_task()
function resulted in the following change in the logging of the exception:From:
To: