opensafely-core / job-server

A server for mediating jobs that can be run in an OpenSAFELY secure environment. q.v. job-runner
https://jobs.opensafely.org

Solve the problem of upgrading to Pydantic v2 #4557

Open StevenMaude opened 4 weeks ago

StevenMaude commented 4 weeks ago

In #3427, it was intended to move to Pydantic v2, but that was dropped.

The Pydantic version policy states:

Active development of V1 has already stopped, however critical bug fixes and security vulnerabilities will be fixed in V1 for one year after the release of V2 (June 30, 2024).

We're past that date now, so Pydantic v1 is presumably unsupported.

From #3427, this may impact several other projects we maintain, other than job-server. It's non-trivial to fix:

we can't currently use pydantic v2 with our opensafely-cli because we need pure-python only dependencies.


Also, it's likely that Pydantic v1 will not be fixed to support Python 3.13 and up.

lucyb commented 2 weeks ago

I thought this might be okay now as we no longer use the opensafely cli within Job Server. However, we do use the Pipeline library, which is used by the cli, so we still have this problem. You're right though, we should find some way of resolving the problem.

bloodearnest commented 2 weeks ago

Yeah, the v2 upgrade became a bit of a mess. Pipeline is a very central library. It is a direct dependency of job-server, job-runner, opensafely-cli, airlock, and probably others.

FTR, the reason we need a pure-Python dependency is that we currently vendor all deps in opensafely-cli, so it can be easily pip installed as a single package with no dependencies. With pydantic v1, this was fine, as there was a pure-Python version. With pydantic v2, there's a required arch-dependent dynamic library written in Rust. We could solve this by either a) finding a way to vendor arch-dependent things in opensafely-cli or b) allowing opensafely-cli to have a dependency on pydantic, and not vendor it.
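
As a rough illustration of the "pure-Python only" constraint, here is a minimal sketch of how one might check whether an installed distribution ships compiled extension files. The helper names `has_compiled_files` and `is_pure_python` are made up for this example, and the check is only a best-effort heuristic based on file suffixes; it is not how opensafely-cli's vendoring actually works.

```python
from importlib import metadata
from importlib.machinery import EXTENSION_SUFFIXES


def has_compiled_files(filenames):
    """Return True if any filename looks like a compiled extension module."""
    # EXTENSION_SUFFIXES holds this interpreter's own suffixes; the generic
    # ones are added so files built for other platforms are also caught.
    suffixes = tuple(set(EXTENSION_SUFFIXES) | {".so", ".pyd", ".dylib", ".dll"})
    return any(str(name).endswith(suffixes) for name in filenames)


def is_pure_python(dist_name):
    """Best-effort check of an installed distribution.

    Returns None when the distribution is not installed or when it has no
    file listing in its metadata, since nothing can be concluded then.
    """
    try:
        files = metadata.files(dist_name)
    except metadata.PackageNotFoundError:
        return None
    if files is None:
        return None
    return not has_compiled_files(files)
```

Calling `is_pure_python("some-dist")` on each candidate dependency would flag anything that cannot simply be copied into a vendored, architecture-independent package.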

But also, pipeline having such an awkward/heavy dependency in pydantic is not ideal, given it is so core. So much so that we may want to consider reworking pipeline to use something else for validation. Pydantic seemed like a good option, and I'm still not sure what else would do. Maybe we need to roll our own pure-Python thing?
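
To give a feel for what "rolling our own" could look like, here is a minimal sketch using only the standard library. The `Action` class, its `run` and `needs` fields, and the `ValidationError` name are hypothetical, chosen purely for illustration; they are not taken from pipeline's actual models.

```python
from dataclasses import dataclass, field


class ValidationError(ValueError):
    """Raised when input data does not match the expected shape."""


@dataclass(frozen=True)
class Action:
    # Hypothetical fields, for illustration only.
    run: str
    needs: list = field(default_factory=list)

    @classmethod
    def from_dict(cls, data):
        """Validate a plain dict and build an Action from it."""
        if not isinstance(data, dict):
            raise ValidationError("action must be a mapping")
        unknown = set(data) - {"run", "needs"}
        if unknown:
            raise ValidationError(f"unexpected keys: {sorted(map(str, unknown))}")
        run = data.get("run")
        if not isinstance(run, str) or not run.strip():
            raise ValidationError("'run' must be a non-empty string")
        needs = data.get("needs", [])
        if not isinstance(needs, list) or not all(isinstance(n, str) for n in needs):
            raise ValidationError("'needs' must be a list of strings")
        # Copy the list so the caller's data is not shared with the model.
        return cls(run=run, needs=list(needs))
```

Something like `Action.from_dict({"run": "python:latest analysis.py", "needs": ["extract"]})` would return a validated object, while a missing or empty `run` raises `ValidationError`. The trade-off is that every check is written by hand, which is exactly what pydantic was saving us from.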