fractal-analytics-platform / fractal-tasks-core

Main tasks for the Fractal analytics platform
https://fractal-analytics-platform.github.io/fractal-tasks-core/
BSD 3-Clause "New" or "Revised" License
14 stars 6 forks source link

Fully move to Pydantic v2 #793

Closed lorenzocerrone closed 4 months ago

lorenzocerrone commented 4 months ago

Close #790 Close #791 Close #792 Close #796 Close #802

Checklist before merging

lorenzocerrone commented 4 months ago

@tcompa the v2 model conversion is ready. It pass all tests not related to manifest creation. There is still a bit of work to do on my side to update some deprecated APIs like .dict

tcompa commented 4 months ago

This is now the main PR that is meant to be reviewed, polished and eventually merged to fully move from Pydantic V1 to V2 - both in core-library/tasks and in JSON-Schema tools.

Note that releasing these changes won't be strictly useful until we address https://github.com/fractal-analytics-platform/fractal-web/issues/531.

github-actions[bot] commented 4 months ago

Coverage report

Click to see where and how coverage changed

FileStatementsMissingCoverageCoverage
(new stmts)
Lines missing
  fractal_tasks_core
  channels.py
  labels.py
  fractal_tasks_core/ngff
  specs.py
  fractal_tasks_core/roi
  v1_checks.py
  fractal_tasks_core/tables
  v1.py
  fractal_tasks_core/tasks
  _zarr_utils.py
  apply_registration_to_image.py
  calculate_registration_image_based.py
  cellpose_segmentation.py
  cellpose_utils.py
  cellvoyager_to_ome_zarr_compute.py
  cellvoyager_to_ome_zarr_init.py
  cellvoyager_to_ome_zarr_init_multiplex.py
  copy_ome_zarr_hcs_plate.py
  find_registration_consensus.py
  illumination_correction.py
  image_based_registration_hcs_init.py
  import_ome_zarr.py
  init_group_by_well_for_multiplexing.py
  io_models.py
  maximum_intensity_projection.py
  napari_workflows_wrapper.py
Project Total  

This report was generated by python-coverage-comment-action

tcompa commented 4 months ago

Here is the current situation for external packages:

fractal-helper-tasks

The only diff in the new manifest is the switch from pydantic_v1 to pydantic_v2. Other than that, the new manifest has not changed.

APx_fractal_task_collection

Manifest generation fails, since the package uses Pydantic-v1 methods and should be updated (ref https://github.com/Apricot-Therapeutics/APx_fractal_task_collection/issues/5).

scMultiplex

Manifest generation fails with ModuleNotFoundError: No module named 'image_registration'. I need to debug this further.

tcompa commented 4 months ago

Manifest generation fails with ModuleNotFoundError: No module named 'image_registration'. I need to debug this further.

This is unrelated to the current PR, see https://github.com/fractal-analytics-platform/fractal-tasks-core/issues/803.

tcompa commented 4 months ago

Latest update about external packages:

  1. fractal-helper-tasks is fully under control (the diff only includes the args_schema_version field)
  2. A recent branch of APX is fully under control (all details in https://github.com/Apricot-Therapeutics/APx_fractal_task_collection/issues/5)
  3. scmultiplex has the diff reported below, which does not seem to include anything breaking
diff --git a/src/scmultiplex/__FRACTAL_MANIFEST__.json b/src/scmultiplex/__FRACTAL_MANIFEST__.json
index 6fd5f0c..0b34938 100644
--- a/src/scmultiplex/__FRACTAL_MANIFEST__.json
+++ b/src/scmultiplex/__FRACTAL_MANIFEST__.json
@@ -1,5 +1,5 @@
 {
-  "args_schema_version": "pydantic_v1",
+  "args_schema_version": "pydantic_v2",
   "has_args_schemas": true,
   "manifest_version": "2",
   "task_list": [
@@ -35,8 +35,7 @@
         "type": "object"
       },
       "args_schema_parallel": {
-        "additionalProperties": false,
-        "definitions": {
+        "$defs": {
           "InitArgsRegistration": {
             "description": "Registration init args.",
             "properties": {
@@ -52,11 +51,12 @@
             "type": "object"
           }
         },
+        "additionalProperties": false,
         "properties": {
           "init_args": {
-            "$ref": "#/definitions/InitArgsRegistration",
+            "$ref": "#/$defs/InitArgsRegistration",
             "description": "Intialization arguments provided by `_image_based_registration_hcs_init`. They contain the reference_zarr_url that is used for registration. (standard argument for Fractal tasks, managed by Fractal server).",
-            "title": "Init_Args"
+            "title": "Init Args"
           },
           "iou_cutoff": {
             "default": 0.2,
@@ -140,8 +140,7 @@
         "type": "object"
       },
     {
       "args_schema_parallel": {
-        "additionalProperties": false,
-        "definitions": {
+        "$defs": {
           "ChannelInputModel": {
             "description": "A channel which is specified by either `wavelength_id` or `label`.",
             "properties": {
               "label": {
-                "description": "Name of the channel.",
+                "description": "Name of the channel. Can only be specified if wavelength_id is not set.",
                 "title": "Label",
                 "type": "string"
               },
               "wavelength_id": {
-                "description": "Unique ID for the channel wavelength, e.g. `A01_C01`.",
+                "description": "Unique ID for the channel wavelength, e.g. `A01_C01`. Can only be specified if label is not set.",
                 "title": "Wavelength Id",
                 "type": "string"
               }
@@ -1311,6 +1310,7 @@
             "type": "object"
           }
         },
+        "additionalProperties": false,
         "properties": {
           "allow_duplicate_labels": {
             "default": false,
@@ -1326,7 +1326,7 @@
           },
           "input_channels": {
             "additionalProperties": {
-              "$ref": "#/definitions/ChannelInputModel"
+              "$ref": "#/$defs/ChannelInputModel"
             },
             "description": "Dictionary of channels to measure. Keys are the names that will be added as prefixes to the measurements, values are another dictionary containing either wavelength_id or channel_label information to allow Fractal to find the correct channel (but not both). Example: {\"C01\": {\"wavelength_id\": \"A01_C01\"}. To only measure morphology, provide an empty dict",
             "title": "Input Channels",
tcompa commented 4 months ago

The last missing check on the produced manifests is the one for fractal-tasks-core, which is also under control (full details at https://github.com/fractal-analytics-platform/fractal-web/issues/531).

tcompa commented 4 months ago

This PR is now mostly ready, and could be merged any time.

Minor reviews can still take place, before or after merging:

  1. With @lorenzocerrone we plan to test the JSON Schema tools on yet another package
  2. Before using the newly-generated schema, a (minor) fix and a new version are required in fractal-web - https://github.com/fractal-analytics-platform/fractal-web/issues/531

Once we merge this PR, it will be a bit cumbersome to switch back to Pydantic V1 (cc @jluethi)

lorenzocerrone commented 4 months ago

I found a minor issue in handling tuples in pydantic v2 that can be potentially problematic.

If I have the following tuple definition in a pydantic model:

patch: tuple[int, int, int] = (80, 160, 160)

The schema generated is slightly different between v2 and v1

V1:

"patch": {
                "title": "Patch",
                "default": [
                  80,
                  160,
                  160
                ],
                "type": "array",
                "minItems": 3,
                "maxItems": 3,
                "items": [
                  {
                    "type": "integer"
                  },
                  {
                    "type": "integer"
                  },
                  {
                    "type": "integer"
                  }
                ],
                "description": "The patch size."
              },

V2:

"patch": {
                ...
                "prefixItems": [
                  {
                    "type": "integer"
                  },
                  {
                    "type": "integer"
                  },
                  {
                    "type": "integer"
                  }
                ]
               ...
              },

The V2 version is not shown in the fractal web UI.

@tcompa, should this be addressed on fractal web or as a custom schema generation?

tcompa commented 4 months ago

I found a minor issue in handling tuples in pydantic v2 that can be potentially problematic.

Great, thanks for catching it!

should this be addressed on fractal web or as a custom schema generation?

This is one of the new features in JSON Schema Draft 2020-12, which pydantic2-generated JSON Schemas should comply with.

Therefore I guess it's best to adapt fractal-web, rather than sticking with some old JSON Schemas.

tcompa commented 4 months ago

Therefore I guess it's best to adapt fractal-web, rather than sticking with some old JSON Schemas.

Ref https://github.com/fractal-analytics-platform/fractal-web/issues/534