fractal-analytics-platform / fractal-server

Fractal backend
https://fractal-analytics-platform.github.io/fractal-server/
BSD 3-Clause "New" or "Revised" License

Remove `TaskCollectStatusV2` schema? #1593

Closed tcompa closed 1 day ago

tcompa commented 3 days ago

The relevant database model reads:

class CollectionStateV2(SQLModel, table=True):

    id: Optional[int] = Field(default=None, primary_key=True)
    data: dict[str, Any] = Field(sa_column=Column(JSON), default={})
    timestamp: datetime = Field(
        default_factory=get_timestamp,
        sa_column=Column(DateTime(timezone=True)),
    )

where `data` is an unstructured JSON column. The same holds for the API response model `StateRead`:


class _StateBase(BaseModel):
    """
    Base class for `State`.

    Attributes:
        id: Primary key
        data: Content of the state
        timestamp: Time stamp of the state
    """

    data: dict[str, Any]
    timestamp: datetime

class StateRead(_StateBase):
    """
    Class for `State` read from database.

    Attributes:
        id:
    """

    id: Optional[int]

    _timestamp = validator("timestamp", allow_reuse=True)(valutc("timestamp"))

However, we add an extra typing layer on top of `data` via `TaskCollectStatusV2`, which is used here:

$ git grep TaskCollectStatusV2 fractal_server

fractal_server/app/routes/api/v2/task_collection.py:from ....schemas.v2 import TaskCollectStatusV2
fractal_server/app/routes/api/v2/task_collection.py:    collection_status = TaskCollectStatusV2(
fractal_server/app/routes/api/v2/task_collection.py:    data = TaskCollectStatusV2(**state.data)
fractal_server/app/schemas/v2/__init__.py:from .task_collection import TaskCollectStatusV2  # noqa F401
fractal_server/app/schemas/v2/task_collection.py:class TaskCollectStatusV2(BaseModel):
fractal_server/tasks/v2/get_collection_data.py:from fractal_server.app.schemas.v2 import TaskCollectStatusV2
fractal_server/tasks/v2/get_collection_data.py:def get_collection_data(venv_path: Path) -> TaskCollectStatusV2:
fractal_server/tasks/v2/get_collection_data.py:    return TaskCollectStatusV2(**data)
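
The call `TaskCollectStatusV2(**state.data)` in the grep output layers a typed model over the untyped `data` dict at read time. Here is a minimal stdlib-only sketch of that pattern. `CollectStatus` and its fields are hypothetical stand-ins: the real schema is a pydantic `BaseModel` whose attributes are not listed in this issue (only `venv_path` is mentioned), and a dataclass checks only which keys are present, not their types.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class CollectStatus:
    # Hypothetical fields, for illustration only.
    status: str
    venv_path: str
    log: Optional[str] = None


def parse_state_data(data: dict[str, Any]) -> CollectStatus:
    """Unpack an untyped JSON dict into the typed wrapper."""
    return CollectStatus(**data)


# The JSON column accepts any dict, so structure is only checked here, on read.
stored = {"status": "pending", "venv_path": "/tmp/venv"}
parsed = parse_state_data(stored)

# A dict that the JSON column stored without complaint can still fail here.
try:
    parse_state_data({"status": "pending"})  # required key venv_path is missing
    failed = False
except TypeError:
    failed = True
```

With this pattern the structure is enforced by the code that reads the state, not by the database, which is the mismatch this issue is about.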

I suggest we get rid of this model: we either opt for guaranteed structure (thus extracting the relevant attributes from a JSON column into specific columns) or we accept unstructured JSON values.
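
To make the two options concrete, here is a rough sketch using the standard-library `sqlite3` module in place of the real SQLModel setup. Table names and all column names other than `venv_path` are hypothetical, and this is not a proposed migration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# Option A: guaranteed structure, with attributes in dedicated columns.
conn.execute(
    "CREATE TABLE state_structured ("
    "id INTEGER PRIMARY KEY, "
    "status TEXT NOT NULL, "
    "package TEXT NOT NULL, "
    "venv_path TEXT)"  # nullable column
)

# Option B: a single unstructured JSON column (stored as text in SQLite).
conn.execute(
    "CREATE TABLE state_json (id INTEGER PRIMARY KEY, data TEXT NOT NULL)"
)

# Under option A the database itself rejects a row missing a required attribute.
try:
    conn.execute("INSERT INTO state_structured (status) VALUES (?)", ("OK",))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Under option B any JSON object is accepted, whatever its keys.
conn.execute(
    "INSERT INTO state_json (data) VALUES (?)",
    (json.dumps({"status": "OK", "package": "my-tasks"}),),
)
conn.execute(
    "INSERT INTO state_json (data) VALUES (?)",
    (json.dumps({"anything": [1, 2]}),),
)
json_rows = [
    json.loads(row[0])
    for row in conn.execute("SELECT data FROM state_json ORDER BY id")
]
```

Either choice is self-consistent. The current setup sits in between: the storage is unstructured, but a schema is applied when reading.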

(cc @mfranzon)

tcompa commented 3 days ago

Note: the motivation for this issue partly comes from generalizing this object to its SSH version, where e.g. `venv_path` is no longer relevant.