PrefectHQ / prefect

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
https://prefect.io
Apache License 2.0

/flows/paginate returns optional flow_ids #15609

Open aaazzam opened 4 weeks ago

aaazzam commented 4 weeks ago

Bug summary

I expect a request to /flows/paginate to return a (potentially empty) array of Flows. Flow inherits from IDBaseModel, which uses a default_factory to assign UUIDs on write, so id is not a required field. Indeed, the OpenAPI spec correctly reports that id is optional, even though id should be "required" on read. This makes client autogeneration in other languages tedious: generated clients must treat every Flow they read as possibly having no id, which makes no sense for an object that has already been written.
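To make the client-generation cost concrete, here is a toy sketch of how a code generator might turn a schema's required list into TypeScript-style field declarations. Both typescript_fields and the flow_schema fragment are hypothetical and hand-written for illustration; the fragment is not copied from Prefect's real OpenAPI document and no real generator works exactly this way.

```python
from typing import Any, Dict, List


def typescript_fields(schema: Dict[str, Any]) -> List[str]:
    """Render each property as a TypeScript field, optional unless listed in `required`."""
    required = set(schema.get("required", []))
    type_names = {"string": "string", "integer": "number", "array": "unknown[]"}
    lines = []
    for name, prop in schema.get("properties", {}).items():
        ts_type = type_names.get(prop.get("type"), "unknown")
        # A field missing from `required` gets the optional marker.
        marker = "" if name in required else "?"
        lines.append(f"{name}{marker}: {ts_type};")
    return lines


# Hand-written fragment shaped like the Flow schema described above
# (an assumption, not the actual spec): `id` is absent from `required`.
flow_schema = {
    "properties": {
        "id": {"type": "string", "format": "uuid"},
        "name": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["name"],
}

fields = typescript_fields(flow_schema)
print("\n".join(fields))
```

Because id is not listed as required, the generated type declares it as optional, and every caller has to handle a missing id even though the server always returns one.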

Background:

The signature of this route is:

@router.post("/paginate")
async def paginate_flows(
    limit: int = dependencies.LimitBody(),
    page: int = Body(1, ge=1),
    flows: Optional[schemas.filters.FlowFilter] = None,
    flow_runs: Optional[schemas.filters.FlowRunFilter] = None,
    task_runs: Optional[schemas.filters.TaskRunFilter] = None,
    deployments: Optional[schemas.filters.DeploymentFilter] = None,
    work_pools: Optional[schemas.filters.WorkPoolFilter] = None,
    sort: schemas.sorting.FlowSort = Body(schemas.sorting.FlowSort.NAME_ASC),
    db: PrefectDBInterface = Depends(provide_database_interface),
) -> FlowPaginationResponse:

Here FlowPaginationResponse is a model with a field, results, that is an array of prefect.server.schemas.core.Flow, which inherits from ORMBaseModel and, through it, from IDBaseModel.

class IDBaseModel(PrefectBaseModel):
    """
    A PrefectBaseModel with an auto-generated UUID ID value.

    The ID is reset on copy() and not included in equality comparisons.
    """

    _reset_fields: ClassVar[Set[str]] = {"id"}
    id: UUID = Field(default_factory=uuid4)

class ORMBaseModel(IDBaseModel):
    """
    A PrefectBaseModel with an auto-generated UUID ID value and created /
    updated timestamps, intended for compatibility with our standard ORM models.

    The ID, created, and updated fields are reset on copy() and not included in
    equality comparisons.
    """

    _reset_fields: ClassVar[Set[str]] = {"id", "created", "updated"}

    model_config = ConfigDict(from_attributes=True)

    created: Optional[DateTime] = Field(default=None, repr=False)
    updated: Optional[DateTime] = Field(default=None, repr=False)

class Flow(ORMBaseModel):
    """An ORM representation of flow data."""

    name: Name = Field(
        default=..., description="The name of the flow", examples=["my-flow"]
    )
    tags: List[str] = Field(
        default_factory=list,
        description="A list of flow tags",
        examples=[["tag-1", "tag-2"]],
    )
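The optional id can be reproduced outside Prefect. The following is a minimal sketch, assuming Pydantic v2 (the report lists 2.8.2). IDModel and FlowSketch are hypothetical stand-ins for IDBaseModel and Flow, and name is simplified to a plain str rather than Prefect's Name type.

```python
from typing import List
from uuid import UUID, uuid4

from pydantic import BaseModel, Field


class IDModel(BaseModel):
    # Stand-in for IDBaseModel: the default_factory means callers may omit `id`.
    id: UUID = Field(default_factory=uuid4)


class FlowSketch(IDModel):
    # Stand-in for Flow: `name` has no default, so it is required.
    name: str = Field(default=..., description="The name of the flow")
    tags: List[str] = Field(default_factory=list)


# Generate the JSON schema and inspect which fields it marks as required.
schema = FlowSketch.model_json_schema()
required_fields = schema.get("required", [])
print(required_fields)
```

In this sketch id is missing from the required list because it has a default, which matches the behavior reported for the real Flow schema.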

Since

Version info (prefect version output)

Version:             3.0.3+70.g70db71f84e.dirty
API version:         0.8.4
Python version:      3.12.4
Git commit:          70db71f8
Built:               Mon, Oct 7, 2024 11:23 PM
OS/Arch:             darwin/arm64
Profile:             default
Server type:         server
Pydantic version:    2.8.2


aaazzam commented 4 weeks ago

On reflection this isn't a hard problem, e.g.:


class Flow(ORMBaseModel):
    """An ORM representation of flow data."""

    model_config = ConfigDict(json_schema_extra={"required": ["id", "name"]})
    name: Name = Field(
        default=..., description="The name of the flow", examples=["my-flow"]
    )
    tags: List[str] = Field(
        default_factory=list,
        description="A list of flow tags",
        examples=[["tag-1", "tag-2"]],
    )

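fixes it: the generated schema then lists both id and name as required, while the default_factory still assigns a UUID when no id is supplied.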
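A quick way to check the effect of this override in isolation is sketched below, assuming Pydantic v2. FlowRead is a hypothetical self-contained stand-in for Flow (it declares id directly instead of inheriting it, and uses a plain str for name).

```python
from typing import List
from uuid import UUID, uuid4

from pydantic import BaseModel, ConfigDict, Field


class FlowRead(BaseModel):
    # The extra "required" entry overrides the list Pydantic would generate,
    # which is why "name" has to be repeated alongside "id".
    model_config = ConfigDict(json_schema_extra={"required": ["id", "name"]})

    id: UUID = Field(default_factory=uuid4)
    name: str
    tags: List[str] = Field(default_factory=list)


fixed_schema = FlowRead.model_json_schema()
print(fixed_schema["required"])

# Validation is unchanged: `id` can still be omitted and is filled in by the factory.
flow = FlowRead(name="my-flow")
print(flow.id)
```

Note that json_schema_extra only changes the emitted schema, not validation, so writes that omit id keep working. The trade-off is that the required list is maintained by hand and has to be updated if required fields are added to the model.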