chanzuckerberg / cryoet-data-portal

CryoET Data Portal
MIT License

Python client models should be automatically derived from the GraphQL schema #894

Closed: andy-sweet closed this issue 1 month ago

andy-sweet commented 3 months ago

The model classes in the Python client are manually maintained in _models.py.

The class and field names are almost entirely defined in the GraphQL schema, which is also in this repository.

Maintaining them manually has caused documentation errors in the past and may lead to bigger issues when the schema changes.

The goal here is to automatically derive these model classes from the GraphQL schema.

Some customization of the classes may be needed to handle functionality like Tomogram.download_mrcfile.
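One way such customization could work is to keep the generated class untouched and add hand-written behavior in a subclass. The sketch below is only an illustration of that pattern: `_GeneratedTomogram`, its fields (including `https_mrc_file`), and the `dest_dir` and `fetch` parameters are all assumptions for this example, not the client's actual API.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional
from urllib.parse import urlparse
from urllib.request import urlretrieve


@dataclass
class _GeneratedTomogram:
    """Stand-in for a class a code generator would emit (hypothetical fields)."""

    id: int
    name: str
    https_mrc_file: str


class Tomogram(_GeneratedTomogram):
    """Hand-written subclass adding behavior the schema cannot describe."""

    def download_mrcfile(
        self,
        dest_dir: Optional[str] = None,
        fetch: Callable[[str, str], object] = urlretrieve,
    ) -> Path:
        # Derive the local file name from the last component of the URL path.
        filename = Path(urlparse(self.https_mrc_file).path).name
        dest = Path(dest_dir or ".") / filename
        # `fetch` defaults to urllib's urlretrieve; it is injectable so the
        # method can be exercised without network access.
        fetch(self.https_mrc_file, str(dest))
        return dest
```

With this split, regenerating `_GeneratedTomogram` after a schema change would not overwrite the hand-written method.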

andy-sweet commented 3 months ago

For context, this issue came up during a conversation between me, @justinelarsen, and @jgadling.

Since there are some upcoming expected changes to the schema, it seemed like a good time to fix this.

Some ideas about solutions.

If we decide to use platformics to generate the GraphQL schema in the future, we might consider generating the Python GraphQL client and its data classes at that earlier stage, since related Python data classes (e.g. from SQLAlchemy, Pydantic, or Strawberry) will likely already exist at that point.

andy-sweet commented 3 months ago

TLDR: the best first approach is to use graphql-core (which the current gql client depends on and which is the de facto Python port of GraphQL.js) to parse the schema, then write our own simple code generator to selectively generate the classes and fields we need.

I looked at a few options and made some rough notes.

sgqlc

datamodel-code-generator

ariadne-codegen

gql