The fact that this library requires pre-defining the queries that will be exposed through the Client class seems incredibly limiting.
Is this a practical observation from a project you've built with the Codegen?
It's true that Ariadne Codegen limits your client to only the queries you have pre-defined for it, but it's a trade-off we made based on prior experience consuming GraphQL APIs from Python services, where more often than not the following were true:
You could call this approach reductio ad RESTum, but the observation that the gains from GraphQL's flexibility are limited in service-to-service communication is nothing new. I did my first multi-service project that used GraphQL before Ariadne even existed (in the first half of 2018), and we quickly ended up with a set of functions running a few pre-defined queries, because that was actually a better developer experience than specifying the fields every time I wanted to pull data from the other API.
This is the sort of thing you see with ORMs. You can pull a subset of columns from a table, but most of the time fetching everything is better. We could probably support generic queries like this (an escape hatch, like ORMs have), but I'm worried it would be a lot of extra work that developers would eventually walk away from as they discover that their business logic is easier to write with a finite, limited number of models.
Is this a practical observation from a project you've built with the Codegen?
We haven't actually built anything with Codegen yet; we are evaluating it as a possible solution, alongside alternatives like https://github.com/graphql-python/gql and https://github.com/profusion/sgqlc
However, our particular scenario is that we have a backend server utilizing GraphQL, and we have many different users accessing this API for a variety of purposes (populating UIs, generating reports, fetching data to sync into other systems). We don't have control over the specific details of the queries they could be executing.
Our clients performed a limited number of GraphQL operations. Regenerating clients as new queries were added was frictionless.
The only use case in which I could envision a frictionless workflow for responding to changing query needs is the scenario where the backend developers and frontend developers are the same people -- that is, the developers using the Client library and the developers building the GraphQL schema are either one team or collaborate extremely closely.
So maybe this works in service-to-service communication, like you describe. But if you simply want to provide a Pythonic client library that wraps the HTTP mechanics of communicating with a GraphQL server in a generic and extensible way, ariadne-codegen seems inadequate. The design changes I was suggesting were aimed at making this library cater to a wider range of scenarios.
In the end, we may end up having to go with a library like https://github.com/graphql-python/gql since this does allow dynamic construction of queries, but we will be sacrificing the ability to have full mypy typechecking, which I would consider to be a great loss.
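For illustration, this is roughly what dynamic query construction looks like with gql's dsl module; the schema below is a made-up stand-in, and the resulting document is untyped, which is exactly the typechecking we'd be giving up:

from graphql import build_schema, print_ast
from gql.dsl import DSLQuery, DSLSchema, dsl_gql

# A made-up stand-in schema; in practice this would come from introspection.
schema = build_schema("""
type Query {
  user(id: ID!): User
}

type User {
  id: ID!
  email: String
}
""")

ds = DSLSchema(schema)

# The selection set is assembled at runtime, so mypy has nothing to check here.
document = dsl_gql(
    DSLQuery(ds.Query.user.args(id="123").select(ds.User.id, ds.User.email))
)
print(print_ast(document))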
@dkbarn In our experience using ariadne-codegen so far, we've incorporated parts of the out-of-the-box AsyncClient into our own GraphqlClient code. Dynamic query composition as featured in python-graphql is also an attractive feature to us, so we're trying to strike a balance between the convenience of code generation and client extensibility by specifying base_client_name and base_client_path in our config TOML for ariadne-codegen to use our client instead of the default AsyncClient, which gives us the ability to incorporate features from python-graphql down the road. What we do really like about the default AsyncClient is that it is compatible with any requests-style API.
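Roughly, the relevant settings in our pyproject.toml look like this (the paths and package name below are illustrative placeholders, not our actual values):

[tool.ariadne-codegen]
schema_path = "schema.graphql"                # placeholder
target_package_name = "graphql_client"        # placeholder
base_client_name = "GraphqlClient"            # our custom client class
base_client_path = "graphql_client_base.py"   # placeholder path to the file defining it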
API proposal:
result = await client.query(
    Query.user(id="123").fields(
        User.id,
        User.email,
        User.group.fields(UserGroup.id),
        User.reviews(category="123"),
    ),
    Query.user(id="fsadsa").alias("wohoo").fields(
        User.id,
        User.email,
        User.group.fields(UserGroup.id),
        User.reviews(category="123"),
    ),
    Query.search(query="test").on(
        "User",
        User.id,
        User.email,
    ).on(
        "Thread",
        Thread.id,
        Thread.title,
    ),
    operation_name="HelloWorld",
)
The Query, User, UserGroup or Thread types would be generated by the codegen. We could put this feature behind a feature flag, so if you want to generate those types and custom query support, you would have to opt in to it first.
Just a caveat tho: those generated types would be used only as a safety net for query building. I don't think we would be able to generate types for client.query return values without having a Mypy plugin, which is a different discussion IMHO.
Yes this looks like what we're after! Regarding your caveat about return types, would the type annotation have to be "Any" then?
Also, I'm not sure I completely understand this bit:
Query.search(query="test").on(
    "User",
    User.id,
    User.email,
).on(
    "Thread",
    Thread.id,
    Thread.title,
)
What is "search" and "on" in this context? What does that translate to in GraphQL?
Regarding your caveat about return types, would the type annotation have to be "Any" then?
I think so, but I am not familiar with what Mypy plugins would make possible.
The above would produce the following GraphQL query:
query HelloWorld {
  user(id: "123") {
    id
    email
    group {
      id
    }
    reviews(category: "123")
  }
  wohoo: user(id: "fsadsa") {
    id
    email
    group {
      id
    }
    reviews(category: "123")
  }
  search(query: "test") {
    ... on User {
      id
      email
    }
    ... on Thread {
      id
      title
    }
  }
}
In your example code above, would the User class used in the query builder be the same User class returned in the response, or would you need to define 2 different User classes to represent these concepts?
Assuming that the User class used in the query builder is represented as a Pydantic class to maintain consistency with the rest of ariadne-codegen, I'm unclear on how you would achieve the ability to define a fields method on any given attribute, or even to reference these class-level attributes?
class User(pydantic.BaseModel):
    id: int
    email: str

Query.user(id="123").fields(
    User.id,
    User.email,
)
# this would throw
# AttributeError: type object 'User' has no attribute 'id'
I doubt those ORM classes would be pydantic models, or that they would return anything other than lists of dicts.
Just a caveat tho: those generated types would be used only as a safety net for query building. I don't think we would be able to generate types for client.query return values without having a Mypy plugin, which is a different discussion IMHO.
Even if we went with classes, those would be separate TypeData models where every field is nullable, eg:
class UserData(ariadne_codegen.orm):
    id: str | None = None
    username: str | None = None
    group: UserGroupData | None = None
...but seeing how we already require Pydantic, I don't think there would be an issue with having those data types as Pydantic model instances, as you originally proposed.
My only worry here is that the code will have to be littered with if result.some_type.some_field checks to keep Mypy happy. 🤔
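To make the distinction concrete, here is a minimal sketch of what such non-pydantic builder classes could look like; all names are illustrative and this is not the actual generated code:

from typing import Any


class GraphQLField:
    """Tiny stand-in for a generated field object."""

    def __init__(self, name: str, **arguments: Any) -> None:
        self.name = name
        self.arguments = arguments
        self.subfields: tuple["GraphQLField", ...] = ()

    def fields(self, *subfields: "GraphQLField") -> "GraphQLField":
        self.subfields = subfields
        return self


class User:
    # Plain class attributes, so User.id works where a pydantic model would
    # raise AttributeError, as in the example above.
    id = GraphQLField("id")
    email = GraphQLField("email")


class Query:
    @staticmethod
    def user(**arguments: Any) -> GraphQLField:
        return GraphQLField("user", **arguments)


selection = Query.user(id="123").fields(User.id, User.email)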
Bumping this, as we use Strawberry to cover our GraphQL needs, but we are trying to move away from OpenAPI Generator and want something that can generate async GraphQL clients instead. We've hit a roadblock here because we have a lot of queries that we couldn't possibly define manually in a queries.graphql file, so any way to extract this from the schema definition sounds like a life saver :D
@nevoodoo I'm working on a solution and will have it ready soon.
@DamianCzajkowski we're eager to see what sort of solution you've come up with! Roughly when do you think there might be a pull request to look at?
@dkbarn For now it looks like this:
async with SaleorClient(saleor_url=saleor_url, api_key=None) as saleor_client:
    query_str = [
        Query.products(first=1, channel="default-channel").fields(
            ProductCountableConnectionFields.edges().fields(
                ProductCountableEdgeFields.node().fields(
                    ProductFields.id,
                    ProductFields.slug,
                ),
                ProductCountableEdgeFields.cursor,
            ),
        ),
        Query.products(first=10, channel="default-channel")
        .alias("my_products")
        .fields(
            ProductCountableConnectionFields.edges().fields(
                ProductCountableEdgeFields.node().fields(
                    ProductFields.id,
                )
            ),
        ),
    ]
    result = await saleor_client.query(
        *query_str,
        operation_name="getProducts",
    )
Right now I'm struggling with response types and typing on them. I'm testing my solution on a really complicated schema from Saleor, and I see performance issues with the Pydantic models: importing takes about 1.8 seconds for 81 models in one file (they all depend on each other). I'm thinking of using dataclasses (0.057 s import), but they don't have field validation. I was thinking of using manually generated field validation (as simple as possible). What do you think, guys? What would be the best solution for you right now to get typed responses? Or should I abandon this idea in the first version of the query builder?
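For anyone who wants to reproduce the import-time measurement on their own generated package, something along these lines is enough (the package name is a placeholder):

import importlib
import time

start = time.perf_counter()
importlib.import_module("graphql_client")  # placeholder: your generated package name
print(f"import took {time.perf_counter() - start:.3f}s")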
Hi, I've published an experimental version of the custom query builder in ariadne-codegen. It is a pre-release from the dev branch: https://github.com/mirumee/ariadne-codegen/releases/tag/0.14.0.dev1 This code needs some refactoring and optimization; I will be cleaning it up in the future, but I need to take a break from it for a while. It would be perfect if you could test it and give me some feedback about this feature so I can improve it. Thanks!
@DamianCzajkowski I'd like to do some testing of this. I installed the dev version, but the client I generated looks the same as the one generated with the stable version of ariadne-codegen. I initially thought all these Query, ProductCountableConnectionFields, etc. classes would be generated from the schema. Is that not the case?
It also seems all of this stuff is in the tests package. Could you provide some initial info on how to test it with my own schema? Do I need to run the codegen differently? Or is it just functionality for now that is not (or perhaps will never be?) auto-generated?
@jukiewiczm Hi, I should leave instructions in the README.md file. I'll do it later, but if you look into the tests (https://github.com/mirumee/ariadne-codegen/blob/dev/tests/main/clients/custom_query_builder/pyproject.toml), in pyproject.toml there is a flag that you need to set: enable_custom_operations = true. This will activate custom operations generation in your project.
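For reference, the flag sits in pyproject.toml next to the usual settings, roughly like this (the path and package name are placeholders):

[tool.ariadne-codegen]
schema_path = "schema.graphql"          # placeholder
target_package_name = "graphql_client"  # placeholder
enable_custom_operations = true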
@DamianCzajkowski I've done some testing and here's my input. First of all, great piece of work! I actually believe this feature will become the most exciting one in the project.
- "include_all_inputs": False / "include_all_enums": False - to be adjusted or made mutually exclusive with "enable_custom_operations": True
- "ariadne_codegen.contrib.no_reimports.NoReimportsPlugin" - as above
- custom_queries.py seems to be missing the enums import; I've temporarily fixed it with from .enums import *
- custom_queries.py is missing the custom scalars import; in my case that required a from uuid import UUID import
- Slow package loading when generating with "include_all_inputs": True / "include_all_enums": True (dozens of models), so I guess the problem is more general rather than related to this particular functionality (though I'm not sure). I guess some work with NoReimportsPlugin / ClientForwardRefsPlugin could help with that.
- I had to adjust GraphQLArgument._convert_value so it works with my schema. It's mostly about nested pydantic models (model_dump won't work in such a scenario; you need a level-by-level decomposition, while model_dump converts everything to a dictionary all at once) and custom scalars. The query looks more or less like this:
  client.query(Query.items(where=ItemsExp(id=ComparisonExp(eq=150))).fields(ItemsFields.id, ItemsFields.name), operation_name="op")
That's the schema I'm given, so I refrained from assessing whether this is a bad schema example; I guess stuff like this will appear and should work though.
Here are the adjustments
def _convert_value(
    self, value: Any
) -> Union[
    StringValueNode, IntValueNode, FloatValueNode, BooleanValueNode, ObjectValueNode
]:
    # Check bool before int, because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return BooleanValueNode(value=value)
    if isinstance(value, str):
        return StringValueNode(value=value)
    if isinstance(value, int):
        return IntValueNode(value=str(value))
    if isinstance(value, float):
        return FloatValueNode(value=str(value))
    if isinstance(value, BaseModel):
        fields = [
            ObjectFieldNode(
                name=NameNode(value=field_info.alias or field_name),
                value=self._convert_value(field_value),
            )
            for field_name, field_info in value.model_fields.items()
            if (field_value := getattr(value, field_name)) is not None
        ]
        return ObjectValueNode(fields=fields)
    if isinstance(value, UUID):
        # Guess we'd need something more sophisticated for all custom scalar types
        return StringValueNode(value=str(value))
    raise TypeError(f"Unsupported argument type: {type(value)}")
Regarding the response types you mentioned in the other comment, do you mean the type for client.query? I was wondering how you approached that. Selected fields will be dynamic, so obviously not all of them will always be available. Pydantic models with all the fields being optional would be a way to go. Kind of a weird one, as you wouldn't be able to tell whether the field is null or was just not selected, but I guess it is an option.
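On the null-vs-not-selected ambiguity: if the response models stay pydantic, one thing that could help is Pydantic v2's model_fields_set, which records which fields were actually present in the parsed payload. A small sketch with illustrative model names:

from pydantic import BaseModel


class UserData(BaseModel):
    id: str | None = None
    email: str | None = None


selected_but_null = UserData.model_validate({"id": "1", "email": None})
not_selected = UserData.model_validate({"id": "1"})

print("email" in selected_but_null.model_fields_set)  # True: field was in the response
print("email" in not_selected.model_fields_set)       # False: field was never selected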
For the time being, I guess something like that could work (simplified example):
# in the client class file
from typing import TypeVar

from pydantic import BaseModel

OUTPUT_TYPE = TypeVar("OUTPUT_TYPE", bound=BaseModel)


# in the client class
async def structured_query(
    self, *fields: "GraphQLField", operation_name: str, output_type: type[OUTPUT_TYPE]
) -> OUTPUT_TYPE:
    return output_type.parse_obj(
        await self.query(*fields, operation_name=operation_name)
    )
and used like this:
from pydantic import BaseModel


# These you define on your own; I can't really imagine a way around it besides the weird solution I mentioned above
class Item(BaseModel):
    id: int
    name: str


class Items(BaseModel):
    items: list[Item]


await client.structured_query(
    Query.items(where=ItemsExp(id=ComparisonExp(eq=150))).fields(
        ItemsFields.id, ItemsFields.name
    ),
    operation_name="op",
    output_type=Items,
)
Hi @jukiewiczm, thanks for your feedback! Could you somehow provide me with the schema file you are working with? It would be helpful in fixing those issues. Currently I've been testing all of this on the Saleor schema.
Unfortunately I can't do that (for obvious reasons - work stuff). Over the weekend I'll try to cut out and anonymize part of it and share it with you. That should cover all these use cases besides the slow loading, which you're experiencing yourself, so I guess it should be enough :)
Alternatively, I'll try to reproduce the stuff that doesn't work using the Saleor schema; looking through it, that might also be possible.
@DamianCzajkowski I forked the repo and used the Saleor schema to reproduce the stuff I mentioned. Please take a look at the manual-operations-examples branch.
Here's the folder that contains it all.
You can review the commits I added to see the progress of fixing stuff for the query in this file that initially did not work.
It doesn't reproduce:
Hope this helps!
Hi guys, sorry there was no activity on this issue. I've now committed some code changes that should fix the missing imports in the generated files. I also changed the way we build the query, so it's integrated with the main ariadne-codegen code in Client.execute(). I still want to take care of the slow package loading on import (this is the Pydantic issue); with a really big schema it will take some time, but I believe that if I take care of the __init__.py file and reduce the number of imports there, the package itself will load quicker.
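One common way to make a heavy __init__.py cheap to import is PEP 562 lazy loading, where models are only imported when first accessed; a rough sketch with placeholder module and class names:

# graphql_client/__init__.py  (placeholder package name)
import importlib
from typing import Any

# Map public names to the submodules that define them (placeholder names).
_LAZY_IMPORTS = {
    "UserData": ".models",
    "ProductData": ".models",
}


def __getattr__(name: str) -> Any:
    # PEP 562: resolve attributes lazily instead of importing every model up front.
    if name in _LAZY_IMPORTS:
        module = importlib.import_module(_LAZY_IMPORTS[name], __name__)
        return getattr(module, name)
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")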
Note: if you want to use custom scalars with ariadne-codegen, there is a feature that handles this, but it requires manual work (example).
Future improvements:
The fact that this library requires pre-defining the queries that will be exposed through the Client class seems incredibly limiting. The power of GraphQL is its flexibility -- by default, the entirety of a data model is retrievable through the edges on the root Query type. But this power and flexibility is lost when using this library, because the Client class only allows fetching the data and the fields that have been baked into a finite list of canned queries. In a sense, the end result is much more like interacting with an old-school REST API than a flexible GraphQL API.
Certain design decisions made in this library are very useful -- the use of pydantic, the auto-generation of Python dataclasses and enums from GraphQL types and enums, the support for full mypy typechecking. But what would make this library far more useful is if it exposed the full GraphQL schema, by allowing the user to fully specify the return fields of a query, rather than restricting them to a selection of pre-defined queries.
For example, given this schema:
The following classes would be auto-generated:
Queries could then be executed via the Client class like this, with support for fully specifying the return fields:
(Note: the exact syntax used to specify the return fields here would not actually work, but the point is there is some way of specifying the return fields)
The key differences are:
- All fields are Optional, to allow for these sparsely-populated objects.