Filsommer opened this issue 1 year ago
Thanks for the issue @Filsommer
This can be achieved if we have the user define "models" (in ORM-speak) themselves and then pass them on to us.
Prisma (which is an excellent library, by the way) generates the Pydantic models for the user from its schema, but we have no such schema.
I had tried an implementation of this, but ultimately I did not merge it, as it resulted in a lot of code duplication: almost every request builder class had to be copied, one generic version and one non-generic version. This was because I had to set the type of APIResponse.data to list[dict[str, Any]]
in the cases where the user did NOT provide a well-typed model to parse into, and there's no neat way yet in Python to set a default for a type parameter like that. (See PEP 696, which would fix this issue, but it's been a draft for a while now 😦)
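For illustration, typing_extensions already backports PEP 696's default= argument to TypeVar, which is roughly the missing piece described above. This is just a sketch of the idea, not the actual postgrest-py code:

```python
from typing import Any, Generic

from typing_extensions import TypeVar  # backports PEP 696's `default=`

# DataT falls back to list[dict[str, Any]] when no model type is supplied.
DataT = TypeVar("DataT", default=list[dict[str, Any]])

class APIResponse(Generic[DataT]):
    def __init__(self, data: DataT) -> None:
        self.data = data

# A PEP 696-aware checker treats bare `APIResponse` as
# APIResponse[list[dict[str, Any]]], and APIResponse[list[Item]]
# when a typed model is supplied -- no duplicated builder classes needed.
response = APIResponse([{"id": 1}])
```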
I hope that explains this a bit. I think I might just rewrite the old implementation anyway (it's been a year since the PR I linked), but I'm open to discussion if there are better ideas!
Edit: a more complete implementation of the PR I linked above - anand2312/postgrest-py#1
Thanks for the insights @anand2312
Could you elaborate on what you meant by "have the user define 'models' (in ORM-speak) themselves and then pass them on to us"?
How could I technically do that? You mean something like this?
data: List[Item] = supabase.table("item").select().execute().data
My linter (Pylance) complains:
Expression of type "List[Dict[str, Any]]" cannot be assigned to declared type "List[Item]"
  "List[Dict[str, Any]]" is incompatible with "List[Item]"
    TypeVar "_T@list" is invariant
      "Dict[str, Any]" is incompatible with "Item"
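For context on the TypeVar "_T@list" is invariant part of that error: list's element type is invariant, so the checker rejects the assignment even though every dict could in principle be parsed into an Item. A tiny self-contained illustration (Item here is a stand-in class, not the real model):

```python
from typing import Any, Dict, List

class Item:
    """Stand-in for the real model; fields omitted."""

rows: List[Dict[str, Any]] = [{"id": 1}]

# Rejected by a type checker (though fine at runtime, since annotations
# aren't enforced): Dict[str, Any] is not a subtype of Item, and List is
# invariant, so List[Dict[str, Any]] is not a subtype of List[Item].
items: List[Item] = rows  # type: ignore[assignment]
```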
@Filsommer The library doesn't know until runtime what the type of the object returned by the query is, so we are forced to have the user describe the type (by making a class with all the fields that the table has) and then pass that class to the query, indicating that that model should be used for the return type. This would be similar to how "models" work in popular Python ORMs like SQLAlchemy. Right now the proposed functionality hasn't been actually merged so sadly there's no easy way for you (the user) to get this working. The only thing I can think of is writing a helper function to parse the query result into pydantic models:
from typing import Any, Type, TypeVar

from pydantic import BaseModel

_ModelT = TypeVar("_ModelT", bound=BaseModel)

class Item(BaseModel):
    ...  # fields are defined here

def parse_query(cls: Type[_ModelT], data: list[dict[str, Any]]) -> list[_ModelT]:
    return [cls(**row) for row in data]

items = parse_query(Item, supabase.table("item").select().execute().data)
reveal_type(items)  # Type of items is list[Item]
Can you check out the PRs I linked and tell me which of the APIs shown in the examples there looks better from a user's perspective? Thanks 😄
@anand2312, I read your PR and really like the direction you're taking.
Some remarks:
r = client.from_model(Country).select("*,cities(*)").execute()
looks like the more elegant solution.
I feel like r = client.from_("countries").select("*,cities(*)").execute(model=Country)
should be avoided, as it increases the number of points of failure. E.g. renaming the table from "countries" to "country" would mean changing both the from_("country")
argument and the __table_name__
class field.
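A minimal sketch of why from_model keeps the table name in one place (names here are hypothetical; the real PR differs in its details):

```python
from typing import ClassVar, Generic, Type, TypeVar

class TableModel:
    """Base class carrying the table name; sketch of the proposed API."""
    __table_name__: ClassVar[str]

ModelT = TypeVar("ModelT", bound=TableModel)

class RequestBuilder(Generic[ModelT]):
    def __init__(self, model: Type[ModelT]) -> None:
        self.model = model
        # Single source of truth: the table name lives only on the model.
        self.table_name = model.__table_name__

class Country(TableModel):
    __table_name__ = "countries"

builder = RequestBuilder(Country)
# Renaming the table now touches exactly one line: __table_name__.
```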
I cloned your PR and experimented with some cool ideas:
Firstly, I was able to remove most of the duplication by adding the generics to the original request builders when using from_model
, while maintaining backwards compatibility with the from_
and table
methods. The result is a list of dicts instead of class instances, but that's fine for now IMO.
Secondly, as a dev I would love to keep my Python models in sync with my DB tables. I wrote a small (still very early-stage!) script to parse the schemas and generate models automatically, and it worked surprisingly well.
import json
from pathlib import Path
from typing import Union

from httpx import get
from jinja2 import Environment, FileSystemLoader

def map_type(
    type: str,
    format: Union[None, str],
    description: Union[None, str],
    relations: dict,
    class_name: str,
) -> str:
    # TODO improve this parsing algo
    if description and "Foreign Key" in description:
        parent_type = description.split("table='")[1].split("'")[0]
        relations[parent_type] = {
            "field_name": class_name,
            "field_type": f"List[{class_name.capitalize()}]",
        }
    if type == "integer":
        output_type = "int"
    elif type == "string" and format and "timestamp" in format:
        output_type = "datetime"
    elif type == "string":
        output_type = "str"
    elif type == "number":
        output_type = "float"
    elif type == "boolean":
        output_type = "bool"
    else:
        output_type = "None"
    return output_type

r = get(f"https://{YOUR_URL}.supabase.co/rest/v1/?apikey={YOUR_KEY}")
definitions = json.loads(r.text)["definitions"]

models = {}
relations = {}
for key, value in definitions.items():
    models[key] = []
    for property_key, property_value in value.get("properties", {}).items():
        models[key].append(
            {
                "field_name": property_key,
                "field_type": map_type(
                    property_value.get("type"),
                    property_value.get("format"),
                    property_value.get("description"),
                    relations,
                    key,
                ),
            }
        )

environment = Environment(
    loader=FileSystemLoader(Path(__file__).parent / "templates"),
    trim_blocks=True,
    lstrip_blocks=True,
)
template = environment.get_template("client.py.jinja")
output = template.render(models=models, relations=relations)
with open(Path(__file__).parent / "models.py", "w") as f:
    f.write(output)
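The client.py.jinja template itself isn't shown above; as a rough idea of what the rendering step produces, here's a stdlib-only stand-in over hypothetical field data, shaped like the models dict the script builds (the real template would also handle the relations dict):

```python
# Stdlib-only stand-in for the jinja template: emit one pydantic model
# per table. The field data below is hypothetical example input.
models = {
    "cities": [
        {"field_name": "id", "field_type": "Optional[int]"},
        {"field_name": "city_name", "field_type": "Optional[str]"},
    ],
}

lines = [
    "from typing import ClassVar, Optional",
    "from pydantic import BaseModel",
    "",
]
for table, fields in models.items():
    lines.append(f"class {table.capitalize()}(BaseModel):")
    for field in fields:
        lines.append(f"    {field['field_name']}: {field['field_type']}")
    lines.append(f'    __table_name__: ClassVar[str] = "{table}"')
    lines.append("")

output = "\n".join(lines)
```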
This would output models.py
, which can then be consumed by the from_model
method:
from __future__ import annotations

from datetime import datetime
from typing import ClassVar, List, Optional

from pydantic import BaseModel

class Cities(BaseModel):
    """Represents a cities record"""

    id: Optional[int]
    created_at: Optional[datetime]
    country_id: Optional[int]
    city_name: Optional[str]

    __table_name__: ClassVar[str] = "cities"

class Countries(BaseModel):
    """Represents a countries record"""

    id: Optional[int]
    created_at: Optional[datetime]
    country_name: Optional[str]
    cities: Optional[List[Cities]]

    __table_name__: ClassVar[str] = "countries"
Another approach I tried was the Prisma way, e.g.:
response = client.countries.select("*,cities(*)").execute().data # Type is List[Countries]
, but I still need to tweak some things to make it work. It feels more hacky for now.
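One way this Prisma-style attribute access could work under the hood is a __getattr__ hook. A hypothetical sketch (names invented, not the PR's code):

```python
class QueryBuilder:
    """Hypothetical builder returned for each table attribute."""

    def __init__(self, table: str) -> None:
        self.table = table
        self.columns = "*"

    def select(self, columns: str = "*") -> "QueryBuilder":
        self.columns = columns
        return self

class Client:
    """Sketch: undefined attributes are treated as table names, Prisma-style."""

    def __getattr__(self, name: str) -> QueryBuilder:
        return QueryBuilder(name)

client = Client()
query = client.countries.select("*,cities(*)")
# The hacky part (not shown) is making a static checker see
# client.countries as a builder typed to Countries -- which is why
# Prisma generates the client code from the schema.
```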
Maybe we could chat somewhere (twitter?) to better discuss this later if you want :)
Thanks for the feedback @Filsommer ! Auto-generating the models had been on my mind too, very interesting to see an actual solution 😄 Although I doubt I'd put that in the main library (probably an extension library that works along with supabase-py?)
Are you on Discord? We can chat in the Supabase Discord server - https://discord.gg/kk7UgwWYen (I'm @anand2312
on Discord)
I'd like to join the discussions too. One thing: I don't think we should go with pydantic
. It's a great library, but it's primarily a data validation library. From the Swagger schema that PostgREST exposes we already know the types - there's no need to validate them at runtime, since the models are generated from the schema itself. I suggest using something like attrs
. We could take some inspiration from openapi-python-client
, which generates a Python client from an OpenAPI 3.x spec.
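To illustrate the point that plain containers suffice when the schema is trusted, here's the same idea with stdlib dataclasses (attrs offers a similar, richer API):

```python
from dataclasses import dataclass
from typing import Optional

# No runtime validation: the generator trusts the types taken from the
# PostgREST/Swagger schema, so a plain container is enough.
@dataclass
class Country:
    id: Optional[int] = None
    country_name: Optional[str] = None

row = {"id": 1, "country_name": "France"}
country = Country(**row)
```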
Feel free to ping me on Discord.
Another option is to generate the models from the SQL schema by connecting directly to the database, similar to sqlacodegen
.
This issue is stale because it has been open for 365 days with no activity.
This would be really helpful
Note: I'm on the latest version: supabase==1.0.3
Is your feature request related to a problem? Please describe.
Related to this issue. However, that issue isn't fixed yet.
Describe the solution you'd like
When executing
supabase.table("item").select().execute().data
, it is crucial for me and other users to get a return type beyond Any. In this query the return type should be Item, just as it is when using Supabase with TypeScript.
Describe alternatives you've considered
I saw that Prisma for Python offers this. E.g.
items: List[Item] = await prisma.item.find_many()
works as expected.