surrealdb / surrealdb.py

SurrealDB SDK for Python
https://surrealdb.com
Apache License 2.0
181 stars 53 forks source link

Feature: Using `pydantic` models as parameters. #55

Open Ce11an opened 1 year ago

Ce11an commented 1 year ago

Is your feature request related to a problem?

Using the signin method is difficult on the Surreal websockets class. Different key and value pairs are needed for each sign in type. For example, root sign in requires a user and pass, whereas scope sign in requires a user, pass, database, namespace, scope, and any extra parameters. Using an untyped dictionary does not inform the developer what is required to sign in.

Describe the solution

To be more inline with the Rust client, we can create "credential" models using pydantic. Depending on the type of credentials the user needs, they will know exactly what parameters are needed.

For example:

from typing import Union

from pydantic import BaseModel, Field

class Root(BaseModel):
    """User credentials for the root user."""

    username: str = Field(
        ..., alias="user", title="Username", description="Username for the root user"
    )
    password: str = Field(
        ..., alias="pass", title="Password", description="Password for the root user"
    )

    class Config:
        """Config for the Root class."""

        allow_population_by_field_name = True

class Namespace(BaseModel):
    """User credentials for a namespace."""

    namespace: str = Field(
        alias="NS", title="Namespace", description="Namespace for the user"
    )
    username: str = Field(
        alias="user", title="Username", description="Username for the user"
    )
    password: str = Field(
        alias="pass", title="Password", description="Password for the user"
    )

    class Config:
        """Config for the Namespace class."""

        allow_population_by_field_name = True

class Database(BaseModel):
    """User credentials for a database."""

    namespace: str = Field(
        ..., alias="ns", title="Namespace", description="Namespace for the user"
    )
    database: str = Field(
        ..., alias="db", title="Database", description="Database for the user"
    )
    username: str = Field(
        ..., alias="user", title="Username", description="Username for the user"
    )
    password: str = Field(
        ..., alias="pass", title="Password", description="Password for the user"
    )

    class Config:
        """Config for the Database class."""

        allow_population_by_field_name = True

class Scope(BaseModel):
    """User credentials for a scope."""

    namespace: str = Field(
        ..., alias="ns", title="Namespace", description="Namespace for the user"
    )
    database: str = Field(
        ..., alias="db", title="Database", description="Database for the user"
    )
    scope: str = Field(..., alias="sc", title="Scope", description="Scope for the user")
    params: BaseModel = Field(
        ..., alias="params", title="Parameters", description="Parameters for the scope"
    )

    class Config:
        """Config for the Scope class."""

        arbitrary_types_allowed = True
        allow_population_by_field_name = True

Credentials = Union[Root, Namespace, Database, Scope]

The signin method would on the Surreal class would look something like:

    async def signin(self, credentials: Credentials) -> str:
        # noinspection PyShadowingNames
        """Signs this connection in to a specific authentication scope.

        Args:
            credentials: Variables used in a signin query.

        Returns:
            The token used for authenticating future requests.

        Examples:
            >>> from pydantic import BaseModel

            >>> from surrealdb.auth import Root

            >>> async with Surreal("ws://127.0.0.1:8000/rpc") as db:
            >>>     await db.signin(Root(username="root", password="root")),
        """
        response = await self._send_receive(
            Request(
                id=generate_uuid(),
                method="signin",
                params=(credentials.dict(by_alias=True),),
            ),
        )
        success: ResponseSuccess = _validate_response(
            response, SurrealAuthenticationException
        )
        token: str = success.result
        self.token = token
        return self.token

Using the method:

    async with Surreal("ws://127.0.0.1:8000/rpc") as db:
        await db.signin(Root(username="root", password="root"))

Alternative methods

Potentially could use a TypedDict to replicate this behaviour instead. However, pydantic will allow us to create stricter types overall.

SurrealDB version

surreal 1.0.0-beta.8+20220930.c246533 for macos on x86_64

surrealdb.py version

surrealdb.py 0.30 for macOS on aarch64 using Python 3.9.1

Contact Details

hallcellan@gmail.com

Is there an existing issue for this?

Code of Conduct

AlexFrid commented 1 year ago

Generally seems like a good idea 👍 Let's keep this open for a bit to see if there are any other opinions.

bilinenkisi commented 1 year ago

Using dataclasses instead of pydantic in a project can make it less dependent on third-party packages for data validation and settings management. For example, in my own Python client for SurrealDB called SurrealPy, I am using dataclasses instead of pydantic. This not only simplifies the project but also makes it more robust by reducing the number of external dependencies. You can check out the code on this link: https://github.com/Vulnware/SurrealPy/blob/Dev3.9/surrealpy/ws/models.py

import dataclasses
from typing import Any, Optional, Union

__all__ = ("SurrealRequest","SurrealResponse","LoginParams")

@dataclasses.dataclass(frozen=True)
class LoginParams:
    """
    A class used to represent SurrealDB login parameters.
    ...

    Attributes
    ----------
    username: str
        The username of the user. alias of field is user
    password: str
        The password of the user. alias of field is pass

    Methods
    -------
    to_dict() -> dict[str, Any]
        Convert the SurrealDB login parameters to a dict with alias as keys
    """

    username: str = dataclasses.field(metadata={"alias": "user"})
    password: str = dataclasses.field(metadata={"alias": "pass"})

    def __post_init__(self):
        if not self.username:
            raise ValueError("username is required")
        if not self.password:
            raise ValueError("password is required")

    def to_dict(self) -> dict[str, Any]:
        return {
            dataclasses.fields(self)[i].metadata["alias"]: getattr(
                self, dataclasses.fields(self)[i].name
            )
            for i in range(len(dataclasses.fields(self)))
        }

@dataclasses.dataclass(frozen=True)
class SurrealRequest:
    id: str
    method: str
    params: tuple[Any]

@dataclasses.dataclass(frozen=True)
class SurrealResponse:
    result: Union[dict, list[Any]]
    id: Optional[str] = None
Ce11an commented 1 year ago

Ah, good suggestion. I opted for pydantic as there is less boilerplate to write and there is more standardisation for error handling (not to mention it's now it is written in Rust!). I agree that using dataclasses is a viable option as long as their tested throughly 👍🏻

bilinenkisi commented 1 year ago

Ah, good suggestion. I opted for pydantic as there is less boilerplate to write and there is more standardisation for error handling (not to mention it's now it is written in Rust!). I agree that using dataclasses is a viable option as long as their tested throughly 👍🏻

Yeah, you're right about it. But in my opinion, the official package of surrealDB for Python must be light-weight and use less dependency.

AlexFrid commented 1 year ago

Ah, good suggestion. I opted for pydantic as there is less boilerplate to write and there is more standardisation for error handling (not to mention it's now it is written in Rust!). I agree that using dataclasses is a viable option as long as their tested throughly 👍🏻

Yeah, you're right about it. But in my opinion, the official package of surrealDB for Python must be light-weight and use less dependency.

Thanks for the suggestion @bilinenkisi, we are definitely aiming for it to be light-weight and use as few dependencies as practical. Currently, the only external dependencies we have in the code are pydantic and websockets for the WS client and httpx for the HTTP client.

In your opinion do you think that is too much? 🤔

My thinking so far has been that pydantic offers a better developer experience than dataclasses and like @Ce11an said offers more standardisation for error handling. Would be interested to hear more about what you think about this.

bilinenkisi commented 1 year ago

Ah, good suggestion. I opted for pydantic as there is less boilerplate to write and there is more standardisation for error handling (not to mention it's now it is written in Rust!). I agree that using dataclasses is a viable option as long as their tested throughly 👍🏻

Yeah, you're right about it. But in my opinion, the official package of surrealDB for Python must be light-weight and use less dependency.

Thanks for the suggestion @bilinenkisi, we are definitely aiming for it to be light-weight and use as few dependencies as practical. Currently, the only external dependencies we have in the code are pydantic and websockets for the WS client and httpx for the HTTP client.

In your opinion do you think that is too much? 🤔

My thinking so far has been that pydantic offers a better developer experience than dataclasses and like @Ce11an said offers more standardisation for error handling. Would be interested to hear more about what you think about this.

While pydantic may offer a more user-friendly experience compared to dataclasses, it's worth noting that its core is built with Rust (As @Ce11an mentioned before). As you may already know, relying on external resources in Python can lead to problems down the line, such as memory leaks. Additionally, using Python libraries that require C-binding can be a hassle, especially on Windows, where C source compiling tools like Visual Studio 2017 are often necessary. Lastly, it's important to keep in mind that pydantic may not work on all platforms, such as the arm64 platform, which is becoming more popular as cloud-based server services start to adopt it. In contrast, dataclasses are more likely to work across platforms. While using pydantic may not be too burdensome, I believe writing additional code for testing and checking in dataclasses would be more appropriate. This would provide us with greater control over the package and ensure its reliability.

Ce11an commented 1 year ago

I seem to have made a mistake. The current version of pydantic is still built in Python, the upcoming release V2 will be built in Rust. Saying this, I believe pydantic does not depend on C-bindings for its current release. I have used it on Windows, MacOS, and, Ubuntu with no issues. However, I do agree that tests need to be written for this client. GitHub actions can be set up to build the package on each OS before release. I further agree that objects built with pydantic should be well tested, too.

I think the developer experience is great. It also used if other popular packages like FastAPI and spacy, which shows its stability and usefulness. However, there should be caution when upgrading to V2. I am happy either way - just wanted to correct my past comment 👍🏻

bilinenkisi commented 1 year ago

I seem to have made a mistake. The current version of pydantic is still built in Python, the upcoming release V2 will be built in Rust. Saying this, I believe pydantic does not depend on C-bindings for its current release. I have used it on Windows, MacOS, and, Ubuntu with no issues. However, I do agree that tests need to be written for this client. GitHub actions can be set up to build the package on each OS before release. I further agree that objects built with pydantic should be well tested, too.

I think the developer experience is great. It also used if other popular packages like FastAPI and spacy, which shows its stability and usefulness. However, there should be caution when upgrading to V2. I am happy either way - just wanted to correct my past comment 👍🏻

Yes, I also found out after a search the new core will be Rust based. And as you mentioned many popular packages use pydantic.In conclusion, I opted for pydantic version also 👍. @Ce11an @AlexFrid. But probably, I will keep using dataclasses instead of pydantic on my own version of the client. But if you need any help on the subject, I'm glad to help.

bilinenkisi commented 1 year ago

there is only one question (in my opinion) is needed to answer, will we keep using the original json of python or json function of pydantic package? My suggestion is using pydantic version instead of json package if we intend to use pydantic instead of dataclasses

Ce11an commented 1 year ago

there is only one question (in my opinion) is needed to answer, will we keep using the original json of python or json function of pydantic package? My suggestion is using pydantic version instead of json package if we intend to use pydantic instead of dataclasses

I am not sure what you mean, sorry. Do you have an example?

bilinenkisi commented 1 year ago

there is only one question (in my opinion) is needed to answer, will we keep using the original json of python or json function of pydantic package? My suggestion is using pydantic version instead of json package if we intend to use pydantic instead of dataclasses

I am not sure what you mean, sorry. Do you have an example?

Of course. For example like this, will we be using json.dumps or the function that pydantic provide https://github.com/surrealdb/surrealdb.py/blob/c16b0ef0fff5702728c704f8bf79a05e1a25bf05/surrealdb/ws.py#L718

Ce11an commented 1 year ago

there is only one question (in my opinion) is needed to answer, will we keep using the original json of python or json function of pydantic package? My suggestion is using pydantic version instead of json package if we intend to use pydantic instead of dataclasses

I am not sure what you mean, sorry. Do you have an example?

Of course. For example like this, will we be using json.dumps or the function that pydantic provide

https://github.com/surrealdb/surrealdb.py/blob/c16b0ef0fff5702728c704f8bf79a05e1a25bf05/surrealdb/ws.py#L718

Ah, good spot! Yes, we should make that change if sticking with pydantic.

AlexFrid commented 1 year ago

there is only one question (in my opinion) is needed to answer, will we keep using the original json of python or json function of pydantic package? My suggestion is using pydantic version instead of json package if we intend to use pydantic instead of dataclasses

I am not sure what you mean, sorry. Do you have an example?

Of course. For example like this, will we be using json.dumps or the function that pydantic provide https://github.com/surrealdb/surrealdb.py/blob/c16b0ef0fff5702728c704f8bf79a05e1a25bf05/surrealdb/ws.py#L718

Ah, good spot! Yes, we should make that change if sticking with pydantic.

Yeah, I think it's worth testing out the performance/compatibility of using the pydantic json 🤔

Ce11an commented 1 year ago

I am happy to throw together a PR for this feature with what we have discussed here. Along with some unit tests 😉

bilinenkisi commented 1 year ago

I am happy to throw together a PR for this feature with what we have discussed here. Along with some unit tests 😉

Of course 😆

AlexFrid commented 1 year ago

I am happy to throw together a PR for this feature with what we have discussed here. Along with some unit tests 😉

That would definitely allow us to compare 🙌

gaby commented 52 minutes ago

@AlexFrid @Ce11an @bilinenkisi Any progress on this? Pydantic is becoming the industry standard for defining data models. Pydantic v2 is made with Rust and has been out for a while now