fastapi / sqlmodel

SQL databases in Python, designed for simplicity, compatibility, and robustness.
https://sqlmodel.tiangolo.com/
MIT License
14.71k stars 672 forks source link

Order of columns in the table created does not have 'id' first, despite the order in the SQLModel. Looks like it's prioritising fields with sa_column #542

Open epicwhale opened 1 year ago

epicwhale commented 1 year ago

First Check

Commit to Help

Example Code

from sqlmodel import Field, SQLModel, JSON, Column, Time

class MyTable(SQLModel, table=True):
    id: int | None = Field(default=None, primary_key=True)
    name: str
    type: str
    slug: str = Field(index=True, unique=True)
    resource_data: dict | None = Field(default=None, sa_column=Column(JSON))  # type: ignore

# ... create engine

SQLModel.metadata.create_all(engine)

Description

The CREATE table script generated for the model above ends up putting resource_data as the first column, instead of preserving the natural order of 'id' first

CREATE TABLE mytable (
     resource_data JSON,          <----- why is this the FIRST column created?
     id SERIAL NOT NULL, 
     name VARCHAR NOT NULL, 
     type VARCHAR NOT NULL, 
     slug VARCHAR NOT NULL, 
     PRIMARY KEY (id)
)

This feels unusual when I inspect my postgresql tables in a db tool like pgAdmin.

How do I ensure the table is created with the 'natural' order?

Operating System

macOS

Operating System Details

No response

SQLModel Version

0.0.8

Python Version

3.11.1

Additional Context

No response

H-Plus-Time commented 1 year ago

I've just encountered this problem too - it originates from the way sqlalchemy deals with a Column's _creation_order property (discussed wrt mixin column ordering at stackoverflow.com/sqlalchemy-move-mixin-columns-to-end). Because the Column() call executes first, the auto-generated Column objects always have higher _creation_order values. sa_args and sa_kwargs don't have this problem (sadly you're stuck with types detectable via get_sqlachemy_type if you use those).

My workaround ended up exploiting Column::copy and the __setattr__ override in the sqlmodel metaclass:

from sqlmodel.main import SQLModelMetaclass
from sqlmodel import SQLModel, Column

class ColumnCloningMetaclass(SQLModelMetaclass):
    def __setattr__(cls, name: str, value: Any) -> None:
        if isinstance(value, Column):
            return super().__setattr__(name, value.copy())
        return super().__setattr__(name, value)

class MyTable(SQLModel, table=True, meta=ColumnCloningMetaclass):
    id: int | None = Field(default=None, primary_key=True)
    name: str
    type: str
    slug: str = Field(index=True, unique=True)
    resource_data: dict | None = Field(default=None, sa_column=Column(JSON))

# _creation_order will match order of field declaration/annotation

Works well enough (negligible perf impact since it's at class generation time), though would be obsoleted by #436 .

LastNever commented 1 year ago

I've just encountered this problem too - it originates from the way sqlalchemy deals with a Column's _creation_order property (discussed wrt mixin column ordering at stackoverflow.com/sqlalchemy-move-mixin-columns-to-end). Because the Column() call executes first, the auto-generated Column objects always have higher _creation_order values. sa_args and sa_kwargs don't have this problem (sadly you're stuck with types detectable via get_sqlachemy_type if you use those).

My workaround ended up exploiting Column::copy and the __setattr__ override in the sqlmodel metaclass:

from sqlmodel.main import SQLModelMetaclass
from sqlmodel import SQLModel, Column

class ColumnCloningMetaclass(SQLModelMetaclass):
    def __setattr__(cls, name: str, value: Any) -> None:
        if isinstance(value, Column):
            return super().__setattr__(name, value.copy())
        return super().__setattr__(name, value)

class MyTable(SQLModel, table=True, meta=ColumnCloningMetaclass):
    id: int | None = Field(default=None, primary_key=True)
    name: str
    type: str
    slug: str = Field(index=True, unique=True)
    resource_data: dict | None = Field(default=None, sa_column=Column(JSON))

# _creation_order will match order of field declaration/annotation

Works well enough (negligible perf impact since it's at class generation time), though would be obsoleted by #436 .

i copy your code and run ,but its not work , resource_data in the first ---i'm sorry ,it's worked , need use 'metaclass' in python 3.10

cout commented 6 months ago

Is this still a problem? I cannot reproduce it (sqlmodel 0.0.18, sqlalchemy 2.0.29):

CREATE TABLE mytable (
        id INTEGER NOT NULL AUTO_INCREMENT,
        name VARCHAR(255) NOT NULL,
        type VARCHAR(255) NOT NULL,
        slug VARCHAR(255) NOT NULL,
        resource_data JSON,
        PRIMARY KEY (id)
)
prurph commented 4 months ago

I'm running into a similar problem where mixed-in classes have their columns added first. For example I have a Timestamps class to add timestamp columns (it has a sqlalchemy watcher to automatically touch updated_at, not pictured for simplicity)

class Timestamps:
    created_at: datetime = Field(sa_type=DateTime(), nullable=False)
    updated_at: datetime = Field(sa_type=DateTime(), nullable=False)

class MyModel(SQLModel, Timestamps, table=True)
    id: Optional[int] = Field(default=None, primary_key=True)

This generates the sql (sqlite):

CREATE TABLE mymodel (
    created_at DATETIME NOT NULL,
        updated_at DATETIME NOT NULL,
        id INTEGER NOT NULL,  -- id is below mixed in fields (other fields in MyModel would appear below this)
    PRIMARY KEY (id)
);

I tried using sa_column_kwargs={"sort_order": 1000} and things like this, but only got a warning and no effect. IMO SQLModel should offer some ability to control field ordering in the create statement.

My workaround is the following hack where I pop columns off of the Tables and then stick them back on when creating the tables.

engine = create_engine(sqlite_url)

def create_db_and_tables():
    for table in SQLModel.metadata.sorted_tables:
        try:
            created_at, updated_at = (
                table.columns.get("created_at"),
                table.columns.get("updated_at"),
            )
            table._columns.remove(created_at)
            table._columns.remove(updated_at)
            table._columns.extend((created_at, updated_at))
        except KeyError:
            pass
    SQLModel.metadata.create_all(engine)
ideepu commented 2 months ago

Hello,

I also have the same problem as @prurph. I have also tried using sa_column_kwargs={"sort_order": 10}, I followed this sqlalchemy doc and used mapped_column(sort_order=10) but this didn't work either.

created_at: datetime = Field(
        default=datetime.now(tz=timezone.utc),
        sa_column=mapped_column(name='created_at', type_=DateTime, index=True, sort_order=10),
    )

I see in the sqlalchemy documentation for sort_order https://docs.sqlalchemy.org/en/20/orm/mapping_api.html#sqlalchemy.orm.mapped_column.params.sort_order here, that it is in version 2.0.4 but as of today I only see 2.0.34 in pypi https://pypi.org/project/SQLAlchemy/2.0.34/. So I am assuming this yet to be released or maybe I missed something.

I am running the following versions.

pydantic==2.9.0
SQLAlchemy==2.0.34
sqlmodel==0.0.22
oYASo commented 2 months ago

Hi!

I had the same issue as @prurph, and I solved it by changing the inheritance order.

class Timestamps:
    created_at: datetime = Field(sa_type=DateTime(), nullable=False)
    updated_at: datetime = Field(sa_type=DateTime(), nullable=False)

class IntegerSQLModel(SQLModel):
    id: Optional[int] = Field(default=None, primary_key=True)

class MyModel(Timestamps, IntegerSQLModel, table=True):
    some_file: str

As a result, the following schema is generated:

CREATE TABLE my_model (
        id SERIAL NOT NULL,
        created_at TIMESTAMP WITHOUT TIME ZONE NOT NULL,
        updated_at TIMESTAMP WITHOUT TIME ZONE NOT NULL,
        some_file VARCHAR NOT NULL,
        PRIMARY KEY (id)
)

I often use separate base classes for primary keys in the form of UUID or int, so this solution works fine for me.

tsuga commented 1 month ago

Any update on this? I am on the same boat.

hjqlc commented 1 week ago

@prurph Yes, you missed something, because the sort_order isn't the param of Column but MappedColumn, and the MappedColumn doesn't pass it to Column. In the main.py file of SQLModel source code, it uses Column. If you want it to work, SQLModel should change Column to MappedColumn.

20241122195632