Closed simonw closed 2 years ago
Jeff suggested db[table].(..., alter=True)
for this.
db["urls"].create(
{
"url": str,
"crawled": bool,
"body": str,
"headers": dict,
"status": int,
"status_text": str,
},
pk="url",
defaults={"crawled": False},
if_not_exists=True,
alter=True,
)
I'm not crazy about having to pass both alter=True
and if_not_exists=True
- maybe alter
should imply if_not_exists
.
Should passing alter=True
also drop any columns that aren't included in the new table structure?
It could even spot column types that aren't correct and fix those.
Is that consistent with the expectations set by how alter=True
works elsewhere?
Could call it ensure=True
here if it works differently enough from alter=True
that the behavior could be confusing.
Thinking about this more, I think alter=True
is a good name for this option even if it does more than the same option on .insert()
.
Maybe there should be a separate table.alter(...)
method that does the actual work here, with .create(..., alter=True)
as syntactic sugar for triggering that if the table exists already.
... but that's what the table.transform(...)
method does already!
So maybe this is actually a transform=True
parameter to create()
that triggers table.transform(...)
if necessary.
Should passing
alter=True
also drop any columns that aren't included in the new table structure?It could even spot column types that aren't correct and fix those.
Is that consistent with the expectations set by how
alter=True
works elsewhere?
I would lean towards not dropping them (or making a drop=True
or drop_columns=True
or drop_missing_columns=True
) to work with existing tables easier.
I do like that sqlite-utils mostly just works with existing tables but it's also nice to add to existing fields in a few cases.
Initial prototype:
diff --git a/sqlite_utils/db.py b/sqlite_utils/db.py
index 18a442a..03fd345 100644
--- a/sqlite_utils/db.py
+++ b/sqlite_utils/db.py
@@ -875,6 +875,7 @@ class Database:
hash_id_columns: Optional[Iterable[str]] = None,
extracts: Optional[Union[Dict[str, str], List[str]]] = None,
if_not_exists: bool = False,
+ transform: bool = False,
) -> "Table":
"""
Create a table with the specified name and the specified ``{column_name: type}`` columns.
@@ -892,7 +893,39 @@ class Database:
:param hash_id_columns: List of columns to be used when calculating the hash ID for a row
:param extracts: List or dictionary of columns to be extracted during inserts, see :ref:`python_api_extracts`
:param if_not_exists: Use ``CREATE TABLE IF NOT EXISTS``
- """
+ :param transform: If table already exists, transform it to fit the specified schema
+ """
+ # Transform table to match the new definition if table already exists:
+ if transform and self[name].exists():
+ # First add missing columns and columns to drop
+ existing_columns = self[name].columns_dict
+ missing_columns = dict(
+ (col_name, col_type)
+ for col_name, col_type in columns.items()
+ if col_name not in existing_columns
+ )
+ columns_to_drop = [
+ column for column in existing_columns if column not in columns
+ ]
+ if missing_columns:
+ for col_name, col_type in missing_columns.items():
+ self[name].add_column(col_name, col_type)
+ # Do we need to reset the column order?
+ column_order = None
+ if list(existing_columns) != list(columns):
+ column_order = list(columns)
+ # Only run .transform() if there is something to do
+ # TODO: this misses changes like pk= without also column changes
+ if columns_to_drop or missing_columns or column_order:
+ self[name].transform(
+ types=columns,
+ drop=columns_to_drop,
+ column_order=column_order,
+ not_null=not_null,
+ defaults=defaults,
+ pk=pk,
+ )
+ return cast(Table, self[name])
sql = self.create_table_sql(
name=name,
columns=columns,
@@ -1477,6 +1510,7 @@ class Table(Queryable):
hash_id_columns: Optional[Iterable[str]] = None,
extracts: Optional[Union[Dict[str, str], List[str]]] = None,
if_not_exists: bool = False,
+ transform: bool = False,
) -> "Table":
"""
Create a table with the specified columns.
@@ -1508,6 +1542,7 @@ class Table(Queryable):
hash_id_columns=hash_id_columns,
extracts=extracts,
if_not_exists=if_not_exists,
+ transform=transform,
)
return self
Needs more thought about how things like just a change to pk=
should work.
Example of that prototype working:
>>> from sqlite_utils import Database
>>> db = Database(memory=True)
>>> db["dogs"].create({"id": int, "name": str}, pk="id")
<Table dogs (id, name)>
>>> db["dogs"].create({"id": int, "name": str, "age": int}, pk="id", transform=True)
<Table dogs (id, name, age)>
Also needs comprehensive tests and documentation.
I could add a --transform
option to sqlite-utils create-table
too.
Suggested by @jefftriplett on Discord: https://discord.com/channels/823971286308356157/997738192360964156/1011655389063958600