simonw / sqlite-utils

Python CLI utility and library for manipulating SQLite databases
https://sqlite-utils.datasette.io
Apache License 2.0
1.63k stars 111 forks source link

Allow making m2m relation of a table to itself #431

Open rafguns opened 2 years ago

rafguns commented 2 years ago

I am building a database, in which one of the tables has a many-to-many relationship to itself. As far as I can see, this is not (yet) possible using .m2m() in sqlite-utils. This may be a bit of a niche use case, so feel free to close this issue if you feel it would introduce too much complexity compared to the benefits.

Example: suppose I have a table of people, and I want to store the information that John and Mary have two children, Michael and Suzy. It would be neat if I could do something like this:

from sqlite_utils import Database

db = Database(memory=True)
db["people"].insert({"name": "John"}, pk="name").m2m(
    "people", [{"name": "Michael"}, {"name": "Suzy"}], m2m_table="parent_child", pk="name"
)
db["people"].insert({"name": "Mary"}, pk="name").m2m(
    "people", [{"name": "Michael"}, {"name": "Suzy"}], m2m_table="parent_child", pk="name"
)

But if I do that, the many-to-many table parent_child has only one column:

CREATE TABLE [parent_child] (
   [people_id] TEXT REFERENCES [people]([name]),
   PRIMARY KEY ([people_id], [people_id])
)

This could be solved by adding one or two keyword_arguments to .m2m(), e.g. .m2m(..., left_name=None, right_name=None) or .m2m(..., names=(None, None)).

rafguns commented 2 years ago

I'd be happy to write a PR for this, if you think it's worth having.

simonw commented 2 years ago

Yeah, I think it would be neat if the library could support self-referential many-to-many in a nice way.

I'm not sure about the left_name/right_name design though. Would it be possible to have this work as the user intends, by spotting that the other table name "people" matches the name of the current table?

db["people"].insert({"name": "Mary"}, pk="name").m2m(
    "people", [{"name": "Michael"}, {"name": "Suzy"}], m2m_table="parent_child", pk="name"
)

The created table could look like this:

CREATE TABLE [parent_child] (
   [people_id_1] TEXT REFERENCES [people]([name]),
   [people_id_2] TEXT REFERENCES [people]([name]),
   PRIMARY KEY ([people_id_1], [people_id_2])
)

I've not thought very hard about this, so the design I'm proposing here might not work.

Are there other reasons people might wan the left_name= and right_name= parameters? If so then I'm much happier with those.

rafguns commented 2 years ago

Yeah, I think I prefer your suggestion: it seems cleaner than my initial left_name=/right_name= idea. Perhaps one downside is that it's less obvious what the role of each field is: in this example, is people_id_1 a reference to parent or child?