Namespaces in models - Githubissues

arrowd commented 1 year ago

Many ERP systems allow its user to write applications on top of them. Applications may store their stuff in the system's database, which means that each application has its own set of entities. If we're to write such a system using Persistent, how can we handle a case when two unrelated applications create entities with same names?

In other words, if we have two Haskell modules M1 and M2, both of which containing something like

share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
User
   name Text
|]

then running

import qualified M1
import qualified M2

insert $ M1.User "User1"
insert $ M2.User "User2"

will place the data into the same table.

Would it make sense to implement some sort of namespacing in Persistent? If yes, what approach should be taken? I'm thinking about taking a fully qualified module name (like, packagename-1.2.3.Database.Persist.Foo), computing a hash based on it and appending it to the table name.

arrowd commented 1 year ago

This issue [1] is highly related, although the solution proposed for it is only a safe belt preventing data mixing for some cases.

[1] https://github.com/yesodweb/persistent/issues/576

parsonsmatt commented 1 year ago

I think the proper approach here is to support multiple database schema: #93

Then a library can use a specific schema to separate the tables it uses. Oof, I do feel bad linking a 10 year old ticket...

A work-around until then is for the library to export the [UnboundEntityDef] that the QuasiQuoter generates:

librarySchema :: [UnboundEntityDef]
librarySchema = [persistLowerCase|

User
    name Text
|]

A client of the API can then call mkPersist to generate the models. But - they also have the ability to map over the UnboundEntityDef, prefixing all of the EntityNameHS with a tag - I think with transformBi (\(EntityNameHS n) -> (EntityNameHS (prefix <> n)) :: UnboundEntityDef -> UnboundEntityDef it shouldn't be too tricky to write.

arrowd commented 1 year ago

Are database schemas a standard concept? Can we expect it to work for most SQL backends (existing and not yet written)?

(prefix <> n)

Generating that prefix is also a problem. How'd we generate an unique one and be sure that it won't interfere with some existing table? The same question goes for schema names. Or we just offload that problem to the user?

parsonsmatt commented 1 year ago

It seems like Postgres and MySQL both have a support for multiple schema, but Sqlite doesn't. Sqlite has a workaroudn with ATTACH, but I'm not sure how to really make that pan out - requires a totally separate database file.

Generating that prefix is also a problem. How'd we generate an unique one and be sure that it won't interfere with some existing table?

I'm OK letting the user configure this mostly transparently, in the same way that pretty much everything else works in the libary. This lets them potentially shoot themselves in the foot, but coming up with a system for generating unique namespacing prefixes feels like more magic than I want to introduce here.

One thing we do try to maintain is that it should be possible to connect to an existing database which persistent doesn't manage. Thus all the sql= stuff. Allowing folks the low-level interface of "Prepend (or append) a string to each table name, uniformly"

arrowd commented 1 year ago

I'm interested in spending my time on this. Can you guide me a bit on how to move this forward?

parsonsmatt commented 1 year ago

I think step 1 is introducing a new field on EntityDef - entitySchema :: Maybe Text - that, if specified, says what namespace or schema the table should belong to.

Backends should account for entitySchema when querying and preparing migrations. In sqlite (and maybe mongo? redis?), possibly we just do maybe entityDbName (\schema -> schema <> "_" <> entityDbName). Mysql and Postgres have a specific way they want to do that.

arrowd commented 1 year ago

SQLite does have 2 predefined schemas even without ATTACH in play: main and temp [1]. Would we want to account for that?

For instance, if the user sets entitySchema = Just "temp", then the table name should be just entityDbName, but will be created under "temp" schema. Otherwise, for example if entitySchema = Just "foo" then the table name should be "foo" <> "_" <> entityDbName and the schema name would be "main". Does that make sense?

[1] https://www.sqlite.org/lang_naming.html

arrowd commented 1 year ago

The comment for connEscapeTableName talks about PostgreSQL schema, but I don't see anything related in the persistent-postgresql code.

arrowd commented 1 year ago

How do we want to specify schema on the QQ side?

yesodweb / persistent

Namespaces in models #1454