ODM2 / ODM2

An information model for spatially-discrete, feature-based earth observations.
Other
58 stars 18 forks source link

ODM2 Schema #101

Closed sreeder closed 9 years ago

sreeder commented 9 years ago

I am working on the ODM2 api and am having some issues with the interoperability between the 4 main database types (MySQL, MSSQL, PostgreSQL and SQLite) because of the named schema. Can we remove the named schema and just use the default?

emiliom commented 9 years ago

What do you mean? The SQL DDL build_schemas script got rid of the custom schema names ages ago (eg, ODM2Core, ODM2Results). We use the single common schema name "odm2" for all tables across all databases that support schemas.

The problem and answer might be different for different RDBMS, but if you're suggesting we get rid of the "odm2" base schema name altogether, that'd be a really bad idea in PostgreSQL. The "default" is the public schema. Using the public schema for odm2 tables makes database management and migration much more difficult and error prone. Maybe you want to access table names w/o having to specify the schema name? On PostgreSQL that can be done by adding the odm2 schema to the search_path in the DDL script, like this:

ALTER DATABASE myodm2db SET search_path TO odm2, public, pg_catalog;

But I don't think it's a good idea to force users to have the odm2 schema in the search_path, just to make the API easier to manage.

horsburgh commented 9 years ago

The issue that we are having is that we would like to be able to use the Python API with Microsoft SQL Server (supports schemas), MySQL (schema and database are the same thing), PostgreSQL (supports schemas), and SQLite (doesn't support schemas). Right now, the big three are working with the single "ODM2" schema, but SQLite is not because it does not support schemas.

Not being a PostgreSQL guy - is there a reason that putting tables in the "public" schema makes database management and migration more difficult and error prone?

emiliom commented 9 years ago

Out of laziness (and b/c this is well written), I'll quote the book PostGIS in Action, 2011:

(p. 39) For production work involving databases with many tables, we strongly urge you to create your own schemas and to organize your tables around these schemas. Not only will you keep your tables in logical units, but you'll find it easier when it comes time to upgrade PostGIS and for performing selective backups and restores.

(p. 179) Use non-public schemas for your data and custom functions It's a good idea to get into the habit of using your own custom schemas for your data instead of throwing everything in the public schema. If you build many specific custom functions, it's better to store them in custom schemas or even create a schema to specifically hold functions. By using named schemas for your own custom functions and data, upgrading to newer PostGIS versions will be much easier, and so will restoring selective data and functions to another database.

This isn't at all just a PostGIS issue, though. It likely applies to many other PostgreSQL extensions as well. Plus, forcing users to put all ODM2 tables in the default schema makes it harder to do selective backups, dump-restores, and version migrations, and forces a particular way of managing database objects. At 134 tables (and associated objects), this isn't a trivial issue. Of course, forcing a single schema with a fixed name (odm2) is also an imposition, but it's one that involves clean containment of all ODM2 objects. Many users will likely have other tables and objects in addition to ODM2 ones in the same database; putting ODM2 in the public schema limits options for database management.

I definitely appreciate and support the goal of having the API operate seamlessly across systems. But that shouldn't come at the cost of imposing low-level choices on users, which may be counter to their own, sound best practices. I can't imagine that there isn't a way to build a switch for SQLite in the API, though I'm sure that'll take more effort to code. And not having used SQLalchemy much, I can't help directly.

emiliom commented 9 years ago

I think @horsburgh said a few weeks ago that a solution had been found to enable the API to cleanly handle SQLite's lack of support for schemas. Just checking; if that's the case, let's close this issue.

horsburgh commented 9 years ago

@sreeder - can you confirm whether or not this issue can be closed?

sreeder commented 9 years ago

@horsburgh Yes, I can confirm that this issue has been fixed, but it has not been tested by anyone but me.

horsburgh commented 9 years ago

@sreeder - I'm going to close this for now. We can open again if needed.