orbisgis / h2gis

A spatial extension of the H2 database.
http://www.h2gis.org
GNU Lesser General Public License v3.0
207 stars 63 forks source link

Cross queries with H2 - H2GIS [discussion] #1187

Open ebocher opened 3 years ago

ebocher commented 3 years ago

H2 database can link table from another database with the CREATE LINKED TABLE command. This command is very useful and we use it a lot ;-). I'd love to have more advanced features as DBLINK or FOREIGN TABLE in PostgreSQL. FOREIGN TABLE commands allows advanced manipulations and therefore needs a hard job. We can imagine a mix between DBLINK logic and actual CREATE LINKED TABLE command.

I propose to add new functions and features :

dblink_connect(dbparameters, connectionName) (or dblink_open)

connectionName -> identifier for the connection, referenced during a H2 connection

dbparameters = same as linked table

Update

CREATE LINKED TABLE LINK('org.h2.Driver', 'jdbc:h2:./test2', 'sa', 'sa', 'TEST');

to support CREATE LINKED TABLE LINK(connectionName, 'TEST');

dblink_execute(connectionName, "CREATE TABLE BLALA...")

dblink_disconnect(connectionName) (or dblink_close)

@nicolas-f @SPalominos @katzyn

katzyn commented 3 years ago

From my point of view the best option is to add support for catalogs where each catalog in our case will represent a separate database (file) and all these files are located on the same system and are opened by the same H2 process. Catalogs are part of the SQL Standard, unlike all this stuff, so their representation and behavior is clearly defined. But they require a lot of work in different areas of H2.

Personally I don't like to add any new features to existing poorly designed commands or introduce new non-standard commands.

I also don't understand why you need to execute commands in one database using another. H2 still doesn't have SQL/PSM, so you can't execute any complex logic directly from SQL, you have to write Java code for it, but when you have Java, you already can work with different connections.

Your example commands will create additional problems with lifecycle of all these objects, because during database initialization these connections obliviously need to be initialized earlier than these tables. They create additional dependencies between tables and them and these dependencies will require their processing in different places. There are too many complications for a very small feature.

You already can link the whole schema with a single call to LINK_SCHEMA function if you need to work with many linked tables. It other cases it shouldn't be very hard to repeat CREATE LINKED TABLE commands; actually you can even build them dynamically in a subquery and execute them all at once with EXECUTE IMMEDIATE.

Linked tables have significant performance, concurrency, and transactional consistency problems, they are designed for special cases when you need to read something from other database, including databases running in other DBMS. I think you expect too much from them.

ebocher commented 3 years ago

From my point of view the best option is to add support for catalogs where each catalog in our case will represent a separate database (file) and all these files are located on the same system and are opened by the same H2 process. Catalogs are part of the SQL Standard, unlike all this stuff, so their representation and behavior is clearly defined. But they require a lot of work in different areas of H2.

I don't understand the connection between the support for catalogs and cross database command, unless you want to isolate external db connexion.

Personally I don't like to add any new features to existing poorly designed commands or introduce new non-standard commands. I also don't understand why you need to execute commands in one database using another. H2 still doesn't have SQL/PSM, so you can't execute any complex logic directly from SQL, you have to write Java code for it, but when you have Java, you already can work with different connections.

Sometimes there is a gap between developer and user ;-) . I don't understand why you talk about SQL/PSM. dblink in postgresql doesn’t need to run procedural function. For basic uses, executing commands in one database using another is not common but when you have a centralized db it's quite useful and common.

Your example commands will create additional problems with lifecycle of all these objects, because during database initialization these connections obliviously need to be initialized earlier than these tables. They create additional dependencies between tables and them and these dependencies will require their processing in different places. There are too many complications for a very small feature.

Sure it will be a bit more complicate and cause several issues. That's why I opened a discussion.

Linked tables have significant performance, concurrency, and transactional consistency problems, they are designed for special cases when you need to read something from other database, including databases running in other DBMS. I think you expect too much from them.

Yes linked tables are poor performance compared to a direct db connection but this command simplifies the user's life. So to conclude the best option seems to add support for catalogs and them develop a wrapper to isolate for example an external postgresql connection in a catalog ?

katzyn commented 3 years ago

Catalogs will be the best option for connections between multiple H2's own databases located on the same host system.

For connections to other DBMS there are no oblivious improvements. Hypothetically queries with multiple linked tables from the same external database can be optimized better, but it will add a lot of complexity in sensitive areas of H2. Actually I'm pretty sure that almost all of our contributors don't ever tried to understand how query optimizer works.

but when you have a centralized db

H2 still doesn't have normal server management operations for own needs. Without this feature any attempts to add some configurations for connections to other DBMS look rather premature.

ebocher commented 3 years ago

Ok thanks . I let this issue open.