mridoni / gixsql

GixSQL is an ESQL preprocessor and a series of runtime libraries to enable GnuCOBOL to access PostgreSQL, ODBC, MySQL, Oracle and SQLite databases.
GNU General Public License v3.0
14 stars 6 forks source link

The PostgreSQL driver needs START TRANSACTION before using cursors #14

Open mridoni opened 2 years ago

mridoni commented 2 years ago

The test environment is running on our side, currently bails out with a

DB-CODE = 10007
DB-MSG  = ERROR:  DECLARE CURSOR can only be used in transaction blocks

It worked with ocesql but I think there were some parts about transactions - will need to check with someone knowing the COBOL code better on Monday.

_Originally posted by @GitMensch in https://github.com/mridoni/gixsql/issues/74

mridoni commented 2 years ago

Please see mridoni/gixsql#74 for the initial discussion

GitMensch commented 2 years ago

Related: GixSQL currently can't parse

EXEC SQL
   BEGIN
END-EXEC
EXEC SQL
   BEGIN WORK
END-EXEC
EXEC SQL
   BEGIN TRANSACTION
END-EXEC

Which is, according to the PG docs identical to START TRANSACTION. It may be reasonable to make it an alias by translating this in the lexer or adding it to the parser

Side note: I don't know if the optional transaction modes are parsed (when not handled "parsing with a warning" would still be useful).

GitMensch commented 2 years ago

Whenever this is added (no quick change needed for my use-cases!) it would be good to be able to switched on and off. The "ocesql compat" would be an actual compatibility between the different drivers, because those statements seem to only be useful for pgsql (aren't they?) - if you previously targeted a different DB and now use the OCDB interface "as working" it seems to be necessary to toggle it on when switching to pgsql.

Maybe that should be enabled in the connection string.

mridoni commented 2 years ago

This behaviour has been modified in the current version (internal repository). The PostgreSQL driver now uses standard resultsets to fetch data from cursors. The "old" behaviour is still available and can be used with the native_cursors in the connection string (e.g. pgsql://localhost/testdb?native_cursors=on. The "new" behaviour is now the default.

GitMensch commented 2 years ago

Did you made performance curls between both implementations? Is there something to watch out for?

Can I suggest something like GIXSQL_DEFAULT_OPTIONS or similar that are read in and applied before connection string specified options are applied (which therefore could override the former)? Maybe that's something to move out to a new issue, but with this change of existing behavior "kind of related" (allowing to switch it back without adjusting multiple connection strings), too.

mridoni commented 2 years ago

Yes, this could be done quite easily, and changing the default behaviour for cursors before the next releas means just flipping a boolean in the source anyway, if one thinks that the old behaviour should be preserved. Unless there are really bad issues with the new code, I don’t think performance will suffer, due to the way the resultset is stored, it might actually be faster (note: the use of “single row mode” should be investigated here - as an option - for very big resultsets).

By the way: this was done not only to solve this issue, but also for consistency with the other drivers, that I am trying to bring up to speed with the features already present in the PostgreSQL driver (e.g. prepared statements). And a side-effect of this (or rather an experiment that got out of hand) is that a brand-new Oracle driver, using the ODPI library, is currently in the works (feature-complete, but still undergoing testing) and should provide a nice scaffolding for working with BLOBs and such in Oracle, even if the preprocessor part still needs to be written.

mridoni commented 2 years ago

It could also be useful to have a parameter to auto-start transactions when NOT in auto-commit mode, something like:

This would mimick to some extent the behaviour of ESQL programs in a mainframe environment and would avoid having to add START TRANSACTION at the beginning of every program when using native cursors with PostgreSQL.

Note: autocommit mode needs to be re-tested anyway, both for functionality and consistency, it was introduced eons ago and never touched again, basically because it came "for free" with ODBC (this was the first driver I added to my private fork of ocesql). I am sure it worked also with MySQl and PostgreSQL, but a lot changed in the meantime

mridoni commented 2 years ago

I don’t think performance will suffer, due to the way the resultset is stored, it might actually be faster

Well, it is actually slower, but not by much. Over four runs of a program that reads 1.000.000 rows from a cursor, the difference is a bit more than 1%:

1: Execution time (emulated cursors): 18639,7435 ms
1: Execution time (native cursors)  : 19386,4798 ms

2: Execution time (emulated cursors): 25514,4975 ms
2: Execution time (native cursors)  : 22869,1849 ms

3: Execution time (emulated cursors): 25608,1007 ms
3: Execution time (native cursors)  : 26375,6441 ms

4: Execution time (emulated cursors): 23371,6081 ms
4: Execution time (native cursors)  : 23474,283 ms

T: Execution time (emulated cursors): 93133,9498 ms
T: Execution time (native cursors)  : 92105,5918 ms

Note that the test program actually does, nothing except fetching from the cursor and incrementing a counter, so the difference, in real-work programs, is probably even-smaller.

mridoni commented 2 years ago

Yes, this could be done quite easily, and changing the default behaviour for cursors before the next releas means just flipping a boolean in the source anyway, if one thinks that the old behaviour should be preserved.

After some reasoning and tinkering, I decided that the "old" way (native cursors) should be the default one: it's admittedly only a bit faster, but it is the only practical way to support updatable cursors (UPDATE...WHERE CURRENT OF...).

The corresponding support for dynamic cursor emulation in the ODBC driver (when using a PostgreSQL/pgODBC data source) will be removed, since it is not needed anymore.

GitMensch commented 2 years ago

Sounds good to me!

mridoni commented 2 years ago

After some reasoning and tinkering, I decided that the "old" way (native cursors) should be the default one:

Done in the internal repository. Beware: this will make a few current test cases fail since there is no START TRANSACTION anymore in their COBOL code.

I am planning to review both autocommit mode and updatable cursors after the release of v1.0.18 (should be targeted for 1.0.19), this should fix and - hopefully - consolidate the behaviour of the different DB drivers, and - partly - solve this issue. The problem is that different DBMSs have different ideas (or no idea at all) of autocommit and of the transaction handling that obviously goes with it, e.g.: in Oracle DDL statements are not part of a transaction, while in PostgreSQL they are; MySQL is the only one among the supported DBMSs that provides native autocommit support, that otherwise has to be emulated; ODBC system libraries fake it, since they leave the actual autocommit support to the underlying driver (no guarantees there if/when you switch to another DBMS), etc.

This means that it is very difficult to handle this in a consistent way that requires no modifications to the COBOL source code or no knowledge, at compile (preprocess) time, of the driver that is going to be used. I will try to come up with a plan and share it here.

GitMensch commented 1 year ago

Sounds good, the same is likely true for the other access mode (single read?) you've mentioned. Isn't it?