DASSL / ClassDB

An open-source system to let students experiment with relational data
https://dassl.github.io/ClassDB/
Other
8 stars 2 forks source link

Guard uses of pg_event_trigger_ddl_commands() for Postgres 9.4 or earlier #260

Closed KevinKelly25 closed 6 years ago

KevinKelly25 commented 6 years ago

This PR adds a gaurd for pg_event_trigger_ddl_commands() in addDDLActivityLoggingReco.sql by checking server version. If the server version is pre-9.5 the objId will be set to N/A.

N/A is probably not the best name for it and I am open to suggestions. It should contain at least one illegal identifier character so that it can not be confused for a valid object name.

I had to also update testDDLActivityLogging.sql to support these changes. In order to allow users to check server version I had to GRANT temporary execution permission to related server version functions.

fixes #234

smurthys commented 6 years ago

Thanks @KevinKelly25 for getting the changes started.

I see an issue with the change: I believe initializing the variable objId at declaration to the marker value causes two entries when the user executes a DROP query, because the server triggers both a ddl_command_end and a sql_drop even for such queries. This odd behavior is documented in the comment on L77. Given this oddity, no change should be necessary in the test script.

The correct way is to conditionally set objId. I propose using the following code structure:


IF TG_EVENT = 'ddl_command_end' THEN
   --function pg_event_trigger_ddl_commands was introduced in pg9.5
   -- remove the test for server version when support for pg9.4 ends
   IF ClassDB.isServerVersionBefore('9.5') THEN
      objId = 'N\A';
   ELSE
      SELECT object_identity --Get the statement target object
      INTO objId
      ...
   END IF;
ELSIF TG_EVENT = 'sql_drop' THEN
   SELECT object_identity --Same thing, but for drop statements
   INTO objId
   ...
END IF;
smurthys commented 6 years ago

I just added a compatibility note to the docs page so it it does not fall through the cracks.

I suggest adding similar compatibility notes to docs where users see a difference due to server version: enabling and disabling connection logging is the only other situation that comes to my mind.

afig commented 6 years ago

Thanks for beginning to add this guard @KevinKelly25. I agree with @smurthys's analysis that N/A should be conditionally assigned to objId, which will avoid needing to make the changes test script. Comments will need to be updated alongside the fixes to the code.

I believe N/A will work as a placeholder value, especially since it is now documented.

smurthys commented 6 years ago

@KevinKelly25 I just realized I had not paid attention to the following note in your PR. I am unclear on why this temporary change is needed. May be I need to look at the tests more carefully:

In order to allow users to check server version I had to GRANT temporary execution permission to related server version functions.

KevinKelly25 commented 6 years ago

Thank you for the comments and the advice. The change definitely helped with versions > 9.4 to not duplicate drop inserts. However, even with the conditionally assigned objId within the function in 9.4 the trigger will be called twice and since it is version < 9.5 objId will be set to N/A. Since this is not null it still will be logged twice. Once for N/A and another for the real object name.

Since the drop table is logged twice the tests were updated to include version checking which requires the temporary grant statements. However, this only masks the problem that duplicate drop statements are being added into the table. Because of this error I believe I am going to need to restructure the function to avoid this duplication. I am still unsure of the best way to do this and am open to suggestions. I will update this PR once I am able to find a good solution.

Edit: To clarify, without duplication the updates to test script will be unneeded and removed.

afig commented 6 years ago

Sorry for not having realized that issue earlier @KevinKelly25.

One simple solution is to add a conditional that ignores the sql_drop event if the server version is < 9.5. This does end up throwing away data (since it seems that it was possible to get object identifiers from sql_drop even versions < 9.5), but doing so actually ends up being more consistent with the new statement in the docs that "The column DDLObject contains the fixed value N/A for rows logged in Postgres 9.4 or earlier."

Extending the example provided by @smurthys, the structure would look like:

IF TG_EVENT = 'ddl_command_end' THEN
   --function pg_event_trigger_ddl_commands was introduced in pg9.5
   -- remove the test for server version when support for pg9.4 ends
   IF ClassDB.isServerVersionBefore('9.5') THEN
      objId = 'N\A';
   ELSE
      SELECT object_identity --Get the statement target object
      INTO objId
      ...
   END IF;
--sql_drop event should be ignored prior to pg9.5 to prevent duplicate entries
-- remove the test for server version when support for pg9.4 ends
ELSIF TG_EVENT = 'sql_drop' AND ClassDB.isServerVersionAfter('9.4') THEN
   SELECT object_identity --Same thing, but for drop statements
   INTO objId
   ...
END IF;

Note that this assumes that TG_TAG contains informative data the second time that a drop event executes the trigger (when ddl_command_end is the TG_EVENT) I believe it should, but I have not verified.

A slightly more complicated solution that preserves object identifiers for drop events even before version 9.5 is to add a conditional that ensures the ddl_command_end event did not occur because of drop. Something like:

IF TG_EVENT = 'ddl_command_end' THEN
   --function pg_event_trigger_ddl_commands was introduced in pg9.5
   -- remove the test for server version and NOT LIKE 'DROP%' when support for pg9.4 ends
   IF ClassDB.isServerVersionBefore('9.5') AND (TG_TAG NOT LIKE 'DROP%') THEN
      objId = 'N\A';
   ELSE
      SELECT object_identity --Get the statement target object
      INTO objId
      ...
   END IF;
ELSIF TG_EVENT = 'sql_drop' THEN
   SELECT object_identity --Same thing, but for drop statements
   INTO objId
   ...
END IF;

Unfortunately, it seems like this second solution will still result in duplicate entries for ALTER TABLE and ALTER FOREIGN TABLE according to the event trigger firing matrix.

smurthys commented 6 years ago

Thanks @KevinKelly25 for the update.

A few things established so far in these comments:

I suggest making the revision in the first alternative @afig proposes. That solution does hinge on TG_TAG being the same in both calls caused by sql_drop but that is easily verified by implementing the proposed revision. I have some alternatives in mind if this approach fails, and we can discuss that after the test if necessary.

KevinKelly25 commented 6 years ago

Thank you for the detailed analysis and suggestions @afig and @smurthys . I have used the suggested first solution as @afig describes.

This is the overview of what is now being done:

smurthys commented 6 years ago

Thanks @KevinKelly25 for the update. Please confirm that pre9.5 drop operations correctly records the actual DDL statement albeit with N/A for object name.

Also, please test with an ALTER TABLE statement consulting the event matrix @afig linked earlier. Actually the test script should include an ALTER TABLE, but we should probably just log an issue now and address it a later milestone.

KevinKelly25 commented 6 years ago

Sorry for not mentioning this in the last comment.

I have tested with a student account, an instructor account and DB manager account. Adding a table, Altering it and then dropping it. Each account created 3 inserts into the table for a total of 9. Each entry was recorded properly with N/A for pg9.4.

Side Note: I have found an issue in a comment in testDDLActivityLogging.sql ln 38 that says "If COUNT(*) > 1, then too many connection rows were added for that user" which should be 2 instead of 1 to correctly represent the code below it. However, I do not think this should be included in the PR as I am no longer altering the test script at all. It will have to altered when we also update the test to include the alter table statements.

afig commented 6 years ago

I tested the latest revision on pg9.4 and pg9.6 and everything worked well. Entries had N/A object identifiers for rows logged in pg9.3, and no rows were duplicated in either version (with one exception, see below). Tests also passed on both versions. I agree that the issue in the test comments should be addressed in a different PR.

As mentioned above, there is one instance where rows are duplicated, but this issue is somewhat unrelated to this PR. As stated by the trigger matrix linked above, sql_drop is triggered for ALTER TABLE and ALTER FOREIGN TABLE statements, but it turns out that it is only triggered for drop-related alterations (ALTER TABLE ... DROP COLUMN..., ALTER TABLE ... DROP CONSTRAINT..., possibly others). This means that in pg9.5 or higher, duplicate log entries are added for these drop-related table alterations (they are not added for versions below 9.5 because with this PR we now ignore all sql_drop events in versions <9.5).

In pg9.6:

classdb=> CREATE TABLE testtable (col1 VARCHAR);
CREATE TABLE
classdb=> SELECT * FROM myddlactivity;
     statementstartedat     |   sessionid   | ddloperation |     ddlobject
----------------------------+---------------+--------------+-------------------
 2018-08-15 13:37:10.205201 | 5b7464b8.32b0 | CREATE TABLE | insdrop.testtable

classdb=> ALTER TABLE testtable ADD COLUMN col2 VARCHAR;
ALTER TABLE
classdb=> SELECT * FROM myddlactivity;
     statementstartedat     |   sessionid   | ddloperation |     ddlobject
----------------------------+---------------+--------------+-------------------
 2018-08-15 13:37:32.391041 | 5b7464b8.32b0 | ALTER TABLE  | insdrop.testtable
 2018-08-15 13:37:10.205201 | 5b7464b8.32b0 | CREATE TABLE | insdrop.testtable

classdb=> ALTER TABLE testtable DROP COLUMN col2;
ALTER TABLE
classdb=> SELECT * FROM myddlactivity;
     statementstartedat     |   sessionid   | ddloperation |       ddlobject
----------------------------+---------------+--------------+------------------------
 2018-08-15 13:37:46.749256 | 5b7464b8.32b0 | ALTER TABLE  | insdrop.testtable.col2
 2018-08-15 13:37:46.749256 | 5b7464b8.32b0 | ALTER TABLE  | insdrop.testtable
 2018-08-15 13:37:32.391041 | 5b7464b8.32b0 | ALTER TABLE  | insdrop.testtable
 2018-08-15 13:37:10.205201 | 5b7464b8.32b0 | CREATE TABLE | insdrop.testtable

classdb=> DROP TABLE testtable;
DROP TABLE
classdb=> SELECT * FROM myddlactivity;
     statementstartedat     |   sessionid   | ddloperation |       ddlobject
----------------------------+---------------+--------------+------------------------
 2018-08-15 13:37:56.992294 | 5b7464b8.32b0 | DROP TABLE   | insdrop.testtable
 2018-08-15 13:37:46.749256 | 5b7464b8.32b0 | ALTER TABLE  | insdrop.testtable.col2
 2018-08-15 13:37:46.749256 | 5b7464b8.32b0 | ALTER TABLE  | insdrop.testtable
 2018-08-15 13:37:32.391041 | 5b7464b8.32b0 | ALTER TABLE  | insdrop.testtable
 2018-08-15 13:37:10.205201 | 5b7464b8.32b0 | CREATE TABLE | insdrop.testtable

I will log a separate issue for this later today.

Edit: Issue #264 discusses this last point.

smurthys commented 6 years ago

The changes look good except for a tiny issue: The string constant on L61 says N\A (back slash) instead of N/A (forward slash). I believe this is a copy-paste issue rooted in my proposed solution earlier in this thread. 😞

KevinKelly25 commented 6 years ago

I should have caught that one especially since I didn't actually copy paste I just wrote it over. Sorry for missing that one. Thank you for the for noticing that @smurthys. I fixed that issue.

KevinKelly25 commented 6 years ago

Thank you for the reviews.