DASSL / ClassDB

An open-source system to let students experiment with relational data
https://dassl.github.io/ClassDB/

Connection Log Import Improvements #214

Closed: wildtayne closed this 6 years ago

wildtayne commented 6 years ago

This PR contains changes that address #165, #200, and #204.

General:

ClassDB.importConnectionLog()

I plan to update the unit tests. It may not be possible to automatically test isLoggingCollectorEnabled(), since it requires a server restart to check both cases. Here is a sample run of the new import:

classdb200=# SELECT * FROM ClassDB.importConnectionLog('2018-05-27');
WARNING:  Log file for 2018-05-27 not found, skipping.
WARNING:  Log file for 2018-05-28 not found, skipping.
WARNING:  Log file for 2018-05-29 not found, skipping.
WARNING:  Log file for 2018-05-30 not found, skipping.
WARNING:  Log file for 2018-05-31 not found, skipping.
  logdate   | connectionslogged |                                     errormessage
------------+-------------------+---------------------------------------------------------------------------------------
 2018-05-27 |                 0 | could not open file "log/postgresql-05.27.csv" for reading: No such file or directory
 2018-05-28 |                 0 | could not open file "log/postgresql-05.28.csv" for reading: No such file or directory
 2018-05-29 |                 0 | could not open file "log/postgresql-05.29.csv" for reading: No such file or directory
 2018-05-30 |                 0 | could not open file "log/postgresql-05.30.csv" for reading: No such file or directory
 2018-05-31 |                 0 | could not open file "log/postgresql-05.31.csv" for reading: No such file or directory
 2018-06-01 |                 0 |
 2018-06-02 |                 0 |
 2018-06-03 |                 0 |
 2018-06-04 |                 7 |
 2018-06-05 |                 0 |
 2018-06-06 |                 0 |
 2018-06-07 |                 5 |
(12 rows)
smurthys commented 6 years ago

Nicely done @srrollo. The sample results you show look really helpful.

A few thoughts:

wildtayne commented 6 years ago

Thanks for the feedback @smurthys; I've pushed some commits that address your comments. I'm not sure the HINT in the new warning messages is the most appropriate one. It could be more direct (e.g., "Try setting log_connections TO 'on'"); however, "Managing Log Files" now contains good information about each log setting and why it needs to be enabled.
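
For reference, a minimal sketch of what acting on such a HINT could look like, assuming superuser access; these are standard PostgreSQL settings, not ClassDB-specific commands:

SHOW log_connections;
ALTER SYSTEM SET log_connections TO 'on';
ALTER SYSTEM SET log_disconnections TO 'on';
SELECT pg_reload_conf(); -- enough for these two settings; logging_collector requires a server restart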

smurthys commented 6 years ago

The revisions look good @srrollo.

addConnectionActivityLoggingReco.sql:

testAddConnectionActivityLogging.psql:

A recommended testing strategy:

smurthys commented 6 years ago

In function importConnectionLog I suggest using the following format to inform users of potential issues in logging. This approach results in only one output message regardless of the number of parameters turned off:

Here is a draft of the query to get a list of parameters turned off.

SELECT string_agg(name, ',')
FROM pg_settings
WHERE name IN ('logging_collector', 'log_connections', 'log_disconnections')
      AND setting = 'off';
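
For illustration, a sketch of how that aggregated list might drive a single warning; the variable name and message wording below are placeholders, not the function's actual text:

DO $$
DECLARE
   disabledSettings TEXT;
BEGIN
   SELECT string_agg(name, ', ') INTO disabledSettings
   FROM pg_settings
   WHERE name IN ('logging_collector', 'log_connections', 'log_disconnections')
         AND setting = 'off';

   IF disabledSettings IS NOT NULL THEN
      RAISE WARNING 'connection logging may be incomplete; the following settings are off: %',
                    disabledSettings;
   END IF;
END $$;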
wildtayne commented 6 years ago

I've pushed some commits improving the error messages and indentation in addConnectionActivityLogging.sql.

The RETURN QUERY is performed prior to the DROP TABLE statements because it needs to select from ImportResult, which is one of the tables dropped. Additionally, RETURN QUERY doesn't actually cause the function to exit; it only appends rows to the function's result set (see here). I've added an explicit RETURN statement at the end and some comments to clarify this usage, since it is pretty counter-intuitive.

I would prefer to use the ON COMMIT DROP property of temporary tables instead of explicitly dropping them; however, that does not seem to actually drop the tables when importConnectionLog() is called multiple times inside a transaction or function (presumably they persist until the end of the outer transaction/function).
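
As an aside, here is a minimal standalone sketch of the RETURN QUERY pattern described above (demo names only, not the code of importConnectionLog()):

CREATE OR REPLACE FUNCTION pg_temp.demoReturnQuery()
RETURNS TABLE(num INT) AS
$$
BEGIN
   CREATE TEMPORARY TABLE DemoResult(val INT);
   INSERT INTO DemoResult VALUES (1), (2);

   RETURN QUERY SELECT val FROM DemoResult; -- appends rows; does not exit the function

   DROP TABLE DemoResult; -- safe: the returned rows are already materialized

   RETURN; -- explicit exit, for clarity
END
$$ LANGUAGE plpgsql;

SELECT * FROM pg_temp.demoReturnQuery();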

smurthys commented 6 years ago

Thanks @srrollo for clarifying the need to drop tables after RETURN QUERY. The comments and the URL are quite helpful.

@srrollo: It would be great if you wrote a gist about this topic, with emphasis on the use of a temp table. Such a gist would be a good public service. I (and I am sure others) will be happy to review the gist before publication. Not urgent, but important.

Some observations on the revised addConnectionActivityLoggingReco.sql:

smurthys commented 6 years ago

I recommend merging this PR before making the changes needed to import disconnections and other related changes. I have a feeling we haven't yet fully analyzed the information needs related to disconnections, and I don't want to bog down this PR unnecessarily. A new branch can address importing disconnection data.

wildtayne commented 6 years ago

I agree. I'm expecting the full scope of the disconnection changes to be fairly large.

afig commented 6 years ago

Overall the work on this branch looks great. I also agree that disconnection-logging development should be completed separately. In addition to the comments already made, I have the following suggestions (I have also pointed out some minor spelling errors that haven't been addressed yet, but may have already been noticed):

addConnectionActivityLoggingReco.sql:

testConnectionActivityLogging.psql:

wildtayne commented 6 years ago

Thanks for all the reviews. I believe I have addressed all of the comments in addConnectionActivityLogging.sql. @afig I have kept the warning message for missing log files as-is, however. I think this makes sense, because the warning gives an easier-to-understand error, while the report has the details.

wildtayne commented 6 years ago

Huge thanks for the :USER tip @smurthys; it allows for a big improvement to the test script.

I just pushed several commits that greatly improve the test script. Changes to the procedure are as follows:

I still have some reservations about it that I would like feedback on:

  1. The test makes changes to ClassDB.ConnectionActivity that cannot be rolled back. This is because the test must compare the state of this table before and after the connections happen. The 'after' changes are currently rolled back, but if one runs the test again, these changes will be imported during the 'before' stage.

  2. The test still checks for exactly four connections in the return table from ClassDB.importConnectionLog(). I understand the issues raised with this implementation, but I believe it is currently the best option. The problem with looking directly at ClassDB.ConnectionActivity with a filter for the test users is that connections from previous test runs could still be imported. Thus, we would still have to filter the connections by time as well, and I don't know of a good way to save the start time of the test (but am open to suggestions). Plus, the current method exercises the return table from ClassDB.importConnectionLog().

  3. There are some things the script does not test, notably importing a missing log. The issue here is picking a log file that is guaranteed to be missing (and I don't think this is currently possible).

I don't think any of these issues are critical, but they should be kept in mind.

wildtayne commented 6 years ago

I just pushed some more commits updating the tests. I've added a dynamic function that checks ClassDB.ConnectionActivity for correct entries after the import. The dynamic creation allows the timestamp at test start to be hard-coded in the function, which lets the start time persist across the connection changes. I also added a similar stub that can be used to check disconnection entries when they are implemented.

A few notes:

smurthys commented 6 years ago

Awesome changes @srrollo. I super like the use of \setenv PGPASSWORD. It is amazing how much we (can) learn every day.

The dynamic functions are neat. I wonder if the testing user can instead just create a table with one column containing the test start time, and use a join query later to verify.

Not that we need to address it in code, but L92 has an interesting requirement: the entire test must be completed on the same calendar day. 😱 I suppose we can just remember to start this test at least 5 minutes before midnight. 😄

I am not quite sure it is necessary to switch the connection back to the testing user at the end, because an instructor should be able to drop all students, and also the other objects, given appropriate grants earlier.

Lastly, I think it is OK to assume all tests are single user, and quite likely run in a dedicated DB: the installation scripts make it really easy to do that.

smurthys commented 6 years ago

I have been thinking about the following alternative for a couple of hours now and think it can work to simplify the verification step. The idea hinges on using "application name" in each test connection as I just described in a comment elsewhere.

The idea is simple: specify a "unique" string as the application name in each test connection and then verify that ConnectionActivity table has one and only one entry for each test connection tried.

The proposed approach eliminates the need to communicate/retain a "nexus" table or dynamic functions across connections.

BTW: I don't know if application name shows up in disconnection entries, but that should not be a problem if session_id is also captured (which will be the subject of a different PR).
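
For concreteness, a test connection could pass such a marker through the libpq connection string; the database and marker names below are placeholders:

psql "dbname=classdb_test user=constu01 application_name=ClassDBTest_constu01" -c "SELECT 1"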

wildtayne commented 6 years ago

I think the application name idea is perfect; unfortunately, after rather thorough testing it appears that application_name is not set until after the connection is made, so it cannot be logged in the connection log row (I will elaborate in #206). I have an alternate idea that builds on application_name, which I will also elaborate on in #206. Since that solution will rely on implementing both session ID and disconnection logging, I propose removing the two tests implemented using dynamic queries and then merging this PR. I will then implement session ID/disconnection logging within a new branch.

I agree the CURRENT_DATE requirement is a little funky; however, I think it makes the test more predictable. Otherwise, running the test could import a large number of logs. Additionally, running the test while ClassDB.ConnectionActivity is empty is effectively the same as limiting the import to CURRENT_DATE.

I ended up having the original user drop all the roles because the instructor can't drop itself. I figured it was easier to just drop everything in one shot.

smurthys commented 6 years ago

Too bad application name does not show in connection entries. With this knowledge, I restate my earlier proposal of replacing the dynamic functions with a table the testing user creates.

wildtayne commented 6 years ago

I've pushed a commit that uses a table for test start time storage instead of dynamic functions.
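
Roughly, the approach looks like this; the table name is illustrative, the timestamp filter is left as a placeholder (as in the starter query further below), and it assumes the activity timestamps are stored in UTC:

-- run as the testing user before making the test connections
CREATE TABLE TestStartTime(StartTime TIMESTAMP);
INSERT INTO TestStartTime VALUES (CURRENT_TIMESTAMP AT TIME ZONE 'UTC');

-- later, after ClassDB.importConnectionLog() has run
SELECT C.UserName
FROM ClassDB.ConnectionActivity C, TestStartTime T
WHERE TRUE; --change to: C.<timestamp column> >= T.StartTime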

afig commented 6 years ago

Progress so far looks really good. I did notice one small issue when running testConnectionActivityLogging.psql, specifically the second time I ran it.

I was getting FAIL: Code 2 whenever the ClassDB.ConnectionActivity table had recent entries, including when the test script was run more than once. This seems to be caused by an unnecessary timezone conversion on L127. Specifically, when CURRENT_TIMESTAMP (whose data type is TIMESTAMP WITH TIME ZONE) is inserted into the TestConnectionActivityLogging table, whose column is of type TIMESTAMP (without time zone), the value is stored in local time rather than UTC, making the conversion on L127 incorrect.
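
A small standalone illustration of the pitfall (a throwaway demo table, not the test's code):

CREATE TEMPORARY TABLE tzDemo(ts TIMESTAMP); -- TIMESTAMP WITHOUT TIME ZONE
INSERT INTO tzDemo VALUES (CURRENT_TIMESTAMP); -- the TIMESTAMP WITH TIME ZONE value is stored as local time
-- interpreting the stored local-time value as UTC shifts it by the session's UTC offset:
SELECT ts, ts AT TIME ZONE 'UTC' FROM tzDemo;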

I also noticed a tiny issue in a comment in addConnectionActivityLoggingReco.sql:

wildtayne commented 6 years ago

Really good catch @afig. I just pushed a commit fixing this issue and cleaning up some comments.

smurthys commented 6 years ago

A couple of quick observations on the revised testConnectionActivityLogging.psql:

wildtayne commented 6 years ago

Updated the tests based on @smurthys' suggestions.

smurthys commented 6 years ago

Quickly: L139 should just say SELECT *. Selecting COUNT(*) will always return a row.

wildtayne commented 6 years ago

Just pushed a commit fixing this; I had to SELECT UserName because of the GROUP BY.

smurthys commented 6 years ago

Thanks @srrollo for the revisions. I think the comment preceding the group-by query needs to be more specific about the reason for the query.

wildtayne commented 6 years ago

Thanks for your continued feedback @smurthys. I just updated the comment on that query.

afig commented 6 years ago

I like the changes to test 2. However, the comment seems to be incorrect (or perhaps I am misunderstanding the query). It seems like the query is only testing if there are too many rows, rather than also testing to see if there are too few (if there are 0 entries, then they still will not be SELECTed).

wildtayne commented 6 years ago

Thanks for the catch. The comment is incorrect ATM; I meant to change it in the last commit. However, I realized the real issue is that, since each user only has one connection, a user with too few connections imported won't be selected at all, regardless of the HAVING filter. So I will probably fix this by making multiple connections for each user and changing the filter to HAVING <> #connections.

smurthys commented 6 years ago

I think multiple connections are not necessary. Instead, we can use another IF statement to check if connection records are missing for any user and simply raise fail code 3.

The grouping query finds users who have more connection entries than expected: we expect one and only one entry for each user.
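
For reference, a sketch of such a grouping check (the timestamp filter is again a placeholder):

SELECT UserName
FROM ClassDB.ConnectionActivity
WHERE TRUE --change to filter on timestamp
GROUP BY UserName
HAVING COUNT(*) > 1;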

wildtayne commented 6 years ago

I just made a few comment improvements and added two new tests:

  1. Checks that each ClassDB user has new associated activity rows added
  2. Checks that the non-ClassDB user has no new associated rows added
smurthys commented 6 years ago

Maybe it is just formatting, but I am a bit unsure about the query in L162-190.

Also, a simpler outer join can find users with missing connection entries. The following starter query works.

Lastly, I just realized the test is missing a check that no connection entry exists for the non-ClassDB user.

WITH ClassDBUser AS 
( SELECT * FROM regexp_split_to_table('constu01 constu02 conins01 condbm01', E'\\s+') AS UserName)
SELECT U.UserName 
FROM ClassDBUser U LEFT OUTER JOIN ClassDB.ConnectionActivity C ON U.UserName = C.UserName
WHERE C.UserName IS NULL
      AND TRUE --change to filter on timestamp
wildtayne commented 6 years ago

Thanks, that is a much simpler query. I was trying to think of a way to get the usernames into a table; I forgot about regexp_split. EDIT: I've pushed a commit implementing the query.

Also, I believe L193-206 checks if the non-ClassDB user has any connection entries, unless I am misunderstanding.

smurthys commented 6 years ago

Yes, it is rather easy to forget functions such as regexp_split. In any case, I feel the simplicity comes from the outer join, because the CTE can easily be defined as a simple UNION query:

SELECT 'a' UNION SELECT 'b' UNION SELECT 'c' UNION SELECT 'd'

BTW, the tests in L50-L67 could be simplified using the ClassDB functions isUser and isServerRoleDefined.

EDIT: Functions isStudent, isInstructor, etc. are also available.
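
For example, the simplification might look something like this, assuming (as their names suggest) these functions take a role name:

DO $$
BEGIN
   IF NOT ClassDB.isServerRoleDefined('constu01') OR NOT ClassDB.isStudent('constu01') THEN
      RAISE EXCEPTION 'test setup failed: constu01 is missing or is not a student';
   END IF;
END $$;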

smurthys commented 6 years ago

@srrollo: Sorry I didn't acknowledge this earlier. Yes, the test for the non-ClassDB role already existed; I had missed it.

wildtayne commented 6 years ago

Thanks for reminding me about the ClassDB user functions @smurthys. I've integrated these into the tests. I've also improved a few comments.

I ended up removing the CURRENT_DATE parameter from importConnectionLog() in the tests because, after thinking about it, I don't think it is really necessary. This leaves the midnight edge case only when running the test with an empty ClassDB.ConnectionActivity.

wildtayne commented 6 years ago

I'll leave this open a bit longer in case anyone else wants to review. In the interest of time, however, I will probably merge at ~3:30 when I get home.

wildtayne commented 6 years ago

Thanks for the reviews everyone. :+1: