Connection logging should log disconnections (N)

wildtayne commented 6 years ago

Connection logging should use log_disconnections to log disconnects.

Setting log_disconnections to on does not require a server restart, but it only effects sessions started after it is set.

wildtayne commented 6 years ago

Currently, ClassDB.ConnectionActivity only records the date/and user for the corresponding connection. If disconnections are placed in this table, all derived functions/views will displays them as connections as well. I see three options:

Treat disconnections as connections once they are logged
Create a separate DisconnectionActivity table and update all corresponding views
Create a ActivityType column or such in ConnectionActivity that specifies if the row is a connection or disconnection. Then update all corresponding views

I believe option 1 requires relatively little effort, while options 2 and 3 require more effort and are about equal to one another.

smurthys commented 6 years ago

I think Option 3 in @srrollo's list is better and recommend altering table ConnectionActivity as follows:

Add attribute ActivityType CHAR(1) DEFAULT 'C' CHECK (ActivityType IN ('C', 'D'));
Rename attribute AcceptedAtUTC to ActivityAtUTC
Change comments associated with the table and attributes to a neutral "related to connection" or such

I also recommend changing the script to alter the table if it already exists, and create if it does not exist. This approach retains existing data.

smurthys commented 6 years ago

Actually, I am curious about Option 1 in @srrollo's list. What does "Treat disconnections as connections" mean? I think it will help to see a log entry/record disconnection.

Also, we might need to add session_id to ConnectionActivity table so we can match disconnections with connections. Unless, there is another method to match.

wildtayne commented 6 years ago

Option 1 is the essentially the simplest possible retrofit: all tables and views are left as they are now. importConnectionLog inserts a new row into ConnectionActivity when it reads connect or disconnect from a log line. The result of option 1 is not very polished, but it could serve as a stopgap if we want to address this issue in two parts: first we enable disconnection logging and implement importing, then we update the data design and views.

I agree that that option 3 produces the best final result, and the proposed changes by smurthys are good. I will look into using session_id . I also suspect that will be the best way to match connections and disconnections.

wildtayne commented 6 years ago

I also think it would be helpful to define what we want to do with logged disconnections. Some possible uses:

Raw data used as summary data (ie. # of disconnections, last connection time)
Used with connection data to determine if a student is currently connected to the DBMS

afig commented 6 years ago

I also side with implementing option 3, including using the suggested implementation.

If I understand it correctly, option 1 effectively does not distinguish between connections and disconnections and considers them both as "connection activity".

It would be interesting to see what kind of summary data could be created. However, I am not sure if using logged connections/disconnections is the best way to determine who is presently connected to the DBMS, since one discrepancy in logging (e.g. a power outage) will lead to incorrect information. There also exist more direct ways to accessing this information (see implementation of ClassDB.listUserConnections())

wildtayne commented 6 years ago

As mentioned in #214, I believe it is not possible for Postgres to log application_name in the connection log rows. It appears that after the connection is accepted, SET application_name TO <name> (or something analogous) is performed internally. The upshot of this is that disconnections do log application_name, so my proposal is this:

The changes made in @smurthys original comment are made to ClassDB.ConnectionActivity
A column sessionID is added to ClassDB.ConnectionActivity. I need to double check, but I believe the session ids are alphanumeric and fixed length, so VARCHAR(X) should be OK.
A column applicationName is added to ClassDB.ConnectionActivity. I think there is a max length for this documented.
The current connection import query is expanded with the following features:
1. Set sessionID for all rows added to ClassDB.ConnectionActivity
2. Set applicationName - should always be NULL for connections
3. Set the correct value of activityType based on the log message (connect or disconnect)
A second query is added that SELECTS every connection rows with a sessionID matching a newly imported disconnection, and updates the connection row's applicationName.

I believe the will reliably record all three attributes for every activity row. The one downside is that new connections will not have an associated applicationName until they disconnect. However, as @afig mentioned, it is more appropriate to use pg_stat_activity and our derived views to get info about currently connected users.

I would really appreciate feedback on this approach.

afig commented 6 years ago

I also agree with the changes proposed by @smurthys
Looks like VARCHAR(9) would work:

The %c escape prints a quasi-unique session identifier, consisting of two 4-byte hexadecimal numbers (without leading zeros) separated by a dot. The numbers are the process start time and the process ID... 19.8. Error Reporting and Logging
I quoted this yesterday in issue #163: it's the same as NAMEDATALEN, which is usually 64 characters (like how we deal with SQL identifiers, we can assume it is set to its default, with corresponding comments about our assumption).
Sounds good. Are we going to assert that it is NULL for connections? I don't see much benefit in doing so
Also sounds good. It would likely have to account for the fact that there may still be some connections open without a corresponding disconnection. I do not believe it is reasonable to require that there are no students connected to the DB when importing logs.

wildtayne commented 6 years ago

Thanks for the comments @afig. Having the exact values for those columns is helpful.

No reason to assert that applicationName is NULL, however I believe for the purposes of ClassDB we can make this assumption. I have found use cases where application_name is changed at runtime to make it easier to monitor various portions of complex scripts, but I don't think this is much of a concern for ClassDB. We may want to look into disallowing students from using SET application_name, however.
It should be OK if import is run when connections are open, because the update will only be initiated by new disconnection rows (eg. using RETURNING). So any open connections will just be left with a NULL applicationName. We may want to document that information on open connections is 'incomplete' however.

We would also need to keep this in mind for the test scripts. For example, testConnectionActivityLogging.psql currently runs import as a test instructor, and counts that connection as one expected to be found. if we use @smurthys suggestion of a custom applicationName to filter new connection records, then this connection would be missed due to applicationName being NULL until disconnect.

smurthys commented 6 years ago

I agree there is no reason to assert app name. For both connection and disconnection messages, simply import whatever is in the app name column.

To reduce development/testing effort, at this point, I don't think it is necessary to patch app name in connection rows from disconnection rows, but it is OK to do so though as @afig says the patching is not effective for a connection that doesn't (yet) have a corresponding disconnection.

BTW, I am curious to know if there are instances of connection and/or disconnection log entries with more than one line, and if perhaps app name is set in the 2nd line or later for connection entries. Just wondering.

wildtayne commented 6 years ago

There are usually 2-3 lines per connection, none of which have application_name set. There is only 1 disconnection line, which does have application_name set.

Edit: Also, as can be seen in the sample logs I posted, the connection authorized line is the one which signifies the connection has open successfully.

wildtayne commented 6 years ago

Actually, based on the docs @afig linked, SessionID should probably be VARCHAR(17), since it consists of two 4-byte hexadecimal numbers without leading zeroes. We're generally seeing the PID part of SessionID be 3 digits because most systems max out PID at 32768 (and don't have a ton of processes running), but it is possible to configure a system with a greater range.

wildtayne commented 6 years ago

I just encountered a bit of an issue with the upgrade ALTER TABLE statements for ClassDB.ConnectionActivity. I was thinking SessionID needs to be NOT NULL. However when you add this to a table with existing rows, you get an error because SessionID is NULL in the existing rows. I was thinking maybe we could add a query that sets SessionID to something like 00000000.00000000 where SessionID is NULL. This value should never be encountered "in the wild", because the timestamp part is stored as epoch time. Rather, we could add a temporary default of 00000000.00000000 and then DROP it immediately.

smurthys commented 6 years ago

Indeed the DBMS would not permit a new NOT NULL without a default non-NULL value. Thus, I agree the only alternatives are to permit NULL or insert a safe default value. However, I am not sure I understand the bit about "DROP it immediately" in @srrollo's comment.

BTW, I don't readily see any problem permitting NULL in the SessionID column because the table can be populated only by the ClassDB role. Am I missing something?

I will leave here a starter-set of queries for those wanting to experiment.

CREATE TABLE t(a INTEGER);
INSERT INTO t VALUES (5), (7), (8);
ALTER TABLE t ADD COLUMN b VARCHAR NOT NULL; --ERROR:  column "b" contains null values
ALTER TABLE t ADD COLUMN b VARCHAR NOT NULL DEFAULT 'xyz'; --works

wildtayne commented 6 years ago

The solution I though of would be the following. Add the column with a default initially, which populates all the existing rows with the default value. Then, drop the default constraint.

--Set a temporary default to add a value to existing rows (because of NOT NULL)
ALTER TABLE IF EXISTS ClassDB.ConnectionActivity
   ADD COLUMN IF NOT EXISTS SessionID VARCHAR(17) NOT NULL DEFAULT '00000000.00000000';

--Drop the temporary default
ALTER TABLE IF EXISTS ClassDB.ConnectionActivity
   ALTER COLUMN SessionID DROP DEFAULT;

I suppose SessionID could be NULL, however that is inconsistent with the other attributes. All are NOT NULL except for ApplicationName (because it is expected to be NULL sometimes). Additionally, if SessionID is ever NULL, something is definitely wrong with either the import or log format.

smurthys commented 6 years ago

Thanks @srrollo for clarifying your thought on the drop action at end. I now understand your idea is to fill a default value only for connection entries already in the table. 👍

BTW, I assume these alterations are done only if the column did not already exist. Perhaps obvious, but just stating.

Also, I wonder if a wiki page in the repo, or a wiki tab in the team, outlining the solution is a better idea than to incrementally designing the solution here.

DASSL / ClassDB

Connection logging should log disconnections (N) #206