NOAA-OWP / wres

Code and scripts for the Water Resources Evaluation Service
Other
2 stars 1 forks source link

As a user, I expect a source lock to be re-acquired after a connection is dropped for a source undergoing ingest #263

Open epag opened 3 months ago

epag commented 3 months ago

Author Name: James (James) Original Redmine Issue: 94128, https://vlab.noaa.gov/redmine/issues/94128 Original Date: 2021-07-13


Given an evaluation in progress that loses a database connection when a source is being ingested When that evaluation attempts to recover Then it should make a best effort to recover the source lock

( So that I can be sure that, when I see behavior like #94126, a best effort has been made to recover and that the situation is unrecoverable. )


Related issue(s): #261 Redmine related issue(s): 93204, 94168


epag commented 3 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-07-13T12:03:59Z


tl;dr, investigate #94126. Unclear on cursory inspection whether the software is working as intended or not. It appears to be working insofar as a finite number of attempts are made to re-acquire the lock on a source ingest in progress and, once exhausted, an exception is thrown. This seems logical, but I want to be sure that the semantics are working as expected, on the one hand, and cannot be made more resilient, on the other (this is a trade-off between recovery where possible and failing early when not).

Another minor issue is that the standard out attached to #94126 shows logging of complete time-series whose ingest failed, which may overwhelm the service in some cases (we have examples of this in other contexts if the time-series is sufficiently long). It should be replaced with logging of the ts metadata only.

epag commented 3 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-07-13T12:06:44Z


Hank, would you mind uploading the postgres log and @check_mk@ monitoring for roughly the time spanned by #94126? Jesse is out this week and he is the author of the db locking, so it would be a more efficient use of his time than mine (edit: that is to say, I would need to invest significant time to understand it before determining whether a fix were needed and, if so, fixing it). The job url and standard out is attached to #94126. Thanks!

epag commented 3 months ago

Original Redmine Comment Author Name: Hank (Hank) Original Date: 2021-07-13T13:23:27Z


Log is attached.

What Check_MK displays would you like to see?

Hank

epag commented 3 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-07-13T13:30:11Z


Not sure what displays might be available for networking, but everything you can (or within reason, what makes sense to you) for the prod db server, I guess.

epag commented 3 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-07-13T13:42:26Z


Relevant part of the postgres log:

2021-07-13 03:29:03.099 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db6 LOG:  disconnection: session time: 0:00:18.212 user=wres_user8 database=wres8 host=10.3.10.104 port=59538
2021-07-13 03:29:03.168 UTC 10.3.10.104 wres8 wres_user8 60ed031b.37ca LOG:  disconnection: session time: 0:22:59.758 user=wres_user8 database=wres8 host=10.3.10.104 port=48014
2021-07-13 03:29:03.369 UTC 10.3.10.104 wres8 wres_user8 60ed018d.32bb LOG:  disconnection: session time: 0:29:38.030 user=wres_user8 database=wres8 host=10.3.10.104 port=58870
2021-07-13 03:29:03.370 UTC 10.3.10.104 [unknown] [unknown] 60ed087f.4dcc LOG:  connection received: host=10.3.10.104 port=59046
2021-07-13 03:29:03.387 UTC 10.3.10.104 wres8 wres_user8 60ed087f.4dcc LOG:  connection authorized: user=wres_user8 database=wres8 SSL enabled (protocol=TLSv1.2, cipher=ECDHE-RSA-AES128-GCM-SHA256, compression=off)
2021-07-13 03:29:03.446 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db4 LOG:  incomplete message from client
2021-07-13 03:29:03.446 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db4 CONTEXT:  COPY timeseriesvalue_lead_above_150, line 7376
2021-07-13 03:29:03.446 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db4 STATEMENT:  COPY wres.TimeSeriesValue_Lead_Above_150 ( lead,series_value,timeseries_id ) FROM STDIN WITH DELIMITER '|'
2021-07-13 03:29:03.446 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db4 ERROR:  unexpected EOF on client connection with an open transaction
2021-07-13 03:29:03.446 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db4 CONTEXT:  COPY timeseriesvalue_lead_above_150, line 7376
2021-07-13 03:29:03.446 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db4 STATEMENT:  COPY wres.TimeSeriesValue_Lead_Above_150 ( lead,series_value,timeseries_id ) FROM STDIN WITH DELIMITER '|'
2021-07-13 03:29:03.447 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db4 FATAL:  terminating connection because protocol synchronization was lost
2021-07-13 03:29:03.447 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db4 LOG:  disconnection: session time: 0:00:18.613 user=wres_user8 database=wres8 host=10.3.10.104 port=59536
epag commented 3 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-07-13T13:42:55Z


Looks like the problem started on the client side, but the real cause could be a few different things, I guess, including a network interruption or an abrupt close (edit: of the connection) by the client during a transaction.

epag commented 3 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-07-13T13:46:14Z


In any case, whatever happen occurred within a very short period because a connection from @wres_user8@ was received and authorized a few ms later.

epag commented 3 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-07-13T13:50:33Z


Ah, just spotted these two lines about PT1S earlier, which is potentially relevant too:

2021-07-13 03:29:02.244 UTC 10.3.10.104 wres8 wres_user8 60ed087d.4dc9 WARNING:  you don't own a lock of type ExclusiveLock
2021-07-13 03:29:02.307 UTC 10.3.10.104 wres8 wres_user8 60ed087d.4dc9 WARNING:  you don't own a lock of type ExclusiveLock

Looks like the client attempted to release a lock that it didn't own, so that may indicate some error in the locking semantics on the client side, not sure. Again, I think this will need to wait until Jesse returns, if I'm not to sink a ton of time into it, so just trying to gather sufficient details here.

epag commented 3 months ago

Original Redmine Comment Author Name: Hank (Hank) Original Date: 2021-07-13T14:34:24Z


You requested Check_MK graphs in the other ticket. I've attached graphs related to CPU load and utilization, and disk I/O here. My Check_MK-foo is very limited, so I'm not really sure what else to get or how to better format them.

Hank

epag commented 3 months ago

Original Redmine Comment Author Name: Hank (Hank) Original Date: 2021-07-13T14:37:28Z


By the way, the job Jim reported in #94126 started at 2021-07-13T03:28:48.713+0000 and ended at 2021-07-13T03:29:09.138+0000. With plots provided in CST, the time to examine is around Monday at 22:30 hours.

Hank

epag commented 3 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-07-13T14:52:48Z


I did, but I was thinking in terms of network health (traffic and status). In any case, on inspecting the postgres log, it looks unlikely that we'd see anything in @check_mk@.

epag commented 3 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-07-14T14:19:55Z


Some wild speculation in #94168-7. Again, I could look at this properly, but it would require me to fully review the implementation first, in order to understand what it is doing and why.

epag commented 3 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-07-14T14:23:56Z


As to why this has suddenly started to appear, best guess would be nwcal networking in the recent day or two.

epag commented 3 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-07-14T14:24:52Z


Btw, feel free to rename this ticket, upon investigation, once the real problem is discovered.

epag commented 3 months ago

Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-07-19T17:02:52Z


Here is the first thing: @2021-07-13T03:29:06.839+0000 WARN DatabaseLockManagerPostgres About to restore a connection to database (1), lost org.postgresql.jdbc.PgConnection@2fa59863@

From WRES' perspective, it lost that connection.

epag commented 3 months ago

Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-07-19T17:09:08Z


Probable section of postgres logs for full evaluation:

2021-07-13 03:28:42.731 UTC 10.3.10.104 [unknown] [unknown] 60ed086a.4da9 LOG:  connection received: host=10.3.10.104 port=59512
2021-07-13 03:28:42.731 UTC 10.3.10.104 [unknown] [unknown] 60ed086a.4da9 LOG:  incomplete startup packet
2021-07-13 03:28:42.906 UTC 10.3.10.104 [unknown] [unknown] 60ed086a.4daa LOG:  connection received: host=10.3.10.104 port=59514
2021-07-13 03:28:43.812 UTC 10.3.10.104 wres8 wres_user8 60ed086a.4daa LOG:  connection authorized: user=wres_user8 database=wres8 SSL enabled (protocol=TLSv1.2, cipher=ECDHE-RSA-AES128-GCM-SHA256, compression=off)
2021-07-13 03:28:44.456 UTC 10.3.10.104 wres8 wres_user8 60ed086a.4daa LOG:  disconnection: session time: 0:00:01.550 user=wres_user8 database=wres8 host=10.3.10.104 port=59514
2021-07-13 03:28:44.488 UTC 10.3.10.104 [unknown] [unknown] 60ed086c.4dac LOG:  connection received: host=10.3.10.104 port=59516
2021-07-13 03:28:44.550 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4dac LOG:  connection authorized: user=wres_user8 database=wres8 SSL enabled (protocol=TLSv1.2, cipher=ECDHE-RSA-AES128-GCM-SHA256, compression=off)
2021-07-13 03:28:44.568 UTC 10.3.10.104 [unknown] [unknown] 60ed086c.4dad LOG:  connection received: host=10.3.10.104 port=59518
2021-07-13 03:28:44.598 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4dad LOG:  connection authorized: user=wres_user8 database=wres8 SSL enabled (protocol=TLSv1.2, cipher=ECDHE-RSA-AES128-GCM-SHA256, compression=off)
2021-07-13 03:28:44.669 UTC 10.3.10.104 [unknown] [unknown] 60ed086c.4dae LOG:  connection received: host=10.3.10.104 port=59520
2021-07-13 03:28:44.705 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4dae LOG:  connection authorized: user=wres_user8 database=wres8 SSL enabled (protocol=TLSv1.2, cipher=ECDHE-RSA-AES128-GCM-SHA256, compression=off)
2021-07-13 03:28:44.711 UTC 10.3.10.104 [unknown] [unknown] 60ed086c.4daf LOG:  connection received: host=10.3.10.104 port=59522
2021-07-13 03:28:44.717 UTC 10.3.10.104 [unknown] [unknown] 60ed086c.4db0 LOG:  connection received: host=10.3.10.104 port=59524
2021-07-13 03:28:44.748 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4daf LOG:  connection authorized: user=wres_user8 database=wres8 SSL enabled (protocol=TLSv1.2, cipher=ECDHE-RSA-AES128-GCM-SHA256, compression=off)
2021-07-13 03:28:44.761 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db0 LOG:  connection authorized: user=wres_user8 database=wres8 SSL enabled (protocol=TLSv1.2, cipher=ECDHE-RSA-AES128-GCM-SHA256, compression=off)
2021-07-13 03:28:44.763 UTC 10.3.10.104 [unknown] [unknown] 60ed086c.4db1 LOG:  connection received: host=10.3.10.104 port=59528
2021-07-13 03:28:44.773 UTC 10.3.10.104 [unknown] [unknown] 60ed086c.4db2 LOG:  connection received: host=10.3.10.104 port=59530
2021-07-13 03:28:44.801 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db1 LOG:  connection authorized: user=wres_user8 database=wres8 SSL enabled (protocol=TLSv1.2, cipher=ECDHE-RSA-AES128-GCM-SHA256, compression=off)
2021-07-13 03:28:44.819 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db2 LOG:  connection authorized: user=wres_user8 database=wres8 SSL enabled (protocol=TLSv1.2, cipher=ECDHE-RSA-AES128-GCM-SHA256, compression=off)
2021-07-13 03:28:44.820 UTC 10.3.10.104 [unknown] [unknown] 60ed086c.4db3 LOG:  connection received: host=10.3.10.104 port=59534
2021-07-13 03:28:44.833 UTC 10.3.10.104 [unknown] [unknown] 60ed086c.4db4 LOG:  connection received: host=10.3.10.104 port=59536
2021-07-13 03:28:44.875 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db3 LOG:  connection authorized: user=wres_user8 database=wres8 SSL enabled (protocol=TLSv1.2, cipher=ECDHE-RSA-AES128-GCM-SHA256, compression=off)
2021-07-13 03:28:44.883 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db4 LOG:  connection authorized: user=wres_user8 database=wres8 SSL enabled (protocol=TLSv1.2, cipher=ECDHE-RSA-AES128-GCM-SHA256, compression=off)
2021-07-13 03:28:44.887 UTC 10.3.10.104 [unknown] [unknown] 60ed086c.4db6 LOG:  connection received: host=10.3.10.104 port=59538
2021-07-13 03:28:44.892 UTC 10.3.10.104 [unknown] [unknown] 60ed086c.4db7 LOG:  connection received: host=10.3.10.104 port=59540
2021-07-13 03:28:44.924 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db7 LOG:  connection authorized: user=wres_user8 database=wres8 SSL enabled (protocol=TLSv1.2, cipher=ECDHE-RSA-AES128-GCM-SHA256, compression=off)
2021-07-13 03:28:44.936 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db6 LOG:  connection authorized: user=wres_user8 database=wres8 SSL enabled (protocol=TLSv1.2, cipher=ECDHE-RSA-AES128-GCM-SHA256, compression=off)
2021-07-13 03:28:44.940 UTC 10.3.10.104 [unknown] [unknown] 60ed086c.4db8 LOG:  connection received: host=10.3.10.104 port=59542
2021-07-13 03:28:44.974 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db8 LOG:  connection authorized: user=wres_user8 database=wres8 SSL enabled (protocol=TLSv1.2, cipher=ECDHE-RSA-AES128-GCM-SHA256, compression=off)
2021-07-13 03:28:44.993 UTC 10.3.10.104 [unknown] [unknown] 60ed086c.4db9 LOG:  connection received: host=10.3.10.104 port=59544
2021-07-13 03:28:45.024 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db9 LOG:  connection authorized: user=wres_user8 database=wres8 SSL enabled (protocol=TLSv1.2, cipher=ECDHE-RSA-AES128-GCM-SHA256, compression=off)
2021-07-13 03:28:45.622 UTC 10.3.10.104 [unknown] [unknown] 60ed086d.4dba LOG:  connection received: host=10.3.10.104 port=59546
2021-07-13 03:28:45.644 UTC 10.3.10.104 wres8 wres_user8 60ed086d.4dba LOG:  connection authorized: user=wres_user8 database=wres8 SSL enabled (protocol=TLSv1.2, cipher=ECDHE-RSA-AES128-GCM-SHA256, compression=off)
2021-07-13 03:28:45.652 UTC 10.3.10.104 [unknown] [unknown] 60ed086d.4dbb LOG:  connection received: host=10.3.10.104 port=59548
2021-07-13 03:28:45.677 UTC 10.3.10.104 wres8 wres_user8 60ed086d.4dbb LOG:  connection authorized: user=wres_user8 database=wres8 SSL enabled (protocol=TLSv1.2, cipher=ECDHE-RSA-AES128-GCM-SHA256, compression=off)
2021-07-13 03:28:48.476 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4daf ERROR:  could not serialize access due to read/write dependencies among transactions
2021-07-13 03:28:48.476 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4daf DETAIL:  Reason code: Canceled on identification as a pivot, during write.
2021-07-13 03:28:48.476 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4daf HINT:  The transaction might succeed if retried.
2021-07-13 03:28:48.476 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4daf STATEMENT:  INSERT INTO wres.Feature ( name, description, srid, wkt ) 
        SELECT $1, $2, $3, $4
        WHERE NOT EXISTS
        (
            SELECT 1
            FROM wres.Feature
            WHERE name = $5
                AND description = $6
                AND srid = $7 
                AND wkt = $8 
        )

    RETURNING *
2021-07-13 03:28:48.476 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4dad ERROR:  could not serialize access due to read/write dependencies among transactions
2021-07-13 03:28:48.476 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4dad DETAIL:  Reason code: Canceled on identification as a pivot, during write.
2021-07-13 03:28:48.476 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4dad HINT:  The transaction might succeed if retried.
2021-07-13 03:28:48.476 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4dad STATEMENT:  INSERT INTO wres.Feature ( name, description, srid, wkt ) 
        SELECT $1, $2, $3, $4
        WHERE NOT EXISTS
        (
            SELECT 1
            FROM wres.Feature
            WHERE name = $5
                AND description = $6
                AND srid = $7 
                AND wkt = $8 
        )

    RETURNING *
2021-07-13 03:28:48.476 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db3 ERROR:  could not serialize access due to read/write dependencies among transactions
2021-07-13 03:28:48.476 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db3 DETAIL:  Reason code: Canceled on identification as a pivot, during write.
2021-07-13 03:28:48.476 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db3 HINT:  The transaction might succeed if retried.
2021-07-13 03:28:48.476 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db3 STATEMENT:  INSERT INTO wres.Feature ( name, description, srid, wkt ) 
        SELECT $1, $2, $3, $4
        WHERE NOT EXISTS
        (
            SELECT 1
            FROM wres.Feature
            WHERE name = $5
                AND description = $6
                AND srid = $7 
                AND wkt = $8 
        )

    RETURNING *
2021-07-13 03:28:48.478 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db6 ERROR:  could not serialize access due to read/write dependencies among transactions
2021-07-13 03:28:48.478 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db6 DETAIL:  Reason code: Canceled on identification as a pivot, during write.
2021-07-13 03:28:48.478 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db6 HINT:  The transaction might succeed if retried.
2021-07-13 03:28:48.478 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db6 STATEMENT:  INSERT INTO wres.Feature ( name, description, srid, wkt ) 
        SELECT $1, $2, $3, $4
        WHERE NOT EXISTS
        (
            SELECT 1
            FROM wres.Feature
            WHERE name = $5
                AND description = $6
                AND srid = $7 
                AND wkt = $8 
        )

    RETURNING *
2021-07-13 03:29:01.984 UTC 10.3.10.104 [unknown] [unknown] 60ed087d.4dc9 LOG:  connection received: host=10.3.10.104 port=59566
2021-07-13 03:29:02.096 UTC 10.3.10.104 wres8 wres_user8 60ed0193.32c3 LOG:  disconnection: session time: 0:29:31.027 user=wres_user8 database=wres8 host=10.3.10.104 port=58876
2021-07-13 03:29:02.105 UTC 10.3.10.104 wres8 wres_user8 60ed018d.32bc LOG:  disconnection: session time: 0:29:36.747 user=wres_user8 database=wres8 host=10.3.10.104 port=58872
2021-07-13 03:29:02.142 UTC 10.3.10.104 wres8 wres_user8 60ed087d.4dc9 LOG:  connection authorized: user=wres_user8 database=wres8 SSL enabled (protocol=TLSv1.2, cipher=ECDHE-RSA-AES128-GCM-SHA256, compression=off)
2021-07-13 03:29:02.177 UTC 10.3.10.104 [unknown] [unknown] 60ed087e.4dca LOG:  connection received: host=10.3.10.104 port=59044
2021-07-13 03:29:02.200 UTC 10.3.10.104 wres8 wres_user8 60ed087e.4dca LOG:  connection authorized: user=wres_user8 database=wres8 SSL enabled (protocol=TLSv1.2, cipher=ECDHE-RSA-AES128-GCM-SHA256, compression=off)
2021-07-13 03:29:02.244 UTC 10.3.10.104 wres8 wres_user8 60ed087d.4dc9 WARNING:  you don't own a lock of type ExclusiveLock
2021-07-13 03:29:02.307 UTC 10.3.10.104 wres8 wres_user8 60ed087d.4dc9 WARNING:  you don't own a lock of type ExclusiveLock
2021-07-13 03:29:02.308 UTC 10.3.10.104 wres8 wres_user8 60ed087d.4dc9 LOG:  disconnection: session time: 0:00:00.323 user=wres_user8 database=wres8 host=10.3.10.104 port=59566
2021-07-13 03:29:02.308 UTC 10.3.10.104 wres8 wres_user8 60ed086d.4dbb LOG:  disconnection: session time: 0:00:16.656 user=wres_user8 database=wres8 host=10.3.10.104 port=59548
2021-07-13 03:29:02.392 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4dac LOG:  disconnection: session time: 0:00:17.904 user=wres_user8 database=wres8 host=10.3.10.104 port=59516
2021-07-13 03:29:02.392 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4dae LOG:  disconnection: session time: 0:00:17.723 user=wres_user8 database=wres8 host=10.3.10.104 port=59520
2021-07-13 03:29:02.394 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db9 LOG:  disconnection: session time: 0:00:17.400 user=wres_user8 database=wres8 host=10.3.10.104 port=59544
2021-07-13 03:29:02.394 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db0 LOG:  disconnection: session time: 0:00:17.677 user=wres_user8 database=wres8 host=10.3.10.104 port=59524
2021-07-13 03:29:02.394 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db2 LOG:  disconnection: session time: 0:00:17.620 user=wres_user8 database=wres8 host=10.3.10.104 port=59530
2021-07-13 03:29:02.394 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db7 LOG:  disconnection: session time: 0:00:17.503 user=wres_user8 database=wres8 host=10.3.10.104 port=59540
2021-07-13 03:29:02.398 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db8 LOG:  disconnection: session time: 0:00:17.457 user=wres_user8 database=wres8 host=10.3.10.104 port=59542
2021-07-13 03:29:02.592 UTC 10.3.10.104 wres8 wres_user8 60ed01a0.32d1 LOG:  disconnection: session time: 0:29:17.653 user=wres_user8 database=wres8 host=10.3.10.104 port=58880
2021-07-13 03:29:02.728 UTC 10.3.10.104 wres8 wres_user8 60ed031a.37c7 LOG:  disconnection: session time: 0:23:00.630 user=wres_user8 database=wres8 host=10.3.10.104 port=58930
2021-07-13 03:29:03.024 UTC 10.3.10.104 wres8 wres_user8 60ed086d.4dba LOG:  disconnection: session time: 0:00:17.402 user=wres_user8 database=wres8 host=10.3.10.104 port=59546
2021-07-13 03:29:03.097 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4dad LOG:  disconnection: session time: 0:00:18.528 user=wres_user8 database=wres8 host=10.3.10.104 port=59518
2021-07-13 03:29:03.098 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4daf LOG:  disconnection: session time: 0:00:18.387 user=wres_user8 database=wres8 host=10.3.10.104 port=59522
2021-07-13 03:29:03.098 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db1 LOG:  disconnection: session time: 0:00:18.335 user=wres_user8 database=wres8 host=10.3.10.104 port=59528
2021-07-13 03:29:03.098 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db3 LOG:  disconnection: session time: 0:00:18.278 user=wres_user8 database=wres8 host=10.3.10.104 port=59534
2021-07-13 03:29:03.099 UTC 10.3.10.104 wres8 wres_user8 60ed086c.4db6 LOG:  disconnection: session time: 0:00:18.212 user=wres_user8 database=wres8 host=10.3.10.104 port=59538
2021-07-13 03:29:03.168 UTC 10.3.10.104 wres8 wres_user8 60ed031b.37ca LOG:  disconnection: session time: 0:22:59.758 user=wres_user8 database=wres8 host=10.3.10.104 port=48014
2021-07-13 03:29:03.369 UTC 10.3.10.104 wres8 wres_user8 60ed018d.32bb LOG:  disconnection: session time: 0:29:38.030 user=wres_user8 database=wres8 host=10.3.10.104 port=58870
epag commented 3 months ago

Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-07-20T21:10:36Z


Unfortunately this was able to be tested when I had VirtualBox and now it cannot be tested without, so any potential improvements will be speculative.

epag commented 3 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-07-20T21:55:29Z


I can install VirtualBox, so I may be able to help you out w/ testing if needed...

epag commented 3 months ago

Original Redmine Comment Author Name: Chris (Chris) Original Date: 2021-07-21T12:02:19Z


Unfortunately this was able to be tested when I had VirtualBox and now it cannot be tested without, so any potential improvements will be speculative.

I wonder if we should be building up a list for these sorts of events, but I don't know where we'd put them or what they'd eventually be used for. I feel like having a lengthy list will help us when IT asks the usual "WHY DO yoU EVEN NeEd vIRTuAlbOX Or dOCKer?!"

epag commented 3 months ago

Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-07-21T17:31:40Z


James wrote:

I can install VirtualBox, so I may be able to help you out w/ testing if needed...

I guess step one would be to try it out with the current version and kill a socket from outside WRES and outside postgres and see what happens. I doubt we will be able to exactly replicate the same situation but we might be able to verify that the changes don't break anything.

epag commented 3 months ago

Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-07-21T17:32:17Z


Chris wrote:

Unfortunately this was able to be tested when I had VirtualBox and now it cannot be tested without, so any potential improvements will be speculative.

I wonder if we should be building up a list for these sorts of events, but I don't know where we'd put them or what they'd eventually be used for. I feel like having a lengthy list will help us when IT asks the usual "WHY DO yoU EVEN NeEd vIRTuAlbOX Or dOCKer?!"

That sounds reasonable, but reason doesn't work with that group.