rbeckman-nextgen / test-mc2


Unable to Halt a database connector that hangs in 'Polling' state. #3282

Open rbeckman-nextgen opened 4 years ago

rbeckman-nextgen commented 4 years ago

During network and/or firewall problems, something went wrong with a database connection in our test environment, and a Mirth channel (a Database Reader with the Keep Connection Open option set to true) got stuck in the 'Polling' state. (The reason the DB query hung isn't a Mirth problem.)

The Mirth problem was this: stopping the channel didn't work (which I expected), and halting the channel didn't do anything either. It stayed in the halted state, probably unable to terminate the DB connection, and never stopped.

Because of this problem, stopping or restarting Mirth Connect was no longer possible either.

This problem may be related to MIRTH-3366.

Imported Issue. Original Details: Jira Issue Key: MIRTH-3403 Reporter: amc_cru Created: 2014-08-13T02:13:50.000-0700

rbeckman-nextgen commented 4 years ago

Driver used by the database-reader channel: jdbc:jtds:sqlserver://.....

Imported Comment. Original Details: Author: amc_cru Created: 2014-08-13T02:30:13.000-0700

rbeckman-nextgen commented 4 years ago

Others are running into this as well, also using jTDS. This is due to a couple of things. First, the database reader doesn't support halting at all right now. The delegate interface doesn't even have a halt method. Second, both the reader and writer query delegates just close the Connection object. However, that could block, depending on the driver. To do a proper halt, we should be calling the abort method, and possibly also the setNetworkTimeout method when the connection first gets created.
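For reference, here is a rough sketch of what a halt path based on those two JDBC 4.1 methods could look like. `ConnectionHalter` and its method names are made up for illustration and are not Mirth Connect code; only `Connection.setNetworkTimeout` and `Connection.abort` are real JDBC calls, and whether they do anything useful depends entirely on the driver.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.concurrent.Executor;

// Hypothetical helper for illustration only; not part of Mirth Connect.
final class ConnectionHalter {

    // Call once, right after the connection is created, so that a read on a
    // dead socket can eventually fail instead of blocking forever.
    static void applyNetworkTimeout(Connection conn, Executor executor, int timeoutMillis)
            throws SQLException {
        conn.setNetworkTimeout(executor, timeoutMillis);
    }

    // Force-terminate the connection. abort() marks the connection closed and
    // hands resource cleanup to the executor, rather than doing an orderly
    // close() on the calling thread.
    static void halt(Connection conn, Executor executor) throws SQLException {
        conn.abort(executor);
    }
}
```

The executor passed to `abort` should not be the thread that is already stuck on the connection; a small dedicated pool would be one option.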

However, jTDS is dumb and doesn't support either of those methods. In the JtdsConnection class it literally just throws an AbstractMethodError, nothing else. So as far as I can tell, it's impossible to force-halt a hanging jTDS connection, unless we override the class and implement those methods ourselves. PostgreSQL's driver does implement them, so it should work fine there. Not sure about other drivers. Maybe Microsoft's JDBC driver does.
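Since driver support varies, any halt code would have to guard the call. A minimal sketch, assuming a hypothetical `AbortSupport.tryAbort` helper (not an existing Mirth or JDBC API): it treats both `AbstractMethodError` (what jTDS throws, per the above) and `SQLFeatureNotSupportedException` as "this driver cannot abort".

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.concurrent.Executor;

// Hypothetical helper for illustration only.
final class AbortSupport {

    // Returns true if abort() was invoked without the driver rejecting it,
    // false if the driver does not implement it. Other SQLExceptions propagate.
    static boolean tryAbort(Connection conn, Executor executor) throws SQLException {
        try {
            conn.abort(executor);
            return true;
        } catch (AbstractMethodError | SQLFeatureNotSupportedException e) {
            // Driver predates JDBC 4.1 or stubs the method out.
            return false;
        }
    }
}
```

When this returns false, the caller is left with the blocking `close()` as the only option, which is the situation described above.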

This is very easy to reproduce. I just use a VM with SQL Server on it, and a Database Writer channel that invokes WAITFOR. While a message is processing I suspend the VM. After that, the channel cannot be stopped or halted, and requires the entire server to be restarted.

Imported Comment. Original Details: Author: narupley Created: 2015-04-30T12:07:41.000-0700

rbeckman-nextgen commented 4 years ago

Doesn't look like Microsoft's JDBC driver supports those methods either. So as far as SQL Server goes there's nothing that can be done, unless as I said we alter jTDS to support it.

It sucks that in cases like this the channel basically can't be used at all, but the only alternative is to spawn the possibly-forever-blocking operations in a separate thread, and when something like this happens we just try our best and then forget about the thread. Then the channel will be able to stop, and can be used, redeployed, etc. It's just that in the JVM you'll still have lingering threads that could stick around forever. What's worse? A possible thread leak, or forcing the user to restart the entire server? In the thread leak case we would obviously send some error to the server log letting the user know that it's happening, and that they should restart the server when it's convenient. I think that's better than the channel being in a perpetually unusable state, wherein the user is forced to either restart the server immediately, or abandon/clone the channel.
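To make the trade-off concrete, here is a sketch of the "try our best, then forget the thread" idea. `AbandonableTask`, `runOrAbandon`, and the log message are all hypothetical; this is not how Mirth Connect implements anything, just one way the approach could look. The worker is a daemon thread so that a leaked one does not also block JVM shutdown.

```java
import java.util.concurrent.TimeUnit;
import java.util.logging.Logger;

// Hypothetical helper for illustration only.
final class AbandonableTask {
    private static final Logger LOG = Logger.getLogger(AbandonableTask.class.getName());

    // Runs blockingOp on its own daemon thread and waits up to the timeout.
    // Returns true if it finished, false if it was abandoned (thread leaked).
    static boolean runOrAbandon(Runnable blockingOp, long timeout, TimeUnit unit)
            throws InterruptedException {
        Thread worker = new Thread(blockingOp, "abandonable-db-op");
        worker.setDaemon(true);
        worker.start();
        // join(0) would wait forever, so wait at least 1 ms.
        worker.join(Math.max(1L, unit.toMillis(timeout)));
        if (worker.isAlive()) {
            worker.interrupt(); // best effort; a stuck socket read may ignore it
            LOG.warning("Operation did not finish in time; abandoning thread '"
                    + worker.getName() + "'. Restart the server when convenient.");
            return false;
        }
        return true;
    }
}
```

A stop or halt handler could wrap the connection `close()` call like `AbandonableTask.runOrAbandon(() -> closeQuietly(conn), 10, TimeUnit.SECONDS)`, where `closeQuietly` is again a made-up name, and carry on with the channel shutdown either way.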

Imported Comment. Original Details: Author: narupley Created: 2015-04-30T12:29:09.000-0700

rbeckman-nextgen commented 4 years ago

[http://www.mirthcorp.com/community/forums/showthread.php?t=14253]

Imported Comment. Original Details: Author: narupley Created: 2015-05-14T07:47:43.000-0700

rbeckman-nextgen commented 4 years ago

Is there a workaround for this? We are encountering this in our production (supported) environment.

Imported Comment. Original Details: Author: justinsk Created: 2019-07-25T17:26:34.000-0700

rbeckman-nextgen commented 4 years ago

We are currently having a similar problem with a Connector Type of File Server and a Method of FTP. Just last week the same problem occurred with a Connector Type of File Server, with Method of FTP and with Method of File, and a Directory pointing to a networked server. When the network server had problems, Mirth Connect didn't handle this well and started using so much CPU that other channels couldn't get their work done, and we had to reboot the server. We are using Mirth version 3.5.2. Justin Kaltenbach, what version are you using? We, too, are using a supported version.

Imported Comment. Original Details: Author: mulleg Created: 2019-08-02T11:55:14.000-0700

rbeckman-nextgen commented 4 years ago

Would like to know of a workaround as well. This is locking up our production workflow about once every 2 weeks. It occurs with the following configuration, connecting to SQL Azure using Mirth Connect Server 3.5.2, Java version 1.8.0_121:

Database Reader

Use Javascript: No
Keep Connection Open: No
Aggregate Results: No
Cache Results: Yes
Retries on Error: 3
Retry Interval: 10000

sqljdbc42.jar

Imported Comment. Original Details: Author: sesq Created: 2019-08-28T12:32:07.000-0700