irods / irods_capability_storage_tiering


many more database connections than expected #204

Closed davebiffuk closed 8 months ago

davebiffuk commented 2 years ago

Hi,

We're using irods-database-plugin-postgres 4.2.7, irods-server 4.2.7, and irods-rule-engine-plugin-storage-tiering 2.7.0.

With this setting in server_config.json:

"maximum_number_of_concurrent_rule_engine_server_processes": 56

(recently increased from 32, though it is unclear whether the config change has been read yet), we had expected to see approximately that many connections to the supporting database. However, we are actually seeing around 1080, which has raised concerns with our DBA colleagues that we might reach the database's connection limit. The number generally corresponds to the number of irodsServer processes:

irods@irods-seq-indexing:~$ ps fauxww | grep -c irodsServer ; netstat -t | grep dbsrv6 | wc -l
1081
1082

We're wondering if there might be a connection leak or overly-long idle timeout, but not sure how to investigate. Please let me know what information I can provide to help.

Thanks, Dave
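
One way to start investigating the leak-versus-idle-timeout question is to look at the state and age of each connection on the PostgreSQL side. The following is only a minimal sketch: it assumes the rows have already been fetched from the standard `pg_stat_activity` view by whatever client is available, and the function name `summarize` is hypothetical. Many long-lived `idle` connections would point toward connections being held open rather than actively used.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# A query an administrator could run to obtain the rows (pg_stat_activity is a
# standard PostgreSQL view; how it is executed is left out of this sketch).
QUERY = "SELECT pid, client_addr, state, state_change FROM pg_stat_activity"


def summarize(rows, now):
    """Count connections per state and find the longest-idle connection.

    rows: list of (pid, client_addr, state, state_change) tuples.
    now:  timezone-aware datetime used as the reference point.
    """
    by_state = Counter(state for _, _, state, _ in rows)
    idle_ages = [now - changed for _, _, state, changed in rows if state == "idle"]
    oldest_idle = max(idle_ages, default=timedelta(0))
    return by_state, oldest_idle
```

Comparing the `idle` count with the number of irodsServer processes over time may show whether connections are being released at all.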

korydraughn commented 2 years ago

This should be resolved in 4.2.8. See https://github.com/irods/irods/issues/4616.

Please verify by running iRODS 4.2.8 in a test environment.

Let us know how it goes.

kript commented 2 years ago

4.2.8 (s/8/11/) isn't a near-term option for us, but we now have multiple uses for the rule engine. How safe is it, in the interim, to run a script that kills long-running open database connections from the delay server? Is there a way to determine that the connections came from the rule engine and not from other iRODS processes?

alanking commented 2 years ago

I won't speak to the safety question, but...

Maybe you could detect the client host/IP of each connection and match it against the host/IP of the machine on which the delay server is running? A match would indicate that the connection is coming from the same computer, which most likely means it is coming from the delay server.
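
The host/IP matching suggested above could be sketched roughly as follows. This is an illustration under stated assumptions, not a vetted procedure: the rows are assumed to come from PostgreSQL's `pg_stat_activity` view as `(pid, client_addr, state, state_change)` tuples, and the function name `candidates`, the example address, and the idle threshold are all hypothetical. It only identifies connections. It does not terminate anything, and it cannot tell apart different iRODS processes running on the same host.

```python
from datetime import datetime, timedelta, timezone
from ipaddress import ip_address


def candidates(rows, delay_server_ip, now, min_idle):
    """Return pids of idle connections that originate from the delay server host.

    rows:            list of (pid, client_addr, state, state_change) tuples.
    delay_server_ip: address of the machine running the delay server (assumed known).
    now:             timezone-aware reference time.
    min_idle:        timedelta; only connections idle at least this long are listed.
    """
    target = ip_address(delay_server_ip)
    return [
        pid
        for pid, addr, state, changed in rows
        if addr is not None              # local socket connections have no address
        and ip_address(addr) == target   # same host as the delay server
        and state == "idle"
        and now - changed >= min_idle
    ]
```

For example, `candidates(rows, "10.0.0.5", datetime.now(timezone.utc), timedelta(hours=1))` would list connections from that host that have been idle for an hour or more.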

alanking commented 8 months ago

@korydraughn claims this may be fixed in 4.2.8, and I agree. Is there anything more that we can do here?

korydraughn commented 8 months ago

It's been more than a year since your previous comment.

I think we can close this. Users can always open new issues.