geonetwork / core-geonetwork

GeoNetwork is a catalog application to manage spatially referenced resources. It provides powerful metadata editing and search functions as well as an interactive web map viewer. It is currently used in numerous Spatial Data Infrastructure initiatives across the world.
http://geonetwork-opensource.org/
GNU General Public License v2.0

Connection pool issue with resource download #8441

Open ianwallen opened 1 month ago

ianwallen commented 1 month ago

Describe the bug There seems to be an issue with the connection pool holding on to connections while users are downloading files.

Since each download requires a database connection, it does not take long for users to exhaust the JDBC connection pool.

In our case the issue is with CMIS being slower than expected because the files are accessed through external APIs. This issue most likely exists with all external storage, such as CMIS/JCLOUD/S3, where performance is not as fast as the local file system. Also, in our case we are using large uploaded files, therefore there can be many users holding on to connections for several minutes while they download large files. (This issue applies to all filesystem storage.)

The current logic seems to be

To Reproduce Steps to reproduce the behavior: To make this issue easier to reproduce during testing, some code changes are required.

  1. First, modify the server so that resource downloads are slower. In our case we added a sleep in the API. Also set the maximum number of JDBC connections low.

  2. Load the application with some sample data, possibly including some 100MB files (although running this on a local filesystem may not make a big difference).

  3. Execute multiple resource downloads. Based on this sample, you would need at least 6 concurrent downloads.

  4. Once the connection pool is exhausted, you should see errors in the logs, or downloads will get slower and slower.
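
The effect of the steps above can be approximated without a running server. The following is only a rough simulation, not GeoNetwork code: a `Semaphore` stands in for the JDBC pool, `Thread.sleep` stands in for the slow download, and the class name, method name, and the pool size of 5 (inferred from "at least 6 concurrent downloads") are all assumptions made for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Simulates downloads that hold a pooled connection for their whole duration. */
class PoolExhaustionDemo {

    /**
     * Starts `downloads` simulated downloads at the same time against a pool of
     * `poolSize` connections. Returns how many gave up waiting for a connection.
     */
    static int runDownloads(int poolSize, int downloads, long downloadMillis, long maxWaitMillis)
            throws InterruptedException {
        Semaphore pool = new Semaphore(poolSize);      // stand-in for the JDBC pool
        AtomicInteger timeouts = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);  // releases all workers together
        ExecutorService executor = Executors.newFixedThreadPool(downloads);

        for (int i = 0; i < downloads; i++) {
            executor.submit(() -> {
                try {
                    start.await();
                    // Borrow a "connection", waiting at most maxWaitMillis for one.
                    if (!pool.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS)) {
                        timeouts.incrementAndGet();    // analogous to the pool timeout error
                        return;
                    }
                    try {
                        Thread.sleep(downloadMillis);  // connection held while "streaming"
                    } finally {
                        pool.release();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        start.countDown();
        executor.shutdown();
        executor.awaitTermination(30, TimeUnit.SECONDS);
        return timeouts.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // 6 downloads against 5 connections, each download outlasting the wait limit.
        System.out.println("timed out: " + runDownloads(5, 6, 1000, 200));
    }
}
```

In this model every download beyond the pool size fails as long as the downloads last longer than the wait limit, which mirrors the behaviour described in step 4.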

Expected behavior I would expect that file downloads should not be tied to the JDBC connection pool for their whole duration. It is true that at the beginning of the download request the system has to check the database for permissions and other information. However, before the result stream is sent, the connection should be released back to the pool. Alternatively, the download portion should be done in another thread where a JDBC connection is not required.
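
The expected behavior can be sketched as follows. This is a minimal illustration under stated assumptions, not the GeoNetwork implementation: a `Semaphore` stands in for the JDBC pool, and `canDownload` and `download` are hypothetical names. How this would map onto the application's actual transaction and session handling is not shown here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.concurrent.Semaphore;

/** Sketch: hold a pooled connection only for the permission check, not the transfer. */
class ReleaseBeforeStreaming {

    /** Hypothetical permission check; a real one would query the database. */
    static boolean canDownload(String user, String resource) {
        return user != null && !user.isEmpty();
    }

    /** Checks permissions while holding a "connection", releases it, then streams. */
    static long download(Semaphore pool, String user, String resource,
                         InputStream source, OutputStream target)
            throws IOException, InterruptedException {
        boolean allowed;
        pool.acquire();                    // borrow a "connection" from the pool
        try {
            allowed = canDownload(user, resource);
        } finally {
            pool.release();                // returned before any bytes are sent
        }
        if (!allowed) {
            throw new SecurityException("download not allowed");
        }

        // The slow part: no pooled connection is held while copying.
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = source.read(buffer)) != -1) {
            target.write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

With this ordering, the time a connection is held depends only on the permission check, so slow storage or large files no longer reduce the number of connections available to other requests.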

The end goal is to do the following.

Log file In our production environment we are seeing errors like the following

org.hibernate.exception.GenericJDBCException: Unable to acquire JDBC Connection
java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
java.util.NoSuchElementException: Timeout waiting for idle object

Desktop (please complete the following information):