AtlasOfLivingAustralia / image-service

Image repository and tiling services
https://images.ala.org.au

Images fails to connect to postgres after package upgrade #39

Closed ansell closed 5 years ago

ansell commented 6 years ago

Images fails to connect to postgres after a package upgrade and shows the following error in its log files:

2018-06-05 06:43:45,679 [quartzScheduler_Worker-2] ERROR util.JDBCExceptionReporter  - FATAL: terminating connection due to administrator command
2018-06-05 06:43:45,679 [quartzScheduler_Worker-2] ERROR util.JDBCExceptionReporter  - An I/O error occurred while sending to the backend.
2018-06-05 06:43:45,680 [quartzScheduler_Worker-2] ERROR util.JDBCExceptionReporter  - This connection has been closed.
2018-06-05 06:43:45,681 [quartzScheduler_Worker-2] INFO  images.LogService  - Exception thrown in job handler
2018-06-05 06:43:45,688 [quartzScheduler_Worker-2] INFO  images.LogService  - Error: could not inspect JDBC autocommit mode; nested exception is org.hibernate.exception.JDBCConnectionException: could not inspect JDBC autocommit mode
2018-06-05 06:43:45,688 [quartzScheduler_Worker-2] INFO  core.JobRunShell  - Job GRAILS_JOBS.au.org.ala.images.ProcessBackgroundTasksJob threw a JobExecutionException: 
org.quartz.JobExecutionException: org.springframework.dao.DataAccessResourceFailureException: could not inspect JDBC autocommit mode; nested exception is org.hibernate.exception.JDBCConnectionException: could not inspect JDBC autocommit mode [See nested exception: org.springframework.dao.DataAccessResourceFailureException: could not inspect JDBC autocommit mode; nested exception is org.hibernate.exception.JDBCConnectionException: could not inspect JDBC autocommit mode]
        at grails.plugins.quartz.GrailsJobFactory$GrailsJob.execute(GrailsJobFactory.java:111)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
Caused by: org.springframework.dao.DataAccessResourceFailureException: could not inspect JDBC autocommit mode; nested exception is org.hibernate.exception.JDBCConnectionException: could not inspect JDBC autocommit mode
        at org.grails.datastore.gorm.GormStaticApi$_methodMissing_closure2.doCall(GormStaticApi.groovy:101)
        at au.org.ala.images.SettingService.getOrCreateSetting(SettingService.groovy:109)
        at au.org.ala.images.SettingService.getSettingFromStack(SettingService.groovy:85)
        at au.org.ala.images.SettingService.getBoolSetting(SettingService.groovy:93)
        at au.org.ala.images.SettingService.getBackgroundTasksEnabled(SettingService.groovy:25)
        at au.org.ala.images.ProcessBackgroundTasksJob.execute(ProcessBackgroundTasksJob.groovy:17)
        at grails.plugins.quartz.GrailsJobFactory$GrailsJob.execute(GrailsJobFactory.java:104)
        ... 2 more
Caused by: org.hibernate.exception.JDBCConnectionException: could not inspect JDBC autocommit mode
        ... 9 more
Caused by: org.postgresql.util.PSQLException: This connection has been closed.
        at org.postgresql.jdbc2.AbstractJdbc2Connection.checkClosed(AbstractJdbc2Connection.java:820)
        at org.postgresql.jdbc2.AbstractJdbc2Connection.getAutoCommit(AbstractJdbc2Connection.java:781)
        ... 9 more
2018-06-05 06:43:45,689 [quartzScheduler_Worker-2] ERROR listeners.ExceptionPrinterJobListener  - Exception occurred in job: Grails Job
org.quartz.JobExecutionException: org.springframework.dao.DataAccessResourceFailureException: could not inspect JDBC autocommit mode; nested exception is org.hibernate.exception.JDBCConnectionException: could not inspect JDBC autocommit mode [See nested exception: org.springframework.dao.DataAccessResourceFailureException: could not inspect JDBC autocommit mode; nested exception is org.hibernate.exception.JDBCConnectionException: could not inspect JDBC autocommit mode]
        at grails.plugins.quartz.GrailsJobFactory$GrailsJob.execute(GrailsJobFactory.java:111)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
Caused by: org.springframework.dao.DataAccessResourceFailureException: could not inspect JDBC autocommit mode; nested exception is org.hibernate.exception.JDBCConnectionException: could not inspect JDBC autocommit mode
        at org.grails.datastore.gorm.GormStaticApi$_methodMissing_closure2.doCall(GormStaticApi.groovy:101)
        at au.org.ala.images.SettingService.getOrCreateSetting(SettingService.groovy:109)
        at au.org.ala.images.SettingService.getSettingFromStack(SettingService.groovy:85)
        at au.org.ala.images.SettingService.getBoolSetting(SettingService.groovy:93)
        at au.org.ala.images.SettingService.getBackgroundTasksEnabled(SettingService.groovy:25)
        at au.org.ala.images.ProcessBackgroundTasksJob.execute(ProcessBackgroundTasksJob.groovy:17)
        at grails.plugins.quartz.GrailsJobFactory$GrailsJob.execute(GrailsJobFactory.java:104)
        ... 2 more
Caused by: org.hibernate.exception.JDBCConnectionException: could not inspect JDBC autocommit mode
        ... 9 more
Caused by: org.postgresql.util.PSQLException: This connection has been closed.
        at org.postgresql.jdbc2.AbstractJdbc2Connection.checkClosed(AbstractJdbc2Connection.java:820)
        at org.postgresql.jdbc2.AbstractJdbc2Connection.getAutoCommit(AbstractJdbc2Connection.java:781)
        ... 9 more
2018-06-05 06:43:46,648 [quartzScheduler_Worker-3] ERROR util.JDBCExceptionReporter  - This connection has been closed.
2018-06-05 06:43:46,649 [quartzScheduler_Worker-3] ERROR util.JDBCExceptionReporter  - This connection has been closed.

A suggestion from Simon is that the code from the following section may be useful to make phylolink more resilient to database interruptions:

https://github.com/AtlasOfLivingAustralia/volunteer-portal/blob/develop/grails-app/conf/application.yml#L303

A possible location for the local changes required may be in:

https://github.com/AtlasOfLivingAustralia/image-service/blob/master/grails-app/conf/DataSource.groovy#L37
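
As a rough illustration of the kind of change being suggested, and not a copy of the linked volunteer-portal file, a Grails 2 `DataSource.groovy` can ask the connection pool to validate connections before handing them out, so that connections killed by a postgres restart are discarded rather than reused. The sketch below assumes the Tomcat JDBC pool (the default in Grails 2.3+); every value shown is an illustrative assumption, not a setting taken from image-service.

```groovy
// Sketch only: pool validation settings for grails-app/conf/DataSource.groovy.
// Assumes the Tomcat JDBC connection pool; all values are illustrative guesses.
dataSource {
    pooled = true
    driverClassName = "org.postgresql.Driver"
    properties {
        // Run the validation query before a connection is given to the application
        testOnBorrow = true
        // Also validate idle connections when the evictor thread runs
        testWhileIdle = true
        validationQuery = "SELECT 1"
        // Seconds to wait for the validation query before treating it as failed
        validationQueryTimeout = 3
        // Re-validate a given connection at most once per this many milliseconds
        validationInterval = 15000
        // How often the idle-connection evictor runs, in milliseconds
        timeBetweenEvictionRunsMillis = 5000
        // Minimum idle time before a connection may be evicted, in milliseconds
        minEvictableIdleTimeMillis = 60000
    }
}
```

With `testOnBorrow` enabled, a connection that was closed by the server should fail validation and be replaced, instead of surfacing as "This connection has been closed." inside a Quartz job. Whether this alone addresses the intermittent failure reported here has not been verified.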

djtfmartin commented 5 years ago

The resolution for this in phylolink was to upgrade the JDBC driver from 9.1 to a later version. Images is using 9.4, so I think this should be resolved now. Closing. It can be reopened if it occurs again.

ansell commented 5 years ago

This is still an issue. It happened when restarting the server yesterday.

djtfmartin commented 5 years ago

@ansell is this still an issue? If so, is there a method to reproduce it?

ansell commented 5 years ago

It is intermittent, but it has happened multiple times. Reproducing may involve restarting postgres while tomcat is running and checking if it reconnects, and if not, try again.

djtfmartin commented 5 years ago

The original issue was raised against the old Grails 2 version of the app, which used old JDBC drivers. Those drivers had connectivity issues that we also saw in phylolink, which used the same driver version.

The new version (1.0.8+) works well for VM restarts and postgres restarts - the latter tested today on NCI with start/stop and restart of service.

I'm going to close this, as we have no means of reproducing it with the current version and I believe it is fixed by the updates.