ministryofjustice / hmpps-probation-integration-services

A collection of small, domain-focused integrations to support HMPPS Digital services that need to interact with probation data.
https://ministryofjustice.github.io/hmpps-probation-integration-services/tech-docs
MIT License

PI-2330 Fix truncated allocations report due to async timeout #4039

Closed marcus-bcl closed 3 months ago

marcus-bcl commented 3 months ago

The exception is a "Socket interrupted" error thrown exactly 30 seconds after the request starts, while the Oracle JDBC driver is reading data from the database. The error is logged, but because the response headers have already been sent, the status is still 200 OK. The interruption happens in Oracle's TimeoutSocketChannel class. When this class opens a DB connection or starts reading data, it schedules an interruption to occur after a given socket timeout (soTimeout). This sent me down a rabbit hole, on the assumption that the JDBC driver itself was raising the error after 30 seconds. However, the socket timeout defaults to 0 (meaning no timeout), so it wasn't that.

It turns out Spring was interrupting the thread after its async timeout, which defaults to 30 seconds. I did look into this earlier on, but there must have been something wrong with my testing, as I'm sure changing that value didn't fix it at first...
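
For illustration, a minimal sketch of how the Spring MVC async timeout can be raised in a Spring Boot application's configuration. `spring.mvc.async.request-timeout` is the standard Spring Boot property for this setting. The `5m` value is an assumption chosen to match the other timeouts listed in this description, and is not necessarily the value used in this PR.

```yaml
spring:
  mvc:
    async:
      # Assumed value for illustration only: allows async request processing
      # (e.g. a long-running streamed report) to run for up to 5 minutes
      # before Spring times out the request.
      request-timeout: 5m
```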

I've also added a sanity check to the cron job. It checks that the report contains records from January 2024, which are the last records returned due to the ordering of the SQL query, and retries if they are missing.


For future reference, other timeouts I tried adjusting when it seemed like a DB timeout:

server:
  tomcat:
    connection-timeout: 5m
    keep-alive-timeout: 5m
spring:
  datasource:
    hikari:
      data-source-properties:
        oracle.net.CONNECT_TIMEOUT: 300_000
        oracle.net.READ_TIMEOUT: 300_000
        oracle.jdbc.ReadTimeout: 300_000
        oracle.jdbc.defaultConnectionValidation: LOCAL
        socketTimeout: 300
  jpa:
    properties:
      jakarta.persistence.query.timeout: 300_000