EI-CoreBioinformatics / reat

Robust Eukaryotic Annotation Toolkit
https://reat.readthedocs.io/en/latest/
MIT License

cromwell issue - exceeds configured limit of 1000000 #38

Open swarbred opened 2 years ago

swarbred commented 2 years ago

@ljyanesm sorry to bother you, but the wheat run failed with:

[2022-07-02 02:31:52,90] [error] SingleWorkflowRunnerActor received Failure message: Metadata for workflow ac749710-58fe-4309-af0f-169dc7eaa43b exists in database but cannot be served because row count of 2752958 exceeds configured limit of 1000000.
cromwell.services.MetadataTooLargeNumberOfRowsException: Metadata for workflow ac749710-58fe-4309-af0f-169dc7eaa43b exists in database but cannot be served because row count of 2752958 exceeds configured limit of 1000000.

From a quick Google search, this seems to relate to

https://github.com/broadinstitute/cromwell/blob/5f48ded459574c059a1ddbedb9b7df248da85052/cromwell.example.backends/cromwell.examples.conf#L519

I had a go at updating cromwell_noserver_slurm.conf and the run errored. I may not have updated the config correctly (it would be helpful if you could indicate specifically what needs to be added), but the error on the rerun is not related to that change anyway: rerunning with the unchanged cromwell_noserver_slurm.conf also gives the error below.

I assume this relates to Cromwell not shutting down correctly and the connection perhaps still being open. Is there any way to resolve this? The wheat run had completed all the main steps when the above error was raised, so I would prefer not to have to rerun everything.

Error on restarting with original config

Starting:
java -Dconfig.file=Inputs/Configs/cromwell_noserver_slurm.conf -DLOG_LEVEL=INFO -jar Inputs/Configs/cromwell.jar run -i reat_input.json -o options.json -m run_details.json /ei/software/cb/reat/dev-issue32/x86_64/lib/python3.9/site-packages/annotation/prediction_module/main.wdl
[2022-07-03 11:50:11,09] [info] Running with database db.url =
jdbc:hsqldb:file:cromwell-executions/cromwell-db/cromwell-db;
shutdown=false;
hsqldb.default_table_type=cached;hsqldb.tx=mvcc;
hsqldb.result_max_memory_rows=10000;
hsqldb.large_data=true;
hsqldb.applog=1;
hsqldb.lob_compressed=true;
hsqldb.script_format=3

[2022-07-03 11:50:12,66] [info] dataFileCache open start
[2022-07-03 11:50:12,67] [info] dataFileCache open end
[2022-07-03 11:52:12,34] [error] Failed to instantiate Cromwell System. Shutting down Cromwell.
java.sql.SQLTransientConnectionException: db - Connection is not available, request timed out after 120002ms.
at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:676)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:190)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:155)
at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:100)
at slick.jdbc.hikaricp.HikariCPJdbcDataSource.createConnection(HikariCPJdbcDataSource.scala:14)
at slick.jdbc.JdbcBackend$BaseSession.<init>(JdbcBackend.scala:494)
at slick.jdbc.JdbcBackend$DatabaseDef.createSession(JdbcBackend.scala:46)
at slick.jdbc.JdbcBackend$DatabaseDef.createSession(JdbcBackend.scala:37)
at slick.basic.BasicBackend$DatabaseDef.acquireSession(BasicBackend.scala:250)
at slick.basic.BasicBackend$DatabaseDef.acquireSession$(BasicBackend.scala:249)
at slick.jdbc.JdbcBackend$DatabaseDef.acquireSession(JdbcBackend.scala:37)
at slick.basic.BasicBackend$DatabaseDef$$anon$3.run(BasicBackend.scala:275)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
at java.base/java.lang.Thread.run(Thread.java:844)
Unhandled errors, please report this as an issue to add support for improved messages and suggestions for actions on how to resolve it

Done in 0:02:11.756818

swarbred commented 2 years ago

The second issue above (the connection timeout on restart) was resolved by increasing the timeout in the Cromwell conf to

connectionTimeout = 360000

I also changed the config to include

  services {
    MetadataService {
    metadata-read-row-number-safety-threshold = 4000000
    }
  }

to hopefully resolve the original issue.

swarbred commented 2 years ago

My mistake, this is not resolved :-( (I was looking at the wrong run)

swarbred commented 2 years ago

Hopefully corrected now, with the setting nested under a config block:

  services {
    MetadataService {
      config {
        metadata-read-row-number-safety-threshold = 4000000
      }
    }
  }
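
For anyone hitting the same two errors, here is a rough sketch of how both settings from this thread might sit together in one Cromwell conf. The values are the ones reported above. The placement of connectionTimeout under database.db is an assumption based on Cromwell's documented file-based HSQLDB example, since the actual contents of cromwell_noserver_slurm.conf are not shown in this thread, so check it against your own file before copying.

```hocon
# Sketch only: merge these keys into the existing cromwell_noserver_slurm.conf
# rather than replacing the file.

database {
  db {
    # Assumed location of this key (not confirmed in this thread).
    # Time in ms to wait for a database connection; the failed restart
    # above gave up after roughly 120000 ms.
    connectionTimeout = 360000
  }
}

services {
  MetadataService {
    config {
      # The error reported a configured limit of 1000000 rows against
      # 2752958 rows for the wheat run, so the limit is raised above that.
      metadata-read-row-number-safety-threshold = 4000000
    }
  }
}
```

The threshold only needs to exceed the row count reported in the error message, so 4000000 leaves some headroom for this run; larger runs may need a higher value.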