comorbidity opened this issue 2 years ago
Just to rule out the obvious, do you have docker memory limits configured very low? And is this the first time running or have you had success with it before?
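Worth noting while checking this: Docker Desktop on macOS runs containers inside a VM, so the limit that matters is the VM's memory allocation, not the host's 32 GB. A minimal sketch of how to check (assumes Docker Desktop is installed and running):

```shell
# Memory available to the Docker VM, in bytes -- this is the ceiling
# for all containers, regardless of how much RAM the host has.
docker info --format '{{.MemTotal}}'

# Cross-check from inside a container:
docker run --rm alpine grep MemTotal /proc/meminfo
```

If this reports something small like 2 GiB (a common Docker Desktop default), raising it under Docker Desktop's Resources settings would be the first thing to try.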
On Fri, Sep 23, 2022, 4:11 PM AndyMc wrote:
From local machine with 32 GB memory, Mac M1:

```
$ uname -a
Darwin 21.6.0 Darwin Kernel Version 21.6.0: Wed Aug 10 14:28:23 PDT 2022; root:xnu-8020.141.5~2/RELEASE_ARM64_T6000 arm64
```

```
23 Sep 2022 20:08:11 INFO UmlsUserApprover - UMLS Account has been validated
23 Sep 2022 20:08:11 INFO JdbcConnectionFactory - Connecting to jdbc:hsqldb:file:org/apache/ctakes/dictionary/lookup/fast/snorx_2021aa/snorx_2021aa:
23 Sep 2022 20:08:11 INFO ENGINE - open start - state not modified
..............................................
23 Sep 2022 20:08:26 FATAL ENGINE - readExistingData failed 849823 java.lang.OutOfMemoryError: GC overhead limit exceeded
	at org.hsqldb.RowAVL.setNewNodes(Unknown Source)
	at org.hsqldb.RowAVL.<init>(Unknown Source)
	at org.hsqldb.persist.RowStoreAVLMemory.getNewCachedObject(Unknown Source)
	at org.hsqldb.Table.insertData(Unknown Source)
	at org.hsqldb.Table.insertFromScript(Unknown Source)
	at org.hsqldb.scriptio.ScriptReaderText.readExistingData(Unknown Source)
	at org.hsqldb.scriptio.ScriptReaderBase.readAll(Unknown Source)
	at org.hsqldb.persist.Log.processScript(Unknown Source)
	at org.hsqldb.persist.Log.open(Unknown Source)
	at org.hsqldb.persist.Logger.open(Unknown Source)
	at org.hsqldb.Database.reopen(Unknown Source)
	at org.hsqldb.Database.open(Unknown Source)
	at org.hsqldb.DatabaseManager.getDatabase(Unknown Source)
	at org.hsqldb.DatabaseManager.newSession(Unknown Source)
	at org.hsqldb.jdbc.JDBCConnection.<init>(Unknown Source)
	at org.hsqldb.jdbc.JDBCDriver.getConnection(Unknown Source)
	at org.hsqldb.jdbc.JDBCDriver.connect(Unknown Source)
	at java.sql.DriverManager.getConnection(DriverManager.java:664)
	at java.sql.DriverManager.getConnection(DriverManager.java:247)
	at org.apache.ctakes.dictionary.lookup2.util.JdbcConnectionFactory.getConnection(JdbcConnectionFactory.java:85)
	at org.apache.ctakes.dictionary.lookup2.dictionary.JdbcRareWordDictionary.<init>(JdbcRareWordDictionary.java:91)
	at org.apache.ctakes.dictionary.lookup2.dictionary.JdbcRareWordDictionary.<init>(JdbcRareWordDictionary.java:72)
	at org.apache.ctakes.dictionary.lookup2.dictionary.UmlsJdbcRareWordDictionary.<init>(UmlsJdbcRareWordDictionary.java:31)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.ctakes.dictionary.lookup2.dictionary.DictionaryDescriptorParser.parseDictionary(DictionaryDescriptorParser.java:195)
	at org.apache.ctakes.dictionary.lookup2.dictionary.DictionaryDescriptorParser.parseDictionaries(DictionaryDescriptorParser.java:155)
	at org.apache.ctakes.dictionary.lookup2.dictionary.DictionaryDescriptorParser.parseDescriptor(DictionaryDescriptorParser.java:127)
	at org.apache.ctakes.dictionary.lookup2.ae.AbstractJCasTermAnnotator.initialize(AbstractJCasTermAnnotator.java:137)
	at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:267)
23 Sep 2022 20:08:26 WARN ENGINE - Script processing failure
org.hsqldb.HsqlException: error in script file line: 849823 java.lang.OutOfMemoryError: GC overhead limit exceeded
```
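For what it's worth, `GC overhead limit exceeded` means the JVM heap was exhausted while HSQLDB loaded the dictionary. So if this turns out to be a plain memory ceiling rather than an architecture problem, here is one sketch worth trying (the image name is from this thread; the flag values and the use of `JAVA_TOOL_OPTIONS` are assumptions, since the container's entrypoint isn't shown here):

```shell
# Sketch: raise both the container memory cap and the JVM max heap.
# JAVA_TOOL_OPTIONS is read by any HotSpot JVM at startup, so it takes
# effect even if the entrypoint script hard-codes its own java flags.
docker run --rm --memory=8g \
  -e JAVA_TOOL_OPTIONS='-Xmx6g' \
  smartonfhir/ctakes-covid
```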
The underlying problem is Java on the Mac M1 (ARM): it doesn't work well with this container. The only real solution is to use a different JVM, which is either a lot of work or a tiny fix.
This sounds like a different error than I remember you hitting on the M1. Do we get further along these days, or was this always the issue?
I had a pet theory about the M1 issue we had been hitting, but if it really is a memory limit issue, I don't think my theory is correct. But here it is anyway:
It might be worth trying to run an `amd64` (x86) container on the M1 rather than an `arm64` container. If you build the docker image manually, you'd get the native `arm64` build by default, and maybe its ancient JVM has issues on the M1. But if you just try the `smart-on-fhir/ctakes-covid` container, which only comes in an `amd64` variant, the x86 emulation layer might behave better than an ancient JVM running native arm code that wasn't expecting the new M1 chip.

Anyway, that's something to try: run an x86 container by installing `smart-on-fhir/ctakes-covid` from docker hub and seeing if that helps or hurts things.
From @mikix
OK, for the M1 here are some guesses based on some detective work:

- Docker on M1 will prefer the native architecture (`arm64`, or in docker terms `linux/arm64/v8` for linux images).
- And surprisingly, the openjdk used in our current cTAKES builds does support that architecture! But it's 3 years old, and maybe there's a compatibility issue with M1. I see some JDK vendors talk about specifically adding support for M1 in their JDKs, so maybe there's more to it than simply building for `linux/arm64/v8`.
- But docker doesn't know there's a compatibility issue. It sees your arm64 architecture, grabs the `linux/arm64/v8` jdk, and builds from that.
- But! It looks like there is a workaround. Just tell docker to use `amd64` anyway (M1 can apparently run `amd64` code/dockers in emulation mode, but it's slower).

So you have two options. Build like so:

```
docker build --platform linux/amd64 -t ctakes-covid ...
```

(note the `--platform linux/amd64` argument to force that version)

Or just try pulling down the image that Jamie made, which doesn't even offer an arm64 version:

```
docker pull smartonfhir/ctakes-covid
```

That should result in an `amd64` version that you can run with `docker run smartonfhir/ctakes-covid ...`, albeit slowly.

Not sure my detective work is right, but that might be the shape of it.
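A quick way to confirm which variant actually got pulled (a sketch, assuming the `smartonfhir/ctakes-covid` image from Docker Hub mentioned in this thread):

```shell
# Force the x86 variant and confirm what was pulled.
docker pull --platform linux/amd64 smartonfhir/ctakes-covid
docker image inspect --format '{{.Os}}/{{.Architecture}}' smartonfhir/ctakes-covid
```

On an M1, seeing `linux/amd64` here means the container will run under emulation rather than as native arm64 code, which is exactly the workaround described above.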