IHTSDO / snowstorm

Scalable SNOMED CT Terminology Server using Elasticsearch

Loading data: Failed UK Monolith RF2 SNAPSHOT import on branch MAIN (Resolved) #600

Open Dylan-c-93 opened 3 months ago

Dylan-c-93 commented 3 months ago

Hi,

I'm having issues when running the first load of data into snowstorm.

Using: sudo java -Xms8g -Xmx16g -jar snowstorm-jar.10.3.1.jar --delete-indices --debug --elasticsearch.index.max.terms.count=1000000 --import=/home/ubuntu/snowstorm/data/monolith.zip

The load errors on "o.s.s.core.rf2.rf2import.ImportService : Failed RF2 SNAPSHOT import on branch MAIN".

I've also tried following the guidance helpfully shared here (updating to Elasticsearch 8.11.1 and the latest version of Snowstorm, and adding more memory, as the load was crashing at the query concept stage), but I can't see where the issue creeps in. I suspect it's user error rather than a bug, but I thought it was worth checking here anyway.

I've attached the debug output: Snowstorm_Evaluation_Report.txt

Thanks in advance.

Update:

Error Message:

org.springframework.data.elasticsearch.UncategorizedElasticsearchException: [es/search] failed: [search_phase_execution_exception] all shards failed
    at org.springframework.data.elasticsearch.client.elc.ElasticsearchExceptionTranslator.translateExceptionIfPossible(ElasticsearchExceptionTranslator.java:105)
    at org.springframework.data.elasticsearch.client.elc.ElasticsearchExceptionTranslator.translateException(ElasticsearchExceptionTranslator.java:64)
    at org.springframework.data.elasticsearch.client.elc.ElasticsearchTemplate.execute(ElasticsearchTemplate.java:635)
    at org.springframework.data.elasticsearch.client.elc.ElasticsearchTemplate.searchScrollStart(ElasticsearchTemplate.java:395)
    at org.springframework.data.elasticsearch.core.AbstractElasticsearchTemplate.searchForStream(AbstractElasticsearchTemplate.java:426)
    at org.springframework.data.elasticsearch.core.AbstractElasticsearchTemplate.searchForStream(AbstractElasticsearchTemplate.java:413)
    at org.snomed.snowstorm.core.data.services.SemanticIndexUpdateService.buildRelevantPartsOfExistingGraph(SemanticIndexUpdateService.java:549)
    at org.snomed.snowstorm.core.data.services.SemanticIndexUpdateService.updateSemanticIndex(SemanticIndexUpdateService.java:206)
    at org.snomed.snowstorm.core.data.services.SemanticIndexUpdateService.updateStatedAndInferredSemanticIndex(SemanticIndexUpdateService.java:131)
    at org.snomed.snowstorm.core.data.services.SemanticIndexUpdateService.preCommitCompletion(SemanticIndexUpdateService.java:95)
    at io.kaicode.elasticvc.api.BranchService.completeCommit(BranchService.java:416)
    at io.kaicode.elasticvc.domain.Commit.close(Commit.java:61)
    at org.snomed.snowstorm.core.rf2.rf2import.ImportComponentFactoryImpl.completeImportCommit(ImportComponentFactoryImpl.java:231)
    at org.snomed.snowstorm.core.rf2.rf2import.ImportComponentFactoryImpl.loadingComponentsCompleted(ImportComponentFactoryImpl.java:220)
    at org.ihtsdo.otf.snomedboot.ReleaseImporter$ImportRun.doLoadReleaseFiles(ReleaseImporter.java:251)
    at org.ihtsdo.otf.snomedboot.ReleaseImporter$ImportRun.doLoadReleaseFiles(ReleaseImporter.java:203)
    at org.ihtsdo.otf.snomedboot.ReleaseImporter.loadSnapshotReleaseFiles(ReleaseImporter.java:46)
    at org.ihtsdo.otf.snomedboot.ReleaseImporter.loadSnapshotReleaseFiles(ReleaseImporter.java:80)
    at org.snomed.snowstorm.core.rf2.rf2import.ImportService.snapshotImport(ImportService.java:233)
    at org.snomed.snowstorm.core.rf2.rf2import.ImportService.importFiles(ImportService.java:180)
    at org.snomed.snowstorm.core.rf2.rf2import.ImportService.importArchive(ImportService.java:133)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:352)
    at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:713)
    at org.snomed.snowstorm.core.rf2.rf2import.ImportService$$SpringCGLIB$$0.importArchive(<generated>)
    at org.snomed.snowstorm.SnowstormApplication.importEditionRF2FromDisk(SnowstormApplication.java:150)
    at org.snomed.snowstorm.SnowstormApplication.run(SnowstormApplication.java:110)
    at org.springframework.boot.SpringApplication.lambda$callRunner$4(SpringApplication.java:794)
    at org.springframework.util.function.ThrowingConsumer$1.acceptWithException(ThrowingConsumer.java:83)
    at org.springframework.util.function.ThrowingConsumer.accept(ThrowingConsumer.java:60)
    at org.springframework.util.function.ThrowingConsumer$1.accept(ThrowingConsumer.java:88)
    at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:806)
    at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:794)
    at org.springframework.boot.SpringApplication.lambda$callRunners$3(SpringApplication.java:782)
    at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
    at java.base/java.util.stream.SortedOps$SizedRefSortingSink.end(SortedOps.java:357)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:510)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
    at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
    at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
    at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:782)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:341)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:1358)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:1347)
    at org.snomed.snowstorm.SnowstormApplication.main(SnowstormApplication.java:63)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.springframework.boot.loader.launch.Launcher.launch(Launcher.java:91)
    at org.springframework.boot.loader.launch.Launcher.launch(Launcher.java:53)
    at org.springframework.boot.loader.launch.JarLauncher.main(JarLauncher.java:58)
Caused by: co.elastic.clients.elasticsearch._types.ElasticsearchException: [es/search] failed: [search_phase_execution_exception] all shards failed
    at co.elastic.clients.transport.ElasticsearchTransportBase.getApiResponse(ElasticsearchTransportBase.java:345)
    at co.elastic.clients.transport.ElasticsearchTransportBase.performRequest(ElasticsearchTransportBase.java:147)
    at co.elastic.clients.elasticsearch.ElasticsearchClient.search(ElasticsearchClient.java:1897)
    at org.springframework.data.elasticsearch.client.elc.ElasticsearchTemplate.lambda$searchScrollStart$17(ElasticsearchTemplate.java:395)
    at org.springframework.data.elasticsearch.client.elc.ElasticsearchTemplate.execute(ElasticsearchTemplate.java:633)
    ... 54 common frames omitted

kaicode commented 3 months ago

Thanks for reporting this.

Has anyone else in the community tried importing the UK Monolith package on the latest Snowstorm and Elasticsearch 8?

mohammada-huma commented 1 month ago

Hi, yes I did the same and it failed with the same error.

kaicode commented 1 month ago

The "all shards failed" message points towards an Elasticsearch issue. Errors and Warnings from the Elasticsearch log are needed to help debug this.
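For anyone hitting the same wall, a sketch of where to pull that information from, assuming the docker-compose setup from this repo (the container name `es` and default port are assumptions; adjust for your environment):

```shell
# Recent errors and warnings from the Elasticsearch container log:
docker logs --tail 500 es 2>&1 | grep -Ei 'error|warn'

# Cluster status and per-shard state via read-only REST endpoints;
# useful context when a search reports "all shards failed":
curl -s 'http://localhost:9200/_cluster/health?pretty'
curl -s 'http://localhost:9200/_cat/shards?v'
```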

mohammada-huma commented 1 month ago

I don't have access now, but I tested the same dataset with the old v7.2.0 and it passed that phase without any errors. It would be great if you could check UK Monolith 38.4.0 with the latest docker-compose from the repo.

kaicode commented 1 month ago

I attempted to reproduce the issue. In my case, after about an hour of loading the monolith package, Elasticsearch died with this error:

2024-08-12 18:51:00 ERROR: Elasticsearch exited unexpectedly, with exit code 137

This indicates that Elasticsearch ran out of memory. It's not clear if this error is due to the 5g memory limit on the Elasticsearch container in the docker compose file or the 6g overall Docker limit I have set on my machine.
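A side note on exit code 137: Docker reports 128 plus the signal number, so 137 means the process received SIGKILL (9), which is what the kernel OOM killer sends. A quick way to check (container name is a placeholder):

```shell
# 137 = 128 + 9 (SIGKILL): consistent with the kernel OOM killer terminating the process.
echo $((137 - 128))    # prints the signal number: 9

# Confirm whether Docker recorded an OOM kill for the container
# (replace "es" with your Elasticsearch container name):
# docker inspect --format '{{.State.OOMKilled}}' es
```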

[Screenshot attached: 2024-08-13 14:56]

I will just increase the overall Docker limit to 8g, without changing the Elasticsearch container limit, and run again.

mohammada-huma commented 1 month ago

I tried with 6 GB for Elasticsearch in Docker and Snowstorm with -Xmx8g (running Snowstorm from a local IDE connected to Elastic in Docker), i.e. 14 GB total, and I still got an out-of-memory error from Snowstorm.

In the end I managed to import the data with 26 GB total Docker memory (Elastic + Snowstorm).

Settings: java -Xms2g -Xmx16g --add-opens java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED -cp @/app/jib-classpath-file org.snomed.snowstorm.SnowstormApplication --elasticsearch.urls=http://es:9200 --elasticsearch.index.max.terms.count=2147483647

kaicode commented 1 month ago

Without fixing the minimum heap size it's hard to tell how much memory that Snowstorm process actually used when it ran out. With -Xms2g -Xmx16g we know it had 2g, because otherwise the JVM would have failed to start, but unless there was another 14g free on the machine it may not have been able to reserve much more. Running in debug mode from an IDE can also use considerably more memory. Do you happen to know how much memory Snowstorm actually used before running out while importing the UK Monolith package?

kaicode commented 1 month ago

I got the same issue, Elasticsearch exiting with the out-of-memory exit code, when Elasticsearch had 5g and Docker had 8g overall. Increasing memory for another run...

mohammada-huma commented 1 month ago

I can't say with certainty; all I know is that 8 GB for Snowstorm alone was not enough for me, so the minimum must be something more than 8 GB for the UK Monolith.

For Elastic, I used the setting "ES_JAVA_OPTS=-Xms4g -Xmx8g", which worked for me. The problem is that I don't own a 32 GB system myself; I borrowed one, so I can't test with it anymore.

kaicode commented 1 month ago

My latest import of the UK Monolith package completed.

Docker Compose changes:

services:
  elasticsearch:
  ...
    environment:
    ...
      - "ES_JAVA_OPTS=-Xms6g -Xmx6g"
    ...
    mem_reservation: 6g

  snowstorm:
  ...
    entrypoint: java -Xms4g -Xmx4g --add-opens java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED -cp @/app/jib-classpath-file org.snomed.snowstorm.SnowstormApplication --elasticsearch.urls=http://es:9200 --elasticsearch.index.max.terms.count=1000000
  ...
    mem_reservation: 4g

Log:

snowstorm      | 2024-08-13T17:02:21.492Z  INFO 1 --- [pool-3-thread-1] o.s.s.core.rf2.rf2import.ImportService   : Completed RF2 SNAPSHOT import on branch MAIN in 4788 seconds. ID 07a3e926-6636-4b2b-a975-d30a6ba99c7f

It seems to be okay with a bit more memory.

Once the import has completed the memory requirements can be reduced again. 2-4g for Elastic and the same for Snowstorm should be plenty.
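For reference, a sketch of what those reduced post-import settings might look like in the compose file. The service names mirror the snippet above, and the 3g values are an assumption picked from the 2-4g range suggested here; tune them to your data and hardware:

```yaml
services:
  elasticsearch:
    environment:
      - "ES_JAVA_OPTS=-Xms3g -Xmx3g"   # within the 2-4g post-import range suggested above
    mem_reservation: 3g

  snowstorm:
    entrypoint: java -Xms3g -Xmx3g --add-opens java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED -cp @/app/jib-classpath-file org.snomed.snowstorm.SnowstormApplication --elasticsearch.urls=http://es:9200
    mem_reservation: 3g
```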