IHTSDO / snowstorm

Scalable SNOMED CT Terminology Server using Elasticsearch
Other
208 stars 83 forks source link

HTTP 413 Timeout when attempting file import #121

Closed BillyYarosh-WellSky closed 4 years ago

BillyYarosh-WellSky commented 4 years ago

Hi team, very new to SNOWMED CT and attempting to run an import job. I run into an error on a specific concept. It looks like a rollback due to a too large request entity. Running this through docker compose currently.

snowstorm        | 2020-05-19 17:17:15.540 ERROR 1 --- [pool-2-thread-3] io.kaicode.elasticvc.api.BranchService   : Commit commitListener threw IllegalStateException, rolling back commit 1589905669159 on branch MAIN
snowstorm        |
snowstorm        | java.lang.IllegalStateException: Failed to update semantic index. Loop found in transitive closure for concept 111453004 on branch MAIN.
snowstorm        |  at org.snomed.snowstorm.core.data.services.SemanticIndexUpdateService.preCommitCompletion(SemanticIndexUpdateService.java:81)
snowstorm        |  at io.kaicode.elasticvc.api.BranchService.completeCommit(BranchService.java:378)
snowstorm        |  at io.kaicode.elasticvc.domain.Commit.close(Commit.java:58)
snowstorm        |  at org.snomed.snowstorm.core.rf2.rf2import.ImportComponentFactoryImpl.completeImportCommit(ImportComponentFactoryImpl.java:217)
snowstorm        |  at org.snomed.snowstorm.core.rf2.rf2import.FullImportComponentFactoryImpl.loadingReleaseDeltaFinished(FullImportComponentFactoryImpl.java:37)
snowstorm        |  at org.ihtsdo.otf.snomedboot.ReleaseImporter$ImportRun.doLoadReleaseFiles(ReleaseImporter.java:215)
snowstorm        |  at org.ihtsdo.otf.snomedboot.ReleaseImporter$ImportRun.doLoadReleaseFiles(ReleaseImporter.java:178)
snowstorm        |  at org.ihtsdo.otf.snomedboot.ReleaseImporter$ImportRun.access$100(ReleaseImporter.java:164)
snowstorm        |  at org.ihtsdo.otf.snomedboot.ReleaseImporter.loadFullReleaseFiles(ReleaseImporter.java:40)
snowstorm        |  at org.ihtsdo.otf.snomedboot.ReleaseImporter.loadFullReleaseFiles(ReleaseImporter.java:62)
snowstorm        |  at org.snomed.snowstorm.core.rf2.rf2import.ImportService.importArchive(ImportService.java:122)
snowstorm        |  at org.snomed.snowstorm.core.rf2.rf2import.ImportService.lambda$importArchiveAsync$1(ImportService.java:161)
snowstorm        |  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
snowstorm        |  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
snowstorm        |  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
snowstorm        |  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
snowstorm        |  at java.lang.Thread.run(Thread.java:748)
snowstorm        | Caused by: org.snomed.snowstorm.core.data.services.transitiveclosure.GraphBuilderException: Loop found in transitive closure for concept 111453004 on branch MAIN.
snowstorm        |  at org.snomed.snowstorm.core.data.services.transitiveclosure.Node.getTransitiveClosure(Node.java:29)
snowstorm        |  at org.snomed.snowstorm.core.data.services.SemanticIndexUpdateService.updateSemanticIndex(SemanticIndexUpdateService.java:276)
snowstorm        |  at org.snomed.snowstorm.core.data.services.SemanticIndexUpdateService.updateStatedAndInferredSemanticIndex(SemanticIndexUpdateService.java:107)
snowstorm        |  at org.snomed.snowstorm.core.data.services.SemanticIndexUpdateService.preCommitCompletion(SemanticIndexUpdateService.java:79)
snowstorm        |  ... 16 common frames omitted
snowstorm        |
snowstorm        | 2020-05-19 17:17:15.545  INFO 1 --- [pool-2-thread-3] io.kaicode.elasticvc.api.BranchService   : Rolling back commit on MAIN started at 1589905669159
snowstorm        | 2020-05-19 17:21:04.002 ERROR 1 --- [pool-2-thread-3] o.s.s.core.rf2.rf2import.ImportService   : Failed RF2 FULL import on branch MAIN. ID 7b5cf660-4329-42c2-8845-db2d9e07fb74
snowstorm        |
snowstorm        | org.elasticsearch.ElasticsearchStatusException: Unable to parse response body
snowstorm        |  at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:538)
snowstorm        |  at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:441)
snowstorm        |  at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:414)
snowstorm        |  at org.springframework.data.elasticsearch.rest.ESRestHighLevelClient.performRequestAndParseEntity(ESRestHighLevelClient.java:47)
snowstorm        |  at org.elasticsearch.client.RestHighLevelClient.bulk(RestHighLevelClient.java:229)
snowstorm        |  at org.springframework.data.elasticsearch.rest.ElasticsearchRestClient.doExecute(ElasticsearchRestClient.java:208)
snowstorm        |  at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:405)
snowstorm        |  at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:394)
snowstorm        |  at org.springframework.data.elasticsearch.rest.request.SingleIndexBulkRequestBuilder.execute(SingleIndexBulkRequestBuilder.java:58)
snowstorm        |  at org.springframework.data.elasticsearch.core.ElasticsearchTemplate.delete(ElasticsearchTemplate.java:772)
snowstorm        |  at io.kaicode.elasticvc.api.BranchService.rollbackCommit(BranchService.java:460)
snowstorm        |  at io.kaicode.elasticvc.api.BranchService.completeCommit(BranchService.java:384)
snowstorm        |  at io.kaicode.elasticvc.domain.Commit.close(Commit.java:58)
snowstorm        |  at org.snomed.snowstorm.core.rf2.rf2import.ImportComponentFactoryImpl.completeImportCommit(ImportComponentFactoryImpl.java:217)
snowstorm        |  at org.snomed.snowstorm.core.rf2.rf2import.FullImportComponentFactoryImpl.loadingReleaseDeltaFinished(FullImportComponentFactoryImpl.java:37)
snowstorm        |  at org.ihtsdo.otf.snomedboot.ReleaseImporter$ImportRun.doLoadReleaseFiles(ReleaseImporter.java:215)
snowstorm        |  at org.ihtsdo.otf.snomedboot.ReleaseImporter$ImportRun.doLoadReleaseFiles(ReleaseImporter.java:178)
snowstorm        |  at org.ihtsdo.otf.snomedboot.ReleaseImporter$ImportRun.access$100(ReleaseImporter.java:164)
snowstorm        |  at org.ihtsdo.otf.snomedboot.ReleaseImporter.loadFullReleaseFiles(ReleaseImporter.java:40)
snowstorm        |  at org.ihtsdo.otf.snomedboot.ReleaseImporter.loadFullReleaseFiles(ReleaseImporter.java:62)
snowstorm        |  at org.snomed.snowstorm.core.rf2.rf2import.ImportService.importArchive(ImportService.java:122)
snowstorm        |  at org.snomed.snowstorm.core.rf2.rf2import.ImportService.lambda$importArchiveAsync$1(ImportService.java:161)
snowstorm        |  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
snowstorm        |  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
snowstorm        |  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
snowstorm        |  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
snowstorm        |  at java.lang.Thread.run(Thread.java:748)
snowstorm        |  Suppressed: java.lang.IllegalStateException: Elasticsearch didn't return the [Content-Type] header, unable to parse response body
snowstorm        |      at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:551)
snowstorm        |      at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:534)
snowstorm        |      ... 26 common frames omitted
snowstorm        | Caused by: org.elasticsearch.client.ResponseException: method [POST], host [http://es:9200], URI [/member/_bulk?timeout=1m], status line [HTTP/1.1 413 Request Entity Too Large]
snowstorm        |
snowstorm        |  at org.elasticsearch.client.RestClient$1.completed(RestClient.java:355)
snowstorm        |  at org.elasticsearch.client.RestClient$1.completed(RestClient.java:344)
snowstorm        |  at org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:123)
snowstorm        |  at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:181)
snowstorm        |  at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:439)
snowstorm        |  at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:329)
snowstorm        |  at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)
snowstorm        |  at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
snowstorm        |  at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
snowstorm        |  at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
snowstorm        |  at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
snowstorm        |  at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
snowstorm        |  at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
snowstorm        |  at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
snowstorm        |  at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
snowstorm        |  at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)
snowstorm        |  ... 1 common frames omitted
rorydavidson commented 4 years ago

Hi @BillyYarosh-WellSky. Could you please tell us what version/edition you are trying to import?

BillyYarosh-WellSky commented 4 years ago

The most recent from this month. SnomedCT_USEditionRF2_PRODUCTION_20200301T120000Z

Adding some of the backtrace as well.

snowstorm        | 2020-05-19 17:16:58.902  WARN 1 --- [/O dispatcher 1] org.elasticsearch.client.RestClient      : request [GET http://es:9200/concept/concept/_search?typed_keys=true&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&search_type=dfs_query_then_fetch&batched_reduce_size=512] returned 1 warnings: [299 Elasticsearch-6.8.3-0c48c0e "Deprecated: the number of terms [290354] used in the Terms Query request has exceeded the allowed maximum of [65536]. This maximum can be set by changing the [index.max_terms_count] index level setting."]
elasticsearch    | [2020-05-19T17:16:58,999][WARN ][o.e.d.i.q.TermsQueryBuilder] [3fo_RQU] Deprecated: the number of terms [290354] used in the Terms Query request has exceeded the allowed maximum of [65536]. This maximum can be set by changing the [index.max_terms_count] index level setting.
snowstorm        | 2020-05-19 17:16:59.912  WARN 1 --- [/O dispatcher 1] org.elasticsearch.client.RestClient      : request [GET http://es:9200/concept/concept/_search?typed_keys=true&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&scroll=60000ms&search_type=query_then_fetch&batched_reduce_size=512] returned 1 warnings: [299 Elasticsearch-6.8.3-0c48c0e "Deprecated: the number of terms [290354] used in the Terms Query request has exceeded the allowed maximum of [65536]. This maximum can be set by changing the [index.max_terms_count] index level setting."]
snowstorm        | 2020-05-19 17:17:15.281  WARN 1 --- [pool-2-thread-3] o.s.s.c.d.s.SemanticIndexUpdateService   : The following concepts have been referred to in relationships but are missing or inactive: {157265008, 2837008, 63479002, 329497002, 65502005, 158094009, 156073000, 268460000, 238951005, 52988006, 302835009, 149213005, 193462001, 133858001, 113218004, 230456007, 187633009, 270516002, 240197007, 253549006, 106763003, 81701001, 150062003, 107280007, 234975001, 109993000, 156072005, 203045001, 128623006, 203082005, 255637000, 333032003, 201101007, 161891005, 239107007, 195042002, 155441006, 200722003, 150617003, 11953005, 195742007, 284009009, 54819005, 21267007, 282831004, 273945008, 154283005, 273951003, 1033009, 255632006, 234350007, 354034000, 366979004, 324936002, 241047004, 156370009, 211364004, 193982009, 109501002, 123250005, 155597006, 105472004, 156410002, 17775009, 122456005, 273948005, 154826009, 134209002}
elasticsearch    | [2020-05-19T17:17:15,458][WARN ][o.e.d.i.q.TermsQueryBuilder] [3fo_RQU] Deprecated: the number of terms [287731] used in the Terms Query request has exceeded the allowed maximum of [65536]. This maximum can be set by changing the [index.max_terms_count] index level setting.
snowstorm        | 2020-05-19 17:17:15.526  WARN 1 --- [/O dispatcher 1] org.elasticsearch.client.RestClient      : request [GET http://es:9200/semantic/queryconcept/_search?typed_keys=true&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&scroll=60000ms&search_type=query_then_fetch&batched_reduce_size=512] returned 1 warnings: [299 Elasticsearch-6.8.3-0c48c0e "Deprecated: the number of terms [287731] used in the Terms Query request has exceeded the allowed maximum of [65536]. This maximum can be set by changing the [index.max_terms_count] index level setting."]
snowstorm        | 2020-05-19 17:17:15.528  INFO 1 --- [pool-2-thread-3] o.snomed.snowstorm.core.util.TimerUtil   : Timer TC index inferred: Collect existingDescendants from QueryConcept. took 17.577 seconds
rorydavidson commented 4 years ago

Thanks... and what was the import config that you used ?

BillyYarosh-WellSky commented 4 years ago

POST /imports/

{
  "branchPath": "MAIN",
  "createCodeSystemVersion": true,
  "type": "FULL"
}

POST /imports/{id}/archive.

I will note that there was also a lock on MAIN I had to remove. I had a bad file fail the day before. Destroyed the environment and rebuilt. After rebuilding I was surprised to see a lock. I assumed it was some sort of state issue and cleared the lock. This error occurred after importing again. It got pretty far along before failing.

kaicode commented 4 years ago

Hi @BillyYarosh-WellSky, thanks for reaching out.

It sounds like the deployment is in an inconsistent state. The MAIN branch will not have a lock if the environment is clean, it sounds like the indices were kept in your Elasticsearch cluster.

I recommend dropping and recreating the indicies before starting a new clean import. Use the argument --delete-indices when starting Snowstorm to have it do this. See https://github.com/IHTSDO/snowstorm/blob/master/docs/loading-snomed.md#command-line be careful not to leave that flag hanging around! 😄

Please do let us know how you get on. Cheers.

BillyYarosh-WellSky commented 4 years ago

OK, I'll take a look at this again today. Will report back.

BillyYarosh-WellSky commented 4 years ago

By deleting and re-creating indices, I was able to complete an import.