IHTSDO / snowstorm

Scalable SNOMED CT Terminology Server using Elasticsearch

Changed behaviour with 7.5.4 (ECL-cache bug when combining ecl and conceptIds parameters) #358

Closed peterdutey closed 2 years ago

peterdutey commented 2 years ago

Hello

I've been looking at release 7.5.4 and one of my regression tests went off.

I've used findConcepts a lot in the past, combining the conceptIds and ecl parameters to determine whether a concept belongs to the set of concepts returned by an ECL expression. This has proven useful for checking whether one concept is a subtype of another.

Now things have changed with 7.5.4, and I'm not too sure why (MAINT-1790?). Please see the example below.

Many thanks in advance for any pointers and a happy new year! Peter!

Here's an example of testing whether 233604007|Pneumonia| is a subtype of 404684003|Clinical finding|

On version 7.1.2

(at time of writing - I hope they don't update too soon) the combination below only returned a single concept and all is good.

https://snomednz.digital.health.nz/MAIN/concepts?ecl=%3C404684003&conceptIds=233604007&offset=0&limit=50

{
  "items" : [ {
    "conceptId" : "233604007",
    "active" : true,
    "definitionStatus" : "FULLY_DEFINED",
    "moduleId" : "900000000000207008",
    "effectiveTime" : "20150131",
    "fsn" : {
      "term" : "Pneumonia (disorder)",
      "lang" : "en"
    },
    "pt" : {
      "term" : "Pneumonia",
      "lang" : "en"
    },
    "id" : "233604007",
    "idAndFsnTerm" : "233604007 | Pneumonia (disorder) |"
  } ],
  "total" : 1,
  "limit" : 50,
  "offset" : 0,
  "searchAfter" : "WzIzMzYwNDAwN10=",
  "searchAfterArray" : [ 233604007 ]
}
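The membership check illustrated by this request and response can be sketched in Python as follows. This is only an illustration of the pattern described in this issue, not code from Snowstorm or from my test suite: the helper names `membership_url` and `is_member` are made up for the example, and the response is parsed from a string so that no network call is needed.

```python
import json
from urllib.parse import urlencode


def membership_url(base_url, branch, ecl, concept_id, limit=50):
    """Build a findConcepts URL that combines an ECL expression with one concept ID.

    Hypothetical helper: the parameter names (ecl, conceptIds, offset, limit)
    are the ones used in the URLs quoted in this issue.
    """
    query = urlencode(
        {"ecl": ecl, "conceptIds": concept_id, "offset": 0, "limit": limit}
    )
    return f"{base_url}/{branch}/concepts?{query}"


def is_member(response_body, concept_id):
    """Return True if the JSON response lists concept_id among its items."""
    payload = json.loads(response_body)
    return any(
        item.get("conceptId") == concept_id for item in payload.get("items", [])
    )


# Is 233604007 |Pneumonia| a descendant of 404684003 |Clinical finding|?
url = membership_url(
    "https://snomednz.digital.health.nz", "MAIN", "<404684003", "233604007"
)
```

A response whose `items` array contains the concept means the concept is in the ECL result set; an empty `items` array means it is not.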

Alternative endpoint - version 6.2.1 at time of writing

https://snowstorm.msal.gov.ar/MAIN/concepts?ecl=%3C404684003&conceptIds=233604007&offset=0&limit=50

{
  "items" : [ {
    "conceptId" : "233604007",
    "active" : true,
    "definitionStatus" : "FULLY_DEFINED",
    "moduleId" : "900000000000207008",
    "effectiveTime" : "20150131",
    "fsn" : {
      "term" : "Pneumonia (disorder)",
      "lang" : "en"
    },
    "pt" : {
      "term" : "Pneumonia",
      "lang" : "en"
    },
    "id" : "233604007"
  } ],
  "total" : 1,
  "limit" : 50,
  "offset" : 0,
  "searchAfter" : "WzIzMzYwNDAwN10=",
  "searchAfterArray" : [ 233604007 ]
}

On 7.5.4 it is failing

https://browser.ihtsdotools.org/snowstorm/snomed-ct/MAIN/concepts?ecl=%3C404684003&conceptIds=233604007&offset=0&limit=50

<Map>
<error>INTERNAL_SERVER_ERROR</error>
<message>
Elasticsearch exception [type=search_phase_execution_exception, reason=all shards failed]; nested exception is ElasticsearchStatusException[Elasticsearch exception [type=search_phase_execution_exception, reason=all shards failed]]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=Result window is too large, from + size must be less than or equal to: [10000] but was [65000]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.]]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=Result window is too large, from + size must be less than or equal to: [10000] but was [65000]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.]];
</message>
</Map>
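Note that the rejected window of 65000 does not match the limit=50 sent in the URL. For a regression test that needs to tell this particular failure apart from other server errors, a minimal sketch is to pull the two numbers out of the error message. The function name `result_window_overflow` and the regular expression are assumptions made for this example; they rely only on the wording of the Elasticsearch message quoted above.

```python
import re

# Matches the wording of the Elasticsearch message quoted in this issue.
WINDOW_ERROR = re.compile(r"less than or equal to: \[(\d+)\] but was \[(\d+)\]")


def result_window_overflow(message):
    """Return (allowed, requested) if the message reports a result window
    overflow, otherwise None. Hypothetical helper for illustration only."""
    match = WINDOW_ERROR.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = (
    "Result window is too large, from + size must be less than or equal to: "
    "[10000] but was [65000]."
)
overflow = result_window_overflow(sample)
if overflow is not None:
    allowed, requested = overflow
    print(f"requested window {requested} exceeds the allowed {allowed}")
```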

It sometimes works after raising the limit to 9000... but only sometimes!

https://browser.ihtsdotools.org/snowstorm/snomed-ct/MAIN/concepts?ecl=%3C404684003&conceptIds=233604007&offset=0&limit=9000

{
  "items" : [ {
    "conceptId" : "233604007",
    "active" : true,
    "definitionStatus" : "FULLY_DEFINED",
    "moduleId" : "900000000000207008",
    "effectiveTime" : "20150131",
    "fsn" : {
      "term" : "Pneumonia (disorder)",
      "lang" : "en"
    },
    "pt" : {
      "term" : "Pneumonia",
      "lang" : "en"
    },
    "id" : "233604007",
    "idAndFsnTerm" : "233604007 | Pneumonia (disorder) |"
  } ],
  "total" : 1,
  "limit" : 9000,
  "offset" : 0,
  "searchAfter" : "WzIzMzYwNDAwN10=",
  "searchAfterArray" : [ 233604007 ]
}

kaicode commented 2 years ago

Hi @peterdutey, happy new year to you!

I'm sorry you are seeing an issue with the latest Snowstorm release, thank you for finding and reporting this.

I confirm that this is a bug. It is related to the new ECL cache feature. I am finding that instances with the ECL cache disabled do not have this issue. The ECL cache can be disabled via configuration as a workaround until this is fixed:

cache.ecl.enabled=false

Ref https://github.com/IHTSDO/snowstorm/blob/7.5.4/src/main/resources/application.properties#L134

We will get this fixed and commit against this bug ticket.

peterdutey commented 2 years ago

Hi @kaicode, many thanks for explaining! I'll be keeping an eye on the fix. Thank you, Peter

kaicode commented 2 years ago

This is fixed in the latest release (7.6.0).