hapifhir / hapi-fhir

🔥 HAPI FHIR - Java API for HL7 FHIR Clients and Servers
http://hapifhir.io
Apache License 2.0

Validation performance degraded #6091

Open mkaehlershs opened 2 months ago

mkaehlershs commented 2 months ago

Hello, our team noticed that validation performance has degraded heavily in v7.0.0 and later.

When posting the attached bundle to HAPI v6.10.1, the execution time is ~7 seconds. Posting the bundle a second time takes around 4 seconds, and posting it again takes less than a second.

When posting the bundle to HAPI v7.2.0, the execution time is ~20 seconds. Posting the bundle a second time takes ~13 seconds, and posting it again still takes ~13 seconds.

If we switch off validation, it takes less than a second, which indicates that the bad performance comes from validation. The attached bundle contains 100 patient resources for the us-core-patient profile. If we increase the number of patients from 100 to 200 the execution time changes as follows:

To Reproduce

Steps to reproduce the behavior:

  1. Use the HAPI Docker image version hapi:v7.2.0
  2. Configure the application.yaml file to load the hl7_fhir_us_core package in version 4.0.0
  3. Post the attached bundle
  4. Observe the execution time

bundle.json
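The timings above can be collected with a small client that posts the same bundle three times and prints the elapsed time per attempt. This is only a sketch: the base URL `http://localhost:8080/fhir`, the file name `bundle.json` in the working directory, and the class name `BundlePostTimer` are assumptions, as is the choice to post the bundle to the server base URL.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;

final class BundlePostTimer {

    /** A task that may throw, so the HTTP call can be timed directly. */
    interface Task {
        void run() throws Exception;
    }

    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(Task task) throws Exception {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        // Assumed defaults; pass the real base URL and bundle path as arguments.
        String baseUrl = args.length > 0 ? args[0] : "http://localhost:8080/fhir";
        Path bundle = Path.of(args.length > 1 ? args[1] : "bundle.json");

        HttpClient client = HttpClient.newHttpClient();
        for (int attempt = 1; attempt <= 3; attempt++) {
            HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl))
                    .header("Content-Type", "application/fhir+json")
                    .POST(HttpRequest.BodyPublishers.ofFile(bundle))
                    .build();
            // The response body is discarded; only the round-trip time matters here.
            long ms = timeMillis(() -> client.send(request, HttpResponse.BodyHandlers.discarding()));
            System.out.println("attempt " + attempt + ": " + ms + " ms");
        }
    }
}
```

Running it once against each server version gives the first, second, and third post timings in the same order as reported above.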

Environment

nigtrifork commented 2 weeks ago

Hello

We have seen the same (or similar) issue, and our hypothesis is that it is caused by a problem with caching in org.hl7.fhir.common.hapi.validation.support.CachingValidationSupport#loadFromCache.

In commit c8d6e9fb, empty query results are evicted from the cache. This was added between 6.10.1 and 7.2.0. As far as we can deduce, this means that calls to the method isValueSet in org.hl7.fhir.validation.instance.InstanceValidator with a URL that is not a ValueSet URL will result in a database lookup every time, since the result is not cached.
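To illustrate the suspected effect, here is a minimal, self-contained sketch of a cache that does not keep empty results. It is not the HAPI `CachingValidationSupport` code; the class `SkipEmptyCache`, the loader, and the example URLs are all hypothetical and only model the behavior described above.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class EmptyResultCacheDemo {

    /** Hypothetical cache that stores hits but never stores empty results. */
    static final class SkipEmptyCache<K, V> {
        private final Map<K, V> entries = new ConcurrentHashMap<>();
        private final Function<K, Optional<V>> loader;

        SkipEmptyCache(Function<K, Optional<V>> loader) {
            this.loader = loader;
        }

        Optional<V> get(K key) {
            V cached = entries.get(key);
            if (cached != null) {
                return Optional.of(cached);
            }
            // Cache miss: ask the backing store (a stand-in for the database lookup).
            Optional<V> loaded = loader.apply(key);
            // Only non-empty results are remembered, so misses repeat forever.
            loaded.ifPresent(value -> entries.put(key, value));
            return loaded;
        }
    }

    /** Returns how often the backing store is hit when the same key is requested `calls` times. */
    static int countLoads(String key, int calls) {
        AtomicInteger loads = new AtomicInteger();
        SkipEmptyCache<String, String> cache = new SkipEmptyCache<>(url -> {
            loads.incrementAndGet();
            // Pretend only URLs containing "/ValueSet/" resolve to something.
            return url.contains("/ValueSet/") ? Optional.of("value set") : Optional.empty();
        });
        for (int i = 0; i < calls; i++) {
            cache.get(key);
        }
        return loads.get();
    }
}
```

In this model a URL that resolves is loaded once, while a URL that does not resolve hits the backing store on every call, which matches the pattern of the third post not getting faster.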

As a workaround, we have made a classpath override of org.hl7.fhir.validation.instance.InstanceValidator, where we cache the result of isValueSet.
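The idea behind the workaround can be sketched as a memoizing wrapper around a boolean check, so that negative answers are remembered as well as positive ones. This is not our actual override of `InstanceValidator`; `MemoizedPredicate` is a hypothetical helper, and the sketch assumes the answer for a given URL does not change while the cache lives.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

/** Hypothetical wrapper that caches both true and false results of a predicate. */
final class MemoizedPredicate<T> implements Predicate<T> {
    // Unbounded map for brevity; a real cache would need size limits and expiry.
    private final Map<T, Boolean> results = new ConcurrentHashMap<>();
    private final Predicate<T> delegate;

    MemoizedPredicate(Predicate<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean test(T input) {
        // Boolean.FALSE is a non-null value, so negative answers are stored too.
        return results.computeIfAbsent(input, delegate::test);
    }
}
```

Wrapping the expensive check this way means each distinct URL triggers at most one lookup, at the cost of serving stale answers if a ValueSet with that URL is added later.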