hapifhir / hapi-fhir

🔥 HAPI FHIR - Java API for HL7 FHIR Clients and Servers
http://hapifhir.io
Apache License 2.0

Deadlock in case of concurrent users #1819

Open poojgupt opened 4 years ago

poojgupt commented 4 years ago

Hello,

I recently started using the FHIR server and was trying to ingest bundles of 10 resources through JMeter (number of users = 5, loop count = 50), and started seeing a deadlock exception on the console.

[image: deadlock]

On analyzing further, the deadlock was happening on the HFJ_FORCED_ID table, where it checks whether the ResourceType + Forced_ID already exists. If not, it marks the resources for creation.

All the resources in my bundle use the PUT method with their own IDs (logical identifiers). I had already gone through the Google Group thread https://groups.google.com/forum/#!msg/hapi-fhir/lLuRnQZoaU8/0hl9Qg7WAgAJ, where it was suggested to use the READ_UNCOMMITTED isolation level. Is there any other solution?

Any help/suggestions/feedback would be appreciated.

jamesagnew commented 4 years ago

Can you try this with current snapshot builds of HAPI FHIR 5.0.0 to see if this is resolved?

poojgupt commented 4 years ago

Yes, I did try with 5.0.0-SNAPSHOT today and no relief there. I am still getting the exception below:

Failed to call access method: org.springframework.dao.CannotAcquireLockException: could not execute query; SQL [select forcedid0_.RESOURCE_PID as col_0_0_ from HFJ_FORCED_ID forcedid0_ where forcedid0_.RESOURCE_TYPE=? and (forcedid0_.FORCED_ID in (?))]; nested exception is org.hibernate.exception.LockAcquisitionException: could not execute query...

I was trying to insert 2500 resources in total. With 5.0.0-SNAPSHOT I could see 710 resources ingested into SQL Server, and with 4.2.0, 750 resources.

jamesagnew commented 4 years ago

Are you able to work out what is the minimal set of input data required in order to reproduce this?

poojgupt commented 4 years ago

Attaching the Jmeter script to reproduce the same. I am simply running the script on my Windows Laptop + SQL Server 2014 + Using hapi-fhir-jpaserver-starter module [4.2.0]

jmeter_fhir-10Resources-PUT-generateIDs.zip https://github.com/jamesagnew/hapi-fhir/files/4558328/jmeter_fhir-10Resources-PUT-generateIDs.zip

The script tests 5 concurrent users with a loop count of 50 per user, which means that at the end of the script I should see 2500 resource entries.

Configuration done in hapi.properties to connect to SQL Server:

datasource.driver=com.microsoft.sqlserver.jdbc.SQLServerDriver
datasource.url=jdbc:sqlserver://localhost;databaseName=fhir
datasource.username=sa
datasource.password=admin
hibernate.dialect=org.hibernate.dialect.SQLServer2012Dialect

Lucene is disabled and the connection pool uses default settings. No other changes were made in the code.

I am sure we should be able to reproduce this easily.

jamesagnew commented 4 years ago

FYI- I don't actually use JMeter and don't have an environment to run this in. If you're able to break this down to "uploading a bundle containing X at the same time as a second bundle containing Y will trigger this", that definitely improves the chances of this being looked at in the foreseeable future.


poojgupt commented 4 years ago

Yes, that understanding of the issue is right. Uploading a bundle with X entries (using the HTTP request method PUT; I did not try POST) at the same time as a second bundle with Y entries causes the deadlock with concurrent threads.

jamesagnew commented 4 years ago

I'm sorry, maybe I'm not being clear. I don't have any understanding, that's what I'm saying.

I am sure that some permutation of X+Y causes an issue, but I'm asking if you can figure out what is the minimum thing that X and Y need to be in order to reproduce this.

We have loads of unit and integration tests that fire tons of data concurrently in all kinds of ways, so there is no fundamental issue where HAPI can't handle concurrent operations. I would need to know what is specific to your use case that triggers this.

poojgupt commented 4 years ago

Sorry about the confusion. This is the data [10 resources in a bundle] that I am uploading with an HTTP POST from the client.

request.data.zip https://github.com/jamesagnew/hapi-fhir/files/4562452/request.data.zip

To reproduce this issue, upload the attached data with 5 concurrent users, each user ingesting the data 50 times. That means 1 user = 500 records/resources in the DB, multiplied by 5 users = 500 x 5 = 2500 records.
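The load pattern described above can be sketched without JMeter. The following is only an illustration under stated assumptions: the class name `ConcurrentUploadSketch` and the `uploadBundle` callback are hypothetical, and the callback stands in for whatever code actually posts one 10-resource bundle to the server. With 5 users and 50 loops it performs 5 x 50 = 250 uploads, which at 10 resources per bundle gives the 2500 resources mentioned in the thread.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical load driver mirroring the JMeter plan: N users, each looping M times. */
class ConcurrentUploadSketch {

    /**
     * Starts `users` threads; each one calls `uploadBundle` `loops` times.
     * Returns how many uploads failed with a runtime exception
     * (for example a lock acquisition failure surfaced by the client code).
     */
    static int run(int users, int loops, Runnable uploadBundle) {
        ExecutorService pool = Executors.newFixedThreadPool(users);
        AtomicInteger failures = new AtomicInteger();
        for (int u = 0; u < users; u++) {
            pool.submit(() -> {
                for (int i = 0; i < loops; i++) {
                    try {
                        uploadBundle.run(); // assumption: posts one bundle to the server
                    } catch (RuntimeException e) {
                        failures.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.MINUTES)) {
                pool.shutdownNow();
                throw new IllegalStateException("uploads did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for uploads", e);
        }
        return failures.get();
    }
}
```

Calling `ConcurrentUploadSketch.run(5, 50, uploader)` with a real uploader would approximate the JMeter run; reducing `users` and `loops` step by step is one way to look for the smallest combination that still fails.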

Some extra information that may help in understanding the request data:

If we open this data XML, all the data stays essentially the same across runs except the logical identifiers of the resources. The logical identifiers are dynamically generated for each run.

<Patient xmlns="http://hl7.org/fhir">
            <id value="${PATIENT_ID}" />

E.g. in the above snippet of the request data, PATIENT_ID is generated dynamically at run time, and we concatenate this same ID into the IDs needed for the other resources in the bundle. E.g. for Observation, the ID is 39156- followed by the patient ID:

        <id value="39156-${PATIENT_ID}" />

This is the logic that we use to generate the identifiers, so only the PATIENT_ID needs to be generated dynamically.
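The ID scheme described above can be sketched as follows. This is an illustration only: the class `BundleIdSketch` and its method names are hypothetical, and using a random UUID for the patient ID is an assumption (the thread generates it through a JMeter variable and does not say how).

```java
import java.util.UUID;

/** Hypothetical helper reproducing the ID scheme: one patient ID, other IDs derived from it. */
class BundleIdSketch {

    /** Assumption: a random UUID stands in for the JMeter-generated ${PATIENT_ID}. */
    static String newPatientId() {
        return UUID.randomUUID().toString();
    }

    /** Derives a dependent resource ID by prefixing the patient ID, e.g. "39156-<patientId>". */
    static String derivedId(String prefix, String patientId) {
        return prefix + "-" + patientId;
    }

    /** Renders the id element in the form used by the request data. */
    static String idElement(String id) {
        return "<id value=\"" + id + "\" />";
    }
}
```

For one bundle, `newPatientId()` would be called once and `derivedId("39156", patientId)` used for the Observation, so every resource ID in that bundle differs from those in any other concurrently uploaded bundle.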

jamesagnew commented 4 years ago

Are those patient IDs guaranteed to be unique for each individual upload, or is there a chance that multiple concurrent uploads are uploading the same Patient ID?

Also, does removing all of the ifNoneExist sections from the elements have any effect?

poojgupt commented 4 years ago

Yes, the patient IDs are definitely unique as we are using Jmeter and it does its job correctly. Plus I have also verified the logs for any duplicates.
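A duplicate check of this kind can be sketched in a few lines. The class `DuplicateIdCheckSketch` is hypothetical, and it assumes the patient IDs have already been extracted from the logs into a list; how that extraction was done is not stated in the thread.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical check that no patient ID was used by more than one upload. */
class DuplicateIdCheckSketch {

    /** Returns every ID that occurs more than once, in order of first repetition. */
    static Set<String> findDuplicates(List<String> ids) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new LinkedHashSet<>();
        for (String id : ids) {
            if (!seen.add(id)) { // add() returns false when the ID was already seen
                duplicates.add(id);
            }
        }
        return duplicates;
    }
}
```

An empty result means each upload used a distinct patient ID, which rules out two concurrent transactions writing the same logical ID.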

I can try removing the ifNoneExist sections and report back on whether it has any impact.

poojgupt commented 4 years ago

Apparently, ifNoneExist did not have any effect.

realizm commented 8 months ago

Same case here: HAPI FHIR Server 6.4.0, Postgres 13.