Description
When performing multiple parallel FHIR queries (e.g., search operations) that are initiated around the same time, the number of entries returned in the response bundles is inconsistent across queries. The issue is more prevalent under heavy parallelized load, where queries executed nearly simultaneously return different numbers of entries, despite the expected result being the same.
To Reproduce
Steps to reproduce the behavior:
1. Execute multiple identical FHIR queries in parallel.
2. Ensure the queries are initiated at almost the same time (e.g., using multiple threads or processes).
3. Check the number of entries returned in the resulting bundles.
4. Observe that the number of entries is inconsistent across the responses. (This does not always happen; the issue is hard to reproduce.)
I requested all Patient resources in pages of 100 records each, using a script to continuously request data until the next URL is no longer provided in the response. This process was automated to run in parallel across multiple threads, each executing identical requests concurrently.
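The reproduction described above can be sketched roughly as follows. This is not the original script, only a minimal illustration using the Python standard library. The function names, the default of 7 worker threads, and the injectable `fetch` parameter are assumptions made for this example. It relies only on the standard FHIR Bundle JSON layout (`entry` array, `link` array with `relation` and `url`).

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_json(url):
    """GET a URL and parse the response body as a JSON FHIR Bundle."""
    request = urllib.request.Request(url, headers={"Accept": "application/fhir+json"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def next_link(bundle):
    """Return the URL of the bundle's 'next' link, or None on the last page."""
    for link in bundle.get("link", []):
        if link.get("relation") == "next":
            return link.get("url")
    return None


def count_entries(start_url, fetch=fetch_json):
    """Follow 'next' links starting at start_url and count every entry seen."""
    total = 0
    url = start_url
    while url:
        bundle = fetch(url)
        total += len(bundle.get("entry", []))
        url = next_link(bundle)
    return total


def run_parallel(start_url, workers=7, fetch=fetch_json):
    """Run the same paged search in `workers` threads; return one total per thread."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(count_entries, start_url, fetch) for _ in range(workers)]
        return [future.result() for future in futures]
```

Calling `run_parallel("<base-url>/Patient?_count=100")` against the server under test (the base URL is a placeholder) returns one total per thread. When the problem occurs, the totals are not all equal.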
Expected behavior
I expect that all identical FHIR queries submitted simultaneously return the same number of records in their response bundles, regardless of parallelization or concurrency.
Screenshots
Below is the output of the following query, executed on the database:
SELECT * FROM public.hfj_search ORDER BY pid;
| pid | created | search_deleted | expiry_or_null | failure_code | failure_message | last_updated_high | last_updated_low | num_blocked | num_found | preferred_page_size | resource_id | resource_type | search_param_map | search_query_string | search_query_string_hash | search_type | search_status | total_count | search_uuid | optlock_version | search_query_string_vc | search_param_map_bin |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 12642 | 2024-09-24 11:33:11.082 | False | NULL | NULL | NULL | NULL | NULL | 0 | 2772 | 100 | NULL | Patient | NULL | NULL | -1676697053 | 1 | FINISHED | 2772 | 1f9d8280-a643-4652-8b9d-3c45994524f0 | 7 | ?_count=100 | binary data |
| 12643 | 2024-09-24 11:33:11.083 | False | NULL | NULL | NULL | NULL | NULL | 0 | 2358 | 100 | NULL | Patient | NULL | NULL | -1676697053 | 1 | FINISHED | 2358 | 66176b7f-6e1b-4270-bc55-1949db287367 | 7 | ?_count=100 | binary data |
| 12644 | 2024-09-24 11:33:11.084 | False | NULL | NULL | NULL | NULL | NULL | 0 | 2358 | 100 | NULL | Patient | NULL | NULL | -1676697053 | 1 | FINISHED | 2358 | 8c1d5ed5-5d0d-4ced-9027-c2124559b745 | 7 | ?_count=100 | binary data |
| 12645 | 2024-09-24 11:33:11.085 | False | NULL | NULL | NULL | NULL | NULL | 0 | 2403 | 100 | NULL | Patient | NULL | NULL | -1676697053 | 1 | FINISHED | 2403 | aaa55dfb-a9f4-4f63-9ee3-259f072ad6ea | 7 | ?_count=100 | binary data |
| 12646 | 2024-09-24 11:33:11.082 | False | NULL | NULL | NULL | NULL | NULL | 0 | 2772 | 100 | NULL | Patient | NULL | NULL | -1676697053 | 1 | FINISHED | 2772 | 7f769ab5-d115-4e5b-9f22-5649deb20dab | 7 | ?_count=100 | binary data |
| 12647 | 2024-09-24 11:33:11.085 | False | NULL | NULL | NULL | NULL | NULL | 0 | 2772 | 100 | NULL | Patient | NULL | NULL | -1676697053 | 1 | FINISHED | 2772 | b1aa728d-e88d-4353-a0e7-f1abc790fbb1 | 7 | ?_count=100 | binary data |
| 12648 | 2024-09-24 11:33:11.082 | False | NULL | NULL | NULL | NULL | NULL | 0 | 2772 | 100 | NULL | Patient | NULL | NULL | -1676697053 | 1 | FINISHED | 2772 | cd4207f2-b31b-42fe-8b4f-37d4a8a8e482 | 7 | ?_count=100 | binary data |
You can see that every row has the same query in the search_query_string_vc column, but num_found is not always 2772, which is the actual number of matching records.
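To make the discrepancy easier to see, the num_found values from the rows above can be tallied with a few lines of Python. This is only an illustrative sketch. The pid and num_found pairs are copied from the query output, and the helper name `shortfalls` is made up for this example.

```python
from collections import Counter

# (pid, num_found) pairs copied from the hfj_search rows above.
ROWS = [
    (12642, 2772),
    (12643, 2358),
    (12644, 2358),
    (12645, 2403),
    (12646, 2772),
    (12647, 2772),
    (12648, 2772),
]

EXPECTED = 2772  # the actual number of records, as stated in this report


def shortfalls(rows, expected):
    """Return (pid, missing_count) for every search that found fewer records."""
    return [(pid, expected - found) for pid, found in rows if found != expected]


tally = Counter(found for _, found in ROWS)
print(tally)                       # how often each num_found value occurs
print(shortfalls(ROWS, EXPECTED))  # which searches came up short, and by how much
```

Four of the seven searches found all 2772 records. Two stopped at 2358 (414 short) and one at 2403 (369 short).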
Additional context
This issue might be related to internal concurrency handling within HAPI FHIR when under heavy load. It’s important to note that the issue occurs primarily in parallel execution scenarios.
Environment
HAPI FHIR Version: 7.2.1
Database: Postgres 16.3
OS: Debian