Open CaitlinV39 opened 4 years ago
I have just ran into the same issue. I can see there was a PR to fix it: #1386. @brendankowitz, what was the reason it has been closed and not merged?
We have noticed the same issue with conditional creates in our environment. A quick python script to recreate the issue is attached - please note this is "flaky" as the server sometimes does only create one resource!
Is there any way of configuring the system or sending a request so that the server will apply a "stronger" uniqueness condition ?
We have also noticed that the issue occurs much more frequently if the server is under more load, and especially when the CosmosDB is reaching the throttled/configured RU limit. Does anyone have any recommended mitigations for this issue?
FHIR Version? R4
Data provider? CosmosDB
Recreate script
# requirements
# pip install requests fhir.resources==6.1.0
# Recreate conditional create issue with Azure API for FHIR.
# tested with
# python 3.9
# Azure API for FHIR: fhir-R4, CosmosDB (1000 RUs)
import time
import uuid
from datetime import datetime, timezone
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
import requests
from fhir.resources.bundle import Bundle
from fhir.resources.identifier import Identifier
from fhir.resources.patient import Patient
# configure environment and auth
FHIR_SERVER_ADDRESS = ''
AZURE_TENANT_ID = ''
OAUTH_CLIENT_ID = ''
OAUTH_CLIENT_SECRET = ''
# configure request flooding
NUM_SIMULTANEOUS_CREATE_ATTEMPTS = 5
NUM_TOTAL_CREATE_ATTEMPTS = 5
# simple OAuth2 client-credentials flow to get a bearer token for the test lifetime
# replace this with your auth of choice
oauth_response = requests.post(
url=f'https://login.microsoftonline.com/{AZURE_TENANT_ID}/oauth2/v2.0/token',
headers={
'Accept': 'application/json',
'Content-Type': 'application/x-www-form-urlencoded'
},
data={
'client_id': OAUTH_CLIENT_ID,
'client_secret': OAUTH_CLIENT_SECRET,
'grant_type': 'client_credentials',
'scope': f'{FHIR_SERVER_ADDRESS}/.default'
}
)
oauth_response.raise_for_status()
authorization_header = f"Bearer {oauth_response.json()['access_token']}"
resource_to_spam = Patient(
identifier=[
Identifier(system="conditional_create_test",
value=str(uuid.uuid4()))
],
# some other random properties, not important
active=True,
deceasedBoolean=False,
)
def attempt_resource_create(resource):
request_started_at = datetime.now(timezone.utc)
response = requests.post(
url=f"{FHIR_SERVER_ADDRESS}/{resource.resource_type}",
data=resource.json(),
headers={
'Authorization': authorization_header,
'If-None-Exist': urlencode({'identifier': f'{resource.identifier[0].system}|{resource.identifier[0].value}'}),
'Content-Type': 'application/fhir+json',
}
)
request_finished_at = datetime.now(timezone.utc)
return response, request_started_at, request_finished_at
def warm_up_threadpool(threadpool, num_worker_threads):
# use a sleep task to force the threadpool implementation to pre-spawn all the worker threads
# this removes the thread creation overhead when checking for race conditions over HTTP
def _dummy_task():
time.sleep(1)
for task in [threadpool.submit(_dummy_task) for _ in range(2 * num_worker_threads)]:
_ = task.result()
with ThreadPoolExecutor(max_workers=NUM_SIMULTANEOUS_CREATE_ATTEMPTS) as threadpool:
warm_up_threadpool(threadpool, NUM_SIMULTANEOUS_CREATE_ATTEMPTS)
attempt_create_tasks = [
threadpool.submit(attempt_resource_create, resource_to_spam)
for _ in range(NUM_TOTAL_CREATE_ATTEMPTS)
]
# give the server a little break to sync before checking result
# this should not be required, but given this script is recreating a race condition ... let's be safe
time.sleep(2)
search_response = requests.get(
url=f"{FHIR_SERVER_ADDRESS}/{resource_to_spam.resource_type}",
params={'identifier': f'{resource_to_spam.identifier[0].system}|{resource_to_spam.identifier[0].value}'},
headers={
'Authorization': authorization_header,
'Accept': 'application/json',
}
)
search_response_bundle = Bundle.parse_raw(search_response.text)
print(f"Attempted create {NUM_TOTAL_CREATE_ATTEMPTS} times, created {len(search_response_bundle.entry or [])} resources.")
print(f"Resource ids:")
for entry in search_response_bundle.entry or []:
print(f"{entry.resource.resource_type}/{entry.resource.id} created at {entry.resource.meta.lastUpdated}")
print("Request Summary")
for task in attempt_create_tasks:
response, request_started_at, request_finished_at = task.result()
print(f"Response {response.status_code} between {request_started_at} and {request_finished_at}")
Example output:
Attempted create 5 times, created 4 resources.
Resource ids:
Patient/b3dad015-cffc-449e-aaed-3359e9f389ee created at 2023-06-23 06:53:23.859000+00:00
Patient/2b174187-b446-44db-8a8c-01181214b2cb created at 2023-06-23 06:53:23.860000+00:00
Patient/410a5cc7-d16b-4ee9-ba59-7f20399df623 created at 2023-06-23 06:53:23.861000+00:00
Patient/b3e7ea02-b536-434b-a6c2-f172e33253c5 created at 2023-06-23 06:53:23.861000+00:00
Request Summary
Response 201 between 2023-06-23 06:53:30.451094+00:00 and 2023-06-23 06:53:30.621145+00:00
Response 201 between 2023-06-23 06:53:30.451094+00:00 and 2023-06-23 06:53:30.633846+00:00
Response 200 between 2023-06-23 06:53:30.452072+00:00 and 2023-06-23 06:53:30.622115+00:00
Response 201 between 2023-06-23 06:53:30.452072+00:00 and 2023-06-23 06:53:30.626023+00:00
Response 201 between 2023-06-23 06:53:30.452072+00:00 and 2023-06-23 06:53:30.616253+00:00
This is still an issue for us, could this be looked at, please?
Thanks for raising the issue and we will look into the same. 549359380
Describe the bug When doing two conditional creates simultaneously (or near simultaneously) for the same resource, both are created, causing duplicate data to be stored in the system
FHIR Version? Stu3/R4
Data provider? CosmosDB/SQL Server
To Reproduce Steps to reproduce the behavior: Following are the steps the customer is following 1 Delete all existing patient data from their instance.
Expected behavior Receive the 412 response for the 2nd create patient request that hits the Azure API for FHIR instance.
Actual behavior 2nd patient is created as a duplicate