microsoft / fhir-server

A service that implements the FHIR standard
MIT License

Reindex job stuck at "Queued" #3684

Open elgib opened 5 months ago

elgib commented 5 months ago

Describe the bug: After submitting a reindex job, the job remains queued indefinitely with no progress. The issue persists after deleting and resubmitting the job. Is this similar to #2200?

FHIR Version? R4

Data provider? CosmosDB

To Reproduce: Steps to reproduce the behavior:

  1. Create search parameter, test on a single resource (success)
  2. Submit reindex job for new parameter (success)
  3. Query reindex job ID - status is stuck at "Queued"

Expected behavior: Reindex job should run

Actual behavior: Reindex job remains "Queued"
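
Steps 2 and 3 of the reproduction can be sketched roughly as follows. This is a minimal sketch, not the server's own client code. It assumes a reindex job is submitted with a POST to `$reindex` carrying an empty Parameters resource, that job status is read from `_operations/reindex/{id}`, and that the status response is a Parameters resource containing a `status` entry with a `valueString`. The helper names are hypothetical, and no network call is made here.

```python
import json
from typing import Optional, Tuple

# Assumed route for querying a reindex job; verify against your server version.
REINDEX_STATUS_PATH = "_operations/reindex"


def reindex_submit_request(base_url: str) -> Tuple[str, str, str]:
    """Return (method, url, body) for submitting a reindex job (hypothetical helper)."""
    body = json.dumps({"resourceType": "Parameters", "parameter": []})
    return "POST", f"{base_url.rstrip('/')}/$reindex", body


def reindex_status_url(base_url: str, job_id: str) -> str:
    """Build the URL used to query a reindex job by ID (hypothetical helper)."""
    return f"{base_url.rstrip('/')}/{REINDEX_STATUS_PATH}/{job_id}"


def job_status(parameters: dict) -> Optional[str]:
    """Read the 'status' entry from a Parameters resource, or None if absent."""
    for entry in parameters.get("parameter", []):
        if entry.get("name") == "status":
            return entry.get("valueString")
    return None
```

With these helpers, a job that never leaves the queue would keep returning `"Queued"` from `job_status` on every poll of `reindex_status_url(...)`.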

EXPEkesheth commented 5 months ago

Thanks for reporting the issue @elgib - Are you observing this issue on the OSS FHIR service, or are you using a managed Azure FHIR service (Azure API for FHIR / Azure Health Data Services)?

elgib commented 5 months ago

@EXPEkesheth we're using the OSS FHIR service. The issue is reproducible on both our dev and test environments, and with different search parameters (including one that was successfully reindexed a few months ago). Also no obvious errors in the logs.

EXPEkesheth commented 5 months ago

@elgib, we have added this to our queue for investigation. Could you please share the search parameter JSON used to create the custom search parameter? We will let you know once we have more details or questions. #114359

elgib commented 5 months ago

@EXPEkesheth here's an example of one of our custom search parameters.

{
  "resourceType": "SearchParameter",
  "id": "e46bd3c4-f278-4039-841b-892e931596fe",
  "meta": {
    "versionId": "1",
    "lastUpdated": "2023-11-22T12:54:08.333+00:00"
  },
  "url": "http://1beat.care/fhir/search-parameters#patient-care-unit",
  "name": "patient-care-unit",
  "status": "draft",
  "description": "Reference to Organization resource that represents the care unit currently responsible for the patient.",
  "code": "care-unit",
  "base": [ "Patient" ],
  "type": "reference",
  "expression": "Patient.extension.where(url = 'http://1beat.care/fhir/extensions#patient-care-unit').value",
  "target": [ "Organization" ]
}

Thanks for looking into this. We would appreciate any updates as this is blocking key areas of work for our team. Any short-term recommendations would also be helpful -- for example, should we try rolling back to an earlier version?
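
For readers unfamiliar with the `expression` in the search parameter above, the sketch below approximates by hand what it selects from a Patient resource: the value of every top-level extension whose `url` matches the care-unit extension URL. It is an illustration over plain dictionaries, not a FHIRPath engine, and the function name is hypothetical.

```python
from typing import Any, Dict, List

CARE_UNIT_URL = "http://1beat.care/fhir/extensions#patient-care-unit"


def care_unit_values(patient: Dict[str, Any]) -> List[Any]:
    """Approximate Patient.extension.where(url = CARE_UNIT_URL).value (hypothetical helper)."""
    values: List[Any] = []
    for ext in patient.get("extension", []):
        if ext.get("url") != CARE_UNIT_URL:
            continue
        # In JSON, the FHIRPath `.value` choice element appears as value[x],
        # for example valueReference or valueString.
        for key, val in ext.items():
            if key.startswith("value"):
                values.append(val)
    return values
```

A Patient carrying the extension with a `valueReference` of `Organization/123` would yield that reference, which is what a search parameter of type `reference` targeting `Organization` would be expected to index.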

elgib commented 5 months ago

@EXPEkesheth are there any updates on this issue, or timeframes for a fix?

EXPEkesheth commented 5 months ago

@elgib - As you are using the OSS FHIR service, you need to explicitly enable reindex in the deployment template (https://github.com/microsoft/fhir-server/blob/main/samples/templates/default-azuredeploy-docker.json#L272). Please ensure this setting is enabled in your instance. Have you used the reindex capability in the OSS FHIR server before?
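
One way to confirm the setting on a running instance is to inspect the app's environment. The sketch below assumes the setting is exposed as the environment variable `FhirServer__Operations__Reindex__Enabled` (the double-underscore form of the `FhirServer:Operations:Reindex:Enabled` configuration key). That name is an assumption and should be checked against the linked template. The helper name is hypothetical, and treating a missing value as disabled is a choice of this sketch, not a statement about the server's default.

```python
from typing import Mapping

# Assumed setting name; verify against the deployment template before relying on it.
REINDEX_SETTING = "FhirServer__Operations__Reindex__Enabled"


def reindex_enabled(settings: Mapping[str, str]) -> bool:
    """Return True only if the assumed reindex setting is explicitly 'true'."""
    return settings.get(REINDEX_SETTING, "false").strip().lower() == "true"
```

Passing `os.environ` (after `import os`) checks the current process; passing a dictionary of exported app settings checks a deployed instance offline.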

elgib commented 5 months ago

@EXPEkesheth we have enabled reindex operations and run several reindex jobs successfully in the past. The last successful reindex was late November 2023. We have not made any changes to our setup since then.

EXPEkesheth commented 5 months ago

@elgib Thanks for the information. We will look into the issue and get back to you in case we have any questions.