google / android-fhir

The Android FHIR SDK is a set of Kotlin libraries for building offline-capable, mobile-first healthcare applications using the HL7® FHIR® standard on Android.
https://google.github.io/android-fhir/
Apache License 2.0
465 stars 246 forks source link

Improve FhirEngine Search performance for sorting #2547

Closed LZRS closed 2 weeks ago

LZRS commented 1 month ago

Describe the bug When loading around 7k patients ordered by Patient.Name, the db query takes around 25s Without sorting the query takes around 35ms

This is when using

fhirEngine.search<Patient> {
        filter(Patient.ACTIVE, { value = of(true) })
        sort(Patient.NAME, Order.ASCENDING)
        count = PaginationConstant.DEFAULT_PAGE_SIZE
        from = currentPage * PaginationConstant.DEFAULT_PAGE_SIZE
      }

that generates SQL query

SELECT a.resourceUuid, a.serializedResource
FROM ResourceEntity a
         LEFT JOIN StringIndexEntity b
                   ON a.resourceType = b.resourceType AND a.resourceUuid = b.resourceUuid AND b.index_name = 'name'
WHERE a.resourceType = 'Patient'
  AND a.resourceUuid IN (SELECT resourceUuid
                         FROM TokenIndexEntity
                         WHERE resourceType = 'Patient'
                           AND index_name = 'active'
                           AND index_value = 'true')
ORDER BY b.index_value ASC
LIMIT 20 OFFSET 0

Tested out that removing the ORDER BY b.index_value ASC reduces the latency of the query to milliseconds

Expected behavior The usecase of loading resources with sort should be more performant

Additional context Here's a link to the resources db containing around 7k patients, that can be used for testing https://drive.google.com/file/d/1TgPw36BFI8BijsUPc1JkQE3l3RXANYzT/view?usp=drive_link

Would you like to work on the issue? Please state if this issue should be assigned to you or who you think could help to solve this issue.

LZRS commented 1 month ago

@aditya-07 @MJ1998 here's the ticket about the issue on performance we're experiencing

MJ1998 commented 1 month ago

Can you tell me how does it perform without filter of active Patients ?

MJ1998 commented 1 month ago

One improvement I can think of is removing a.resourceType = b.resourceType clause in JOIN statement as we are filtering resourceType using WHERE clause. But not sure if this is feasible or even if it's gonna improve the performance

ndegwamartin commented 1 month ago

It is worth noting that the above latency difference is observable even on non-device sqlite db browsers like SQLiteStudio.

ndegwamartin commented 1 month ago

Can you tell me how does it perform without filter of active Patients ?

Removing the subquery that performs this filter has no impact on the performance.

ndegwamartin commented 1 month ago

One improvement I can think of is removing a.resourceType = b.resourceType clause in JOIN statement as we are filtering resourceType using WHERE clause. But not sure if this is feasible or even if it's gonna improve the performance

Removing this does improve the performance as well, executing the query in milliseconds cc LZRS

jingtang10 commented 3 weeks ago

thank you very much for catching this @LZRS! very happy for the improvement in @MJ1998 's pr.

just one comment for clarification and potential future work: https://github.com/google/android-fhir/pull/2553#issuecomment-2142169138