Closed ndegwamartin closed 2 months ago
Gotcha, we'll proceed as discussed 👍
Based on this EXPLAIN QUERY PLAN result:
SEARCH TABLE ResourceEntity USING INDEX index_ResourceEntity_resourceType_resourceId (resourceType=?)
@ndegwamartin @ellykits @dubdabasoduba
@pld
By partitioning, do you mean having separate tables per resource type? @FikriMilano
Based on this EXPLAIN QUERY PLAN result:
SEARCH TABLE ResourceEntity USING INDEX index_ResourceEntity_resourceType_resourceId (resourceType=?)
- The DB is already indexing based on resourceType and fields. For indexing based on fields, the indexed fields are only fields that can be used as search parameter, fields that is not a search parameter won’t be indexed, which makes sense since indexing would only be effective if the fields are used as a sort or filtering query. For these reason I don’t think we can improve this anymore by indexing.
- After some R&D, I think it will be a good idea to partition based on resourceType instead, since we are mostly search based on resourceType and there is a fixed amount of resourceType on FHIR, that way, the db will do much less scanning on the rows, assuming common resourceTypes that has a lot of rows (QR, Task, Patient, Encounter) are separated into it’s own partition. That way, we can have a faster query.
@FikriMilano do you suggest vertical
or horizontal
partitioning? Would this result in any refactors on how we currently query?
By partitioning, do you mean having separate tables per resource type? @FikriMilano
@ellykits yes
@FikriMilano do you suggest
vertical
orhorizontal
partitioning? Would this result in any refactors on how we currently query?
@ageryck horizontal
partitioning, the idea is to have less rows to scan.
No, there won't be refactor on our queries, because we are using the FHIR Engine API, not raw queries.
The refactor will only happen in the FHIR Engine library
Is ReosurceType indexed? Is it a key in queries?On Apr 2, 2024, at 08:19, FikriMilano @.***> wrote: The refactor will only happen in the FHIR Engine library
—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you were mentioned.Message ID: @.***>
@FikriMilano, could you also confirm whether it is a composite index or not?
@pld @ndegwamartin
@FikriMilano I wonder if for cases where we have filters by ResourceType
we might need a non-composite index to speed things up, esp. if we have any queries that filter by column this directly. The column may have a low cardinality but we might still get a significant performance gain. Might be a good idea to test this with our use case.
@ndegwamartin I'm thinking the same, let's see
@ndegwamartin after the experiment, there are no performance difference in having resourceType as single column index, whether it's together with or without the composite index.
Also, after reading this article, the experiment result matches the theory mentioned in this article, mainly about "having a single column index of [column X], will perform the same with composite index if [column X] is the first order" https://user3141592.medium.com/single-vs-composite-indexes-in-relational-databases-58d0eb045cbe
@ndegwamartin should we give partitioning a try? in a different ticket maybe?
@FikriMilano thanks for this, yeah we can close this ticket and discuss the other alternatives + track on a different ticket. cc @ellykits
Overview We need to investigate the use of indexes on the platform with respect to FHIR Core. This will help determine whether we need to index extra/missing fields to improve performance at the db level.
Some steps to guide the process are