drajer-health / eCRNow

Repository for eCRNow app.
Apache License 2.0
41 stars 55 forks source link

Performance Issue observed with increased number of rows in ph_messages, scheduled_tasks table. #519

Open lakshman301195 opened 1 year ago

lakshman301195 commented 1 year ago
  1. When performing load testing with 10L check reportable jobs, the RDS CPU spikes to 95% (db.r5.xlarge Instance) with 30k rows in ph_messages table . On analysis we found that, there are two places in the app, where the same query gets called. We can avoid the second call by reusing the data from first call. RDS CPU Spikes comes down, when an index on the NOTIFIED_RESOURCE_ID Column in ph_messages table is added.

Query: select this_.id as id1_100, this_.cda_response_data as cda_resp2_100, this_.encounter_id as encounte3_100, this_.failure_response_data as failure_4_100, this_.fhir_response_data as fhir_res5_100, this_.fhir_server_base_url as fhir_ser6_100, this_.initiating_action as initiati7_100, this_.kar_unique_id as kar_uniq8_100, this_.last_updated_ts as last_upd9_100, this_.notification_id as notific10_100, this_.notified_resource_id as notifie11_100, this_.notified_resource_type as notifie12_100, this_.patient_id as patient13_100, this_.patient_linker_id as patient14_100, this_.response_data_id as respons15_100, this_.ehr_doc_ref_id as ehr_doc16_100, this_.response_message_id as respons17_100, this_.response_message_type as respons18_100, this_.response_processing_instruction as respons19_100, this_.response_processing_status as respons20_100, this_.response_received_time as respons21_100, this_.submission_message_status as submiss22_100, this_.submission_time as submiss23_100, this_.submitted_cda_data as submitt24_100, this_.submitted_data_id as submitt25_100, this_.submitted_fhir_data as submitt26_100, this_.submitted_message_id as submitt27_100, this_.submitted_message_type as submitt28_100, this_.submitted_version_number as submitt29_100, this_.trigger_match_status as trigger30_100, this_.correlation_id as correla31_100, this_.x_request_id as x_reque32_100 from phmessages this where this_.fhir_server_baseurl=? and this.patientid=? and this.notified_resourceid=? and this.notified_resourcetype=? and this.kar_uniqueid=? order by this.submitted_version_number desc

  1. When performing load testing with 10L check reportable jobs, the RDS CPU spikes to 95% (db.r5.xlarge Instance) with 10L rows in scheduled_tasks table. On analysis we found that, the DB Scheduler query which picks tasks from scheduled_tasks table for execution was causing the issue. RDS CPU Spikes comes down, when an index on the EXECUTION_TIME Column in scheduled_tasks table is added.

Query: EXPLAIN UPDATE scheduled_tasks st1 SET picked = ?, picked_by = ?, last_heartbeat = ?, version = (version+?) WHERE (st1.task_name, st1.task_instance) IN ( SELECT st2.task_name, st2.task_instance FROM scheduled_tasks st2 WHERE picked = false and execution_time <= ? order by execution_time asc FOR UPDATE SKIP LOCKED LIMIT ?) RETURNING st1.*

nbashyam commented 1 year ago

We will add the index for the column.

nbashyam commented 1 year ago

The following indexes will be added - execution_time on scheduled tasks and notified_resource_id will be added on ph_messages.