This PR optimizes some, but not all, of the critical queries in flows and checkpoints. The rationale for moving from GORM's 'Preload' method to 'Joins': performance is critical, so reducing database latency and overhead is a priority.
GORM's 'Preload' executes separate SQL queries to fetch the main record and each associated record, as illustrated here:
SQL Execution for Preloading:
Fetch Job:
SELECT * FROM jobs WHERE id = $1 LIMIT 1;
Fetch Flow:
SELECT * FROM flows WHERE id IN (SELECT flow_id FROM jobs WHERE id = $1);
Fetch Tool:
SELECT * FROM tools WHERE id IN (SELECT tool_id FROM jobs WHERE id = $1);
Data Fetching: Each query results in a separate round trip to the database, increasing the total time spent in data retrieval, especially over networked database connections.
Using Joins integrates the fetching of related data into a single SQL query, improving data retrieval efficiency:
SQL Execution with Joins:
SELECT jobs.*, flows.*, tools.*
FROM jobs
JOIN flows ON flows.id = jobs.flow_id
JOIN tools ON tools.cid = jobs.tool_id
WHERE jobs.id = $1 LIMIT 1;
Data Fetching: This method fetches all related data in one go, significantly reducing the number of database round-trips. It is especially beneficial for networked databases or when the dataset grows.
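To make the round-trip difference concrete, here is a minimal Go sketch that counts statements instead of talking to a real database. The Querier interface, the countingDB stub, and both loader functions are hypothetical names introduced for illustration only; they are not GORM APIs, and the SQL strings are copied from the statements above.

```go
package main

import "fmt"

// Querier is a hypothetical, minimal stand-in for a database handle.
type Querier interface {
	Query(query string, args ...any) error
}

// countingDB records every statement it receives. Each call models one
// round trip between the application and the database.
type countingDB struct {
	statements []string
}

func (c *countingDB) Query(query string, args ...any) error {
	c.statements = append(c.statements, query)
	return nil
}

// loadPreloadStyle issues the three statements listed under
// "SQL Execution for Preloading", one after another.
func loadPreloadStyle(db Querier, jobID int64) error {
	queries := []string{
		"SELECT * FROM jobs WHERE id = $1 LIMIT 1",
		"SELECT * FROM flows WHERE id IN (SELECT flow_id FROM jobs WHERE id = $1)",
		"SELECT * FROM tools WHERE id IN (SELECT tool_id FROM jobs WHERE id = $1)",
	}
	for _, q := range queries {
		if err := db.Query(q, jobID); err != nil {
			return err
		}
	}
	return nil
}

// loadJoinStyle issues the single statement listed under
// "SQL Execution with Joins".
func loadJoinStyle(db Querier, jobID int64) error {
	const q = "SELECT jobs.*, flows.*, tools.* FROM jobs " +
		"JOIN flows ON flows.id = jobs.flow_id " +
		"JOIN tools ON tools.cid = jobs.tool_id " +
		"WHERE jobs.id = $1 LIMIT 1"
	return db.Query(q, jobID)
}

// roundTrips runs a loader against a fresh countingDB and reports how many
// statements it sent.
func roundTrips(load func(Querier, int64) error) (int, error) {
	db := &countingDB{}
	err := load(db, 42) // 42 is an arbitrary job ID
	return len(db.statements), err
}

func main() {
	preload, _ := roundTrips(loadPreloadStyle)
	join, _ := roundTrips(loadJoinStyle)
	fmt.Printf("preload-style round trips: %d\n", preload)
	fmt.Printf("join-style round trips: %d\n", join)
}
```

The stub only counts calls, so it says nothing about how long each statement takes to execute; it simply shows that the join form sends one statement where the preload form sends three.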
Going one step further: instead of selecting all columns with *, selecting only the columns the application actually needs can reduce memory usage and speed up queries by decreasing the amount of data transferred from the database.
Example SQL Query Using Joins with Specific Columns:
SELECT jobs.id, jobs.status, flows.name, tools.type
FROM jobs
JOIN flows ON flows.id = jobs.flow_id
JOIN tools ON tools.cid = jobs.tool_id
WHERE jobs.id = $1 LIMIT 1;
Explanation: This approach only fetches the id and status from jobs, the name from flows, and the type from tools, instead of all columns in these tables.
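As a rough illustration of the same idea on the Go side, the sketch below scans the four selected columns into a narrow struct. It uses only the standard database/sql package rather than GORM. JobSummary, rowScanner, fakeRow, and FetchJobSummary are hypothetical names, and the column types (an integer id, string status, name, and type) are assumptions about the schema.

```go
package main

import (
	"context"
	"database/sql"
	"errors"
	"fmt"
)

// jobSummaryQuery mirrors the column-specific SQL shown above.
const jobSummaryQuery = `
SELECT jobs.id, jobs.status, flows.name, tools.type
FROM jobs
JOIN flows ON flows.id = jobs.flow_id
JOIN tools ON tools.cid = jobs.tool_id
WHERE jobs.id = $1 LIMIT 1`

// JobSummary is a hypothetical narrow result type holding only the four
// selected columns. The field types are assumptions about the schema.
type JobSummary struct {
	ID       int64
	Status   string
	FlowName string
	ToolType string
}

// rowScanner is satisfied by *sql.Row and *sql.Rows, which lets the scan
// logic be exercised without a live database.
type rowScanner interface {
	Scan(dest ...any) error
}

// scanJobSummary reads the columns in the order they appear in the SELECT.
func scanJobSummary(row rowScanner) (JobSummary, error) {
	var s JobSummary
	err := row.Scan(&s.ID, &s.Status, &s.FlowName, &s.ToolType)
	return s, err
}

// FetchJobSummary runs the query against a real connection in one round trip.
func FetchJobSummary(ctx context.Context, db *sql.DB, jobID int64) (JobSummary, error) {
	return scanJobSummary(db.QueryRowContext(ctx, jobSummaryQuery, jobID))
}

// fakeRow is an in-memory stand-in for a database row, used only for the demo.
type fakeRow struct {
	id                 int64
	status, flow, tool string
}

func (f fakeRow) Scan(dest ...any) error {
	if len(dest) != 4 {
		return errors.New("fakeRow: expected 4 destinations")
	}
	idPtr, ok1 := dest[0].(*int64)
	statusPtr, ok2 := dest[1].(*string)
	flowPtr, ok3 := dest[2].(*string)
	toolPtr, ok4 := dest[3].(*string)
	if !(ok1 && ok2 && ok3 && ok4) {
		return errors.New("fakeRow: unexpected destination types")
	}
	*idPtr, *statusPtr, *flowPtr, *toolPtr = f.id, f.status, f.flow, f.tool
	return nil
}

func main() {
	// The values below are made up for the demo.
	s, err := scanJobSummary(fakeRow{id: 7, status: "running", flow: "nightly-etl", tool: "docker"})
	if err != nil {
		fmt.Println("scan failed:", err)
		return
	}
	fmt.Printf("%+v\n", s)
}
```

Scanning into a dedicated struct keeps the result independent of the full model types, so columns added to jobs, flows, or tools later do not change what this query transfers.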
Performance Consideration
Network Latency: Joins minimize the impact of network latency between the application and the database by reducing the number of calls made.
Query Optimization: With a single join query, the database's planner sees the whole request at once and can use the indexes on the join columns, which is often easier to optimize than several independent queries.
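The network-latency point can be put into rough numbers. The sketch below uses a deliberately simple model in which sequential queries each pay one network round trip plus their own execution time. The 2 ms round trip and the per-query execution times are invented for illustration and are not measurements from this PR.

```go
package main

import (
	"fmt"
	"time"
)

// totalLatency estimates wall-clock time for queries run one after another:
// each query pays one network round trip (rtt) plus its own execution time.
func totalLatency(rtt time.Duration, execTimes ...time.Duration) time.Duration {
	var total time.Duration
	for _, exec := range execTimes {
		total += rtt + exec
	}
	return total
}

// exampleLatencies applies the model to assumed numbers: three 1 ms queries
// for the preload style versus one 2 ms join, over a 2 ms round trip.
func exampleLatencies() (preload, join time.Duration) {
	rtt := 2 * time.Millisecond // assumed network round-trip time
	preload = totalLatency(rtt, time.Millisecond, time.Millisecond, time.Millisecond)
	join = totalLatency(rtt, 2*time.Millisecond)
	return preload, join
}

func main() {
	preload, join := exampleLatencies()
	fmt.Println("preload-style estimate:", preload)
	fmt.Println("join-style estimate:", join)
}
```

Under these assumptions the join is cheaper even though it is given twice the execution time of any single preload query, because it pays the round trip only once. Real numbers depend on the network, the schema, and the query plan, so this should be confirmed by measurement.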
Switching from Preload to Joins in our GORM data access strategy improves performance by reducing the number of database round-trips required to fetch a Job and its related Flow and Tool. This matters for a performance-critical application like ours, where lower latency and more efficient queries directly impact user experience and system scalability.
Another major source of slowness was fetching all flows in full for the side nav.
The side nav now calls the /flows/names endpoint, which fetches only the flow names.
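A names-only endpoint of this kind might look like the following sketch, written with Go's standard net/http package. The PR's actual handler and router are not shown here, so FlowNameLister, flowNamesHandler, and the JSON-array response shape are all assumptions made for illustration.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// FlowNameLister is a hypothetical data-access function. In a real service
// it would select only the name column from flows instead of loading full
// rows with their associations.
type FlowNameLister func(ctx context.Context) ([]string, error)

// flowNamesHandler serves GET /flows/names and writes a JSON array of names.
// The response shape is an assumption, not taken from the PR.
func flowNamesHandler(list FlowNameLister) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		names, err := list(r.Context())
		if err != nil {
			http.Error(w, "failed to list flow names", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		// An encode error here means the client went away mid-response;
		// the status line has already been sent, so it is ignored.
		_ = json.NewEncoder(w).Encode(names)
	}
}

// callFlowNames exercises the handler in memory, without opening a port.
func callFlowNames(names []string) (status int, body string) {
	list := func(ctx context.Context) ([]string, error) { return names, nil }
	req := httptest.NewRequest(http.MethodGet, "/flows/names", nil)
	rec := httptest.NewRecorder()
	flowNamesHandler(list).ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	// The flow names below are made up for the demo.
	status, body := callFlowNames([]string{"ingest", "train"})
	fmt.Println(status, body)
}
```

Passing the lister in as a function keeps the handler independent of the data layer, so the same handler works whether the names come from GORM, raw SQL, or a cache.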