The DIRECT_DEPENDENCIES column in the PROJECT and COLUMN tables are currently of type TEXT. For the COMPONENT table, there is a GIN_TRGM_OPS index on the column:
[!WARNING]
There might be multiple queries across the code base that perform LIKE queries on the DIRECT_DEPENDENCIES column. Some may use JDOQL (i.e. directDependencies.contains("...")). In particular the latter is problematic, since JDOQL doesn't support the @> JSON operator. Track those queries down, and investigate whether they can be refactored accordingly.
Demo
[!NOTE]
The test database contains 1000 projects and TODO components. Generated testdata BOMs were used to populate the database.
Query plan using existing TEXT column:
+--------------------------------------------------------------------------------------------------------------------------------------------------+
|QUERY PLAN |
+--------------------------------------------------------------------------------------------------------------------------------------------------+
|Bitmap Heap Scan on "COMPONENT" (cost=621.77..665.18 rows=11 width=14) (actual time=7.739..17.003 rows=3 loops=1) |
| Recheck Cond: ("DIRECT_DEPENDENCIES" ~~ '%ed521419-f3ac-4331-ac93-7219b5420133%'::text) |
| Rows Removed by Index Recheck: 14 |
| Heap Blocks: exact=17 |
| -> Bitmap Index Scan on "COMPONENT_DIRECT_DEPENDENCIES_GIN_IDX" (cost=0.00..621.77 rows=11 width=0) (actual time=7.551..7.551 rows=17 loops=1)|
| Index Cond: ("DIRECT_DEPENDENCIES" ~~ '%ed521419-f3ac-4331-ac93-7219b5420133%'::text) |
|Planning Time: 0.427 ms |
|Execution Time: 17.213 ms |
+--------------------------------------------------------------------------------------------------------------------------------------------------+
Size of index COMPONENT_DIRECT_DEPENDENCIES_GIN_IDX:
SELECT pg_size_pretty(pg_relation_size(indexrelid)) "Index Size"
FROM pg_stat_all_indexes i
INNER JOIN pg_class c
ON i.relid=c.oid
WHERE i.relname='COMPONENT'
AND indexrelname = 'COMPONENT_DIRECT_DEPENDENCIES_GIN_IDX';
106 MB
Query plan using JSONB column:
+----------------------------------------------------------------------------------------------------------------------------------------------------+
|QUERY PLAN |
+----------------------------------------------------------------------------------------------------------------------------------------------------+
|Bitmap Heap Scan on "COMPONENT" (cost=41.29..10042.07 rows=3823 width=14) (actual time=0.130..0.135 rows=3 loops=1) |
| Recheck Cond: ("DIRECT_DEPENDENCIES" @> '[{"uuid": "ed521419-f3ac-4331-ac93-7219b5420133"}]'::jsonb) |
| Heap Blocks: exact=3 |
| -> Bitmap Index Scan on "COMPONENT_DIRECT_DEPENDENCIES_JSONB_IDX" (cost=0.00..40.33 rows=3823 width=0) (actual time=0.010..0.010 rows=3 loops=1)|
| Index Cond: ("DIRECT_DEPENDENCIES" @> '[{"uuid": "ed521419-f3ac-4331-ac93-7219b5420133"}]'::jsonb) |
|Planning Time: 0.100 ms |
|Execution Time: 0.163 ms |
+----------------------------------------------------------------------------------------------------------------------------------------------------+
Size of index COMPONENT_DIRECT_DEPENDENCIES_JSONB_IDX:
SELECT pg_size_pretty(pg_relation_size(indexrelid)) "Index Size"
FROM pg_stat_all_indexes i
INNER JOIN pg_class c
ON i.relid=c.oid
WHERE i.relname='COMPONENT'
AND indexrelname = 'COMPONENT_DIRECT_DEPENDENCIES_JSONB_IDX';
21 MB
Current Behavior
The
DIRECT_DEPENDENCIES
column in thePROJECT
andCOLUMN
tables are currently of typeTEXT
. For theCOMPONENT
table, there is aGIN_TRGM_OPS
index on the column:https://github.com/DependencyTrack/hyades/blob/96af03d19b004a2c41763dabbb7d082180c99e3c/commons-persistence/src/main/resources/schema.sql#L809-L810
In order to perform dependency graph traversal, queries are using
LIKE '%...%'
conditions such as:While this works, there are more efficient ways to achieve the desired outcome in PostgreSQL.
Proposed Behavior
Convert the
DIRECT_DEPENDENCIES
column in thePROJECT
andCOMPONENT
tables toJSONB
.Drop the existing
COMPONENT_DIRECT_DEPENDENCIES_GIN_IDX
index, and create a new one usingJSONB_PATH_OPS
, for example:It will then be possible to leverage that index using the JSONB contains operator:
Demo
Query plan using existing
TEXT
column:Size of index
COMPONENT_DIRECT_DEPENDENCIES_GIN_IDX
:Query plan using
JSONB
column:Size of index
COMPONENT_DIRECT_DEPENDENCIES_JSONB_IDX
:Checklist