Open andresoviedo opened 2 years ago
Hi @andresoviedo thanks for your report. Here are some answers for your questions:
Do you make any sense out of this?
I use PosgtreSQL in production and MySQL partially for development and have not had issues, but don't have the amounts of data you have. So it's hard for me to reproduce it but your detailed investigations seems to make it clear where the problem is.
Do you have any advice on this SQL performance issue?
I am afraid nothing the might work for you. Users with global roles will not use these complicated queries and be much faster. But they can read or write every product with their findings. Switching to PostgreSQL will most likely not be an option for you, otherwise you might have done that already.
Does the latest Dojo version has better design/performance?
No, these queries haven't been changed.
Do you have any performance metrics available somewhere, so I can compare with mines?
See above, no.
Should I submit a Pull Request to refactor queries using EXISTS to be INNER JOIN?
I personally wouldn't know how to do it with Django, but if you have an idea it might be worth a try.
Dear @StefanFl Thank you for your feedback
Here is 1 fully optimized indexed query that runs with no delay It is four union queries that gets the 4 possible authorizations 1 user may have for viewing a finding I will do my best to continue contributing on this matter...
`
SELECT distinct dojo_finding
.cwe
FROM dojo_finding
INNER JOIN dojo_test
ON (dojo_finding
.test_id
= dojo_test
.id
)
INNER JOIN dojo_engagement
ON (dojo_test
.engagement_id
= dojo_engagement
.id
)
INNER JOIN dojo_product
ON (dojo_engagement
.product_id
= dojo_product
.id
)
INNER JOIN dojo_product_type_member
ON (dojo_product_type_member
.product_type_id
= dojo_product
.prod_type_id
AND dojo_product_type_member
.role_id
IN (1, 2, 3, 4, 5)
AND dojo_product_type_member
.user_id
= 542)
UNION
SELECT distinct dojo_finding
.cwe
FROM dojo_finding
INNER JOIN dojo_test
ON (dojo_finding
.test_id
= dojo_test
.id
)
INNER JOIN dojo_engagement
ON (dojo_test
.engagement_id
= dojo_engagement
.id
)
INNER JOIN dojo_product
ON (dojo_engagement
.product_id
= dojo_product
.id
)
INNER JOIN dojo_product_member
ON (dojo_product_member
.product_id
= dojo_engagement
.product_id
AND dojo_product_member
.role_id
IN (1, 2, 3, 4, 5)
AND dojo_product_member
.user_id
= 542)
UNION
SELECT distinct dojo_finding
.cwe
FROM dojo_finding
INNER JOIN dojo_test
ON (dojo_finding
.test_id
= dojo_test
.id
)
INNER JOIN dojo_engagement
ON (dojo_test
.engagement_id
= dojo_engagement
.id
)
INNER JOIN dojo_product
ON (dojo_engagement
.product_id
= dojo_product
.id
)
INNER JOIN dojo_product_type_group
ON (dojo_product_type_group
.product_type_id
= dojo_product
.prod_type_id
AND dojo_product_type_group
.role_id
IN (1, 2, 3, 4, 5))
INNER JOIN dojo_dojo_group
ON (dojo_product_type_group
.group_id
= dojo_dojo_group
.id
)
INNER JOIN dojo_dojo_group_member
ON (dojo_dojo_group
.id
= dojo_dojo_group_member
.group_id
AND dojo_dojo_group_member
.user_id
= 542)
UNION
SELECT distinct dojo_finding
.cwe
FROM dojo_finding
INNER JOIN dojo_test
ON (dojo_finding
.test_id
= dojo_test
.id
)
INNER JOIN dojo_engagement
ON (dojo_test
.engagement_id
= dojo_engagement
.id
)
INNER JOIN dojo_product
ON (dojo_engagement
.product_id
= dojo_product
.id
)
INNER JOIN dojo_product_group
ON (dojo_product_group
.product_id
= dojo_engagement
.product_id
AND dojo_product_group
.role_id
IN (1, 2, 3, 4, 5))
INNER JOIN dojo_dojo_group
ON (dojo_product_group
.group_id
= dojo_dojo_group
.id
)
INNER JOIN dojo_dojo_group_member
ON (dojo_dojo_group
.id
= dojo_dojo_group_member
.group_id
AND dojo_dojo_group_member
.user_id
= 542)
WHERE dojo_finding
.active
`
Now we would need to find out, how to write the Python code so that Django generates a SQL statement like this. I currently have no idea how to do this to be honest.
Hey @andresoviedo , We are having performance issues (specially for search/filtering operations) with much less data in our DB (250 products and 50000 findings). Could you share the configuration and the version you used for deploying Defectdojo? We want to adjust CPU and Memory to meet our requirements.
Hi @farsheedify Version 2.8.0 Cpu/ram 8 cores/32gb
Hi @farsheedify Version 2.8.0 Cpu/ram 8 cores/32gb
Thanks Andre. Have you deployed databse on a separate VM, and a single worker for DefectDojo?
Dear team,
Issue I am experiencing issues on the SQL performance of DefectDojo Webapplication. I have a 3000 users, 1000 product types, 3000 products and around 5 million Findings registered in MySQL database. The biggest performance issue I notice is at "login time"
workaround I have augmented RAM to 32GB for the database to fit all indexes in memory It seems it had reduced from 5 minutes waiting for queries to 10-20 seconds. However, I think this times should be still improved.
analysis I enabled the slow-query-log and I checked the slow queries I saw that most of the queries uses a strategy of using EXISTS to filter the authorized/viewable Findings for a Dojo User For all those queries, SQL EXPLAIN command, basically shows that it scans the full table (5 million records) :(
The database has all the default Dojo indexes
According to MySQL documentation, EXISTS (subquery) is executed for each and every Finding to determine if it should be included in the SQL results.
https://dev.mysql.com/doc/refman/8.0/en/subquery-optimization-with-exists.html
MySQL evaluates queries “from outside to inside.” That is, it first obtains the value of the outer expression outer_expr, and then runs the subquery and captures the rows that it produces.
tests The SQL example below gets all Findings for the specific user_id = 558, besides from filtering the "active" findings in the WHERE clause.
I did refactor the query to use INNER JOIN instead of EXISTS to get the authorized viewable Findings I got splendid results (manual test in MySQL Shell) and EXPLAIN command shows that the new query uses fully the indexes :)
For this "refactored query", at the end, what I understand, is that the INNER JOIN clause gets first all the test_id associated to a user, so it's directly filtered by the indexes and lastly, it performs a SCAN just over a few records for the WHERE clause.
Please.
Many thanks Andrés
`# Time: 2022-06-08T11:38:50.795012Z
Query_time: 18.619581 Lock_time: 0.000442 Rows_sent: 25 Rows_examined: 6029876
SET timestamp=1654688330; SELECT
dojo_finding
.id
,dojo_finding
.title
,dojo_finding
.date
,dojo_finding
.sla_start_date
,dojo_finding
.cwe
,dojo_finding
.cve
,dojo_finding
.cvssv3
,dojo_finding
.cvssv3_score
,dojo_finding
.url
,dojo_finding
.severity
,dojo_finding
.description
,dojo_finding
.mitigation
,dojo_finding
.impact
,dojo_finding
.steps_to_reproduce
,dojo_finding
.severity_justification
,dojo_finding
.refs
,dojo_finding
.test_id
,dojo_finding
.active
,dojo_finding
.verified
,dojo_finding
.false_p
,dojo_finding
.duplicate
,dojo_finding
.duplicate_finding_id
,dojo_finding
.out_of_scope
,dojo_finding
.risk_accepted
,dojo_finding
.under_review
,dojo_finding
.last_status_update
,dojo_finding
.review_requested_by_id
,dojo_finding
.under_defect_review
,dojo_finding
.defect_review_requested_by_id
,dojo_finding
.is_mitigated
,dojo_finding
.thread_id
,dojo_finding
.mitigated
,dojo_finding
.mitigated_by_id
,dojo_finding
.reporter_id
,dojo_finding
.numerical_severity
,dojo_finding
.last_reviewed
,dojo_finding
.last_reviewed_by_id
,dojo_finding
.param
,dojo_finding
.payload
,dojo_finding
.hash_code
,dojo_finding
.line
,dojo_finding
.file_path
,dojo_finding
.component_name
,dojo_finding
.component_version
,dojo_finding
.static_finding
,dojo_finding
.dynamic_finding
,dojo_finding
.created
,dojo_finding
.scanner_confidence
,dojo_finding
.sonarqube_issue_id
,dojo_finding
.unique_id_from_tool
,dojo_finding
.vuln_id_from_tool
,dojo_finding
.sast_source_object
,dojo_finding
.sast_sink_object
,dojo_finding
.sast_source_line
,dojo_finding
.sast_source_file_path
,dojo_finding
.nb_occurences
,dojo_finding
.publish_date
,dojo_finding
.service
, EXISTS(SELECT (1) ASa
FROMdojo_product_type_member
U0 WHERE (U0.product_type_id
=dojo_product
.prod_type_id
AND U0.role_id
IN (1, 2, 3, 4, 5) AND U0.user_id
= 558) LIMIT 1) AStest__engagement__product__prod_type__member
, EXISTS(SELECT (1) ASa
FROMdojo_product_member
U0 WHERE (U0.product_id
=dojo_engagement
.product_id
AND U0.role_id
IN (1, 2, 3, 4, 5) AND U0.user_id
= 558) LIMIT 1) AStest__engagement__product__member
, EXISTS(SELECT (1) ASa
FROMdojo_product_type_group
U0 INNER JOINdojo_dojo_group
U1 ON (U0.group_id
= U1.id
) INNER JOINdojo_dojo_group_member
U2 ON (U1.id
= U2.group_id
) WHERE (U2.user_id
= 558 AND U0.product_type_id
=dojo_product
.prod_type_id
AND U0.role_id
IN (1, 2, 3, 4, 5)) LIMIT 1) AStest__engagement__product__prod_type__authorized_group
, EXISTS(SELECT (1) ASa
FROMdojo_product_group
U0 INNER JOINdojo_dojo_group
U1 ON (U0.group_id
= U1.id
) INNER JOINdojo_dojo_group_member
U2 ON (U1.id
= U2.group_id
) WHERE (U2.user_id
= 558 AND U0.product_id
=dojo_engagement
.product_id
AND U0.role_id
IN (1, 2, 3, 4, 5)) LIMIT 1) AStest__engagement__product__authorized_group
, COUNT(CASE WHEN NOTdojo_endpoint_status
.mitigated
THENdojo_finding_endpoint_status
.endpoint_status_id
ELSE NULL END) ASactive_endpoint_count
, COUNT(CASE WHENdojo_endpoint_status
.mitigated
THENdojo_finding_endpoint_status
.endpoint_status_id
ELSE NULL END) ASmitigated_endpoint_count
FROMdojo_finding
INNER JOINdojo_test
ON (dojo_finding
.test_id
=dojo_test
.id
) INNER JOINdojo_engagement
ON (dojo_test
.engagement_id
=dojo_engagement
.id
) INNER JOINdojo_product
ON (dojo_engagement
.product_id
=dojo_product
.id
) LEFT OUTER JOINdojo_finding_endpoint_status
ON (dojo_finding
.id
=dojo_finding_endpoint_status
.finding_id
) LEFT OUTER JOINdojo_endpoint_status
ON (dojo_finding_endpoint_status
.endpoint_status_id
=dojo_endpoint_status
.id
) WHERE ((EXISTS(SELECT (1) ASa
FROMdojo_product_type_member
U0 WHERE (U0.product_type_id
=dojo_product
.prod_type_id
AND U0.role_id
IN (1, 2, 3, 4, 5) AND U0.user_id
= 558) LIMIT 1) OR EXISTS(SELECT (1) ASa
FROMdojo_product_member
U0 WHERE (U0.product_id
=dojo_engagement
.product_id
AND U0.role_id
IN (1, 2, 3, 4, 5) AND U0.user_id
= 558) LIMIT 1) OR EXISTS(SELECT (1) ASa
FROMdojo_product_type_group
U0 INNER JOINdojo_dojo_group
U1 ON (U0.group_id
= U1.id
) INNER JOINdojo_dojo_group_member
U2 ON (U1.id
= U2.group_id
) WHERE (U2.user_id
= 558 AND U0.product_type_id
=dojo_product
.prod_type_id
AND U0.role_id
IN (1, 2, 3, 4, 5)) LIMIT 1) OR EXISTS(SELECT (1) ASa
FROMdojo_product_group
U0 INNER JOINdojo_dojo_group
U1 ON (U0.group_id
= U1.id
) INNER JOINdojo_dojo_group_member
U2 ON (U1.id
= U2.group_id
) WHERE (U2.user_id
= 558 AND U0.product_id
=dojo_engagement
.product_id
AND U0.role_id
IN (1, 2, 3, 4, 5)) LIMIT 1)) ANDdojo_finding
.active
) GROUP BYdojo_finding
.id
ORDER BYdojo_finding
.numerical_severity
ASC LIMIT 25;`