Closed itaranto closed 11 months ago
Anyway, as a side-note, I tried the second approach directly into a sample DB and the performance of that query is just terrible.
I may need to evaluate a different approach about how to populate this.
Yeah, you'll need to be careful with the number of joins. Although the number of joins is not limited with jet, the performance can suffer if the query returns a huge number of rows.
For instance, in your first query, you'll end up with a row count equal to vulnerabilities row count
x children row count
. If there are 10 vulnerabilities and 10 children you'll end up with 100 rows, which should be fine. But if you also want to get children's vulnerabilities, you might end up with a huge number of rows.
One possible way to do it is to have 2 queries. In the first query, you fill model.Components
and Vulnerabilities
and in the second Children
.
type component struct {
model.Components //<-- first query
Vulnerabilities []model.Vulnerabilities //<-- first query
Children []component //<-- second query
}
If stmt1
and stmt2
are two queries, you'll scan:
var dest component
err := stmt1.Query(db, &dest)
err := stmt2.Query(db, &dest.Children)
Yeah, you'll need to be careful with the number of joins. Although the number of joins is not limited with jet, the performance can suffer if the query returns a huge number of rows. For instance, in your first query, you'll end up with a row count equal to
vulnerabilities row count
xchildren row count
. If there are 10 vulnerabilities and 10 children you'll end up with 100 rows, which should be fine. But if you also want to get children's vulnerabilities, you might end up with a huge number of rows.One possible way to do it is to have 2 queries. In the first query, you fill
model.Components
andVulnerabilities
and in the secondChildren
.type component struct { model.Components //<-- first query Vulnerabilities []model.Vulnerabilities //<-- first query Children []component //<-- second query }
If
stmt1
andstmt2
are two queries, you'll scan:var dest component err := stmt1.Query(db, &dest) err := stmt2.Query(db, &dest.Children)
Thanks for the suggestion, but I cannot do the query you proposed since I need to map the results to a slice of components (I should have made that clear from the start).
Like this:
components := []*component{}
if err := stmt.Query(db, &components); err != nil {
return nil, err
}
And, if I resolve it naively, it results in the N+1 problem.
Well, I came up with a query that returns the correct information, but I don't know how to map it using the QRM:
stmt:= postgres.SELECT(
table.Components.AllColumns,
table.Vulnerabilities.AllColumns,
tableChildren.AllColumns,
tableChildrenVulnerabilities.AllColumns,
).FROM(
table.Components.
LEFT_JOIN(
table.ComponentVulnerabilities,
table.ComponentVulnerabilities.ComponentID.EQ(table.Components.ID),
).
LEFT_JOIN(
table.Vulnerabilities,
table.Vulnerabilities.ID.EQ(table.ComponentVulnerabilities.VulnerabilityID),
).
LEFT_JOIN(
table.ComponentChildren,
table.ComponentChildren.ComponentID.EQ(table.Components.ID),
).
LEFT_JOIN(
tableChildren,
tableChildren.ID.EQ(table.ComponentChildren.ChildID),
).
LEFT_JOIN(
tableChildrenComponentVulnerabilities,
tableChildrenComponentVulnerabilities.ComponentID.EQ(tableChildren.ID),
).
LEFT_JOIN(
tableChildrenVulnerabilities,
tableChildrenVulnerabilities.ID.EQ(tableChildrenComponentVulnerabilities.VulnerabilityID),
),
)
In SQL (simplified version with just IDs):
SELECT components.id AS "components.id",
vulnerabilities.id AS "vulnerabilities.id",
children.id AS "children.id",
children_vulnerabilities.id AS "children_vulnerabilities.id"
FROM components
LEFT JOIN component_vulnerabilities ON (component_vulnerabilities.component_id = components.id)
LEFT JOIN vulnerabilities ON (vulnerabilities.id = component_vulnerabilities.vulnerability_id)
LEFT JOIN component_children ON (component_children.component_id = components.id)
LEFT JOIN components AS children ON (children.id = component_children.child_id)
LEFT JOIN component_vulnerabilities AS children_component_vulnerabilities ON (children_component_vulnerabilities.component_id = children.id)
LEFT JOIN vulnerabilities AS children_vulnerabilities ON (children_vulnerabilities.id = children_component_vulnerabilities.vulnerability_id);
Aha, in that case your destination would need to have a separate type for children. For instance:
type component struct {
model.Components
Vulnerabilities []model.Vulnerabilities
Children []struct{
model.Components `alias:"children"`
Vulnerabilities []model.Vulnerabilities `alias:"children_vulnerabilities"`
}
}
You can't use the same component
type for children because qrm
needs to differentiate component and component children destination type. Also, you can't alias complex type, only model and custom model.
Still, beware of the number of rows this query can return, and maybe consider pagination or some other approach.
Aha, in that case your destination would need to have a separate type for children. For instance:
type component struct { model.Components Vulnerabilities []model.Vulnerabilities Children []struct{ model.Components `alias:"children"` Vulnerabilities []model.Vulnerabilities `alias:"children_vulnerabilities"` } }
You can't use the same
component
type for children becauseqrm
needs to differentiate component and component children destination type. Also, you can't alias complex type, only model and custom model.
That worked! This library rocks!
Still, beware of the number of rows this query can return, and maybe consider pagination or some other approach.
Yes, I'm aware of this, thank you.
Sorry to re-open this...
Here's the thing: I thought it worked, but using the query I shared above, it only fills the first vulnerability for each child.
For what I understand, given that my query returns something like:
components.id | vulnerabilities.id | children.id | children_vulnerabilities.id
---------------+--------------------+--------------+-----------------------------
component_00 | vulnerability_00 | component_01 | vulnerability_11
component_00 | vulnerability_00 | component_01 | vulnerability_12
component_00 | vulnerability_01 | component_01 | vulnerability_11
component_00 | vulnerability_01 | component_01 | vulnerability_12
component_00 | vulnerability_02 | component_02 | vulnerability_21
component_00 | vulnerability_03 | component_02 | vulnerability_21
...
then, using the following structs:
type component struct {
model.Components
Vulnerabilities []model.Vulnerabilities
Children []child
}
type child struct {
model.Components `alias:"children"`
Vulnerabilities []model.Vulnerabilities `alias:"children_vulnerabilities"`
}
the QRM should populate the structs to something like:
{
"components": [
{
"id": "component_00",
"vulnerabilities": [
{
"id": "vulnerability_00"
},
{
"id": "vulnerability_01"
},
{
"id": "vulnerability_02"
},
{
"id": "vulnerability_03"
}
],
"children": [
{
"id": "component_01",
"vulnerabilities": [
{
"id": "vulnerability_11"
},
{
"id": "vulnerability_12"
}
]
},
{
"id": "component_02",
"vulnerabilities": [
{
"id": "vulnerability_21"
}
]
}
]
}
]
}
Am I doing something wrong?
Is there a more efficient way to debug the QRM?
Hi @itaranto ,
This is a bug. alias:"children_vulnerabilities"
is not accounted when qrm group key is constructed. I've made a fix on bug252
branch. Give it a try.
On master branch, workaround can be made by wrapping model.Vulnerabilities
into new struct:
type child struct {
model.Components `alias:"children"`
Vulnerabilities []struct {
model.Vulnerabilities `alias:"children_vulnerabilities"`
}
}
Well, I'm glad this is a bug then...
I tried your branch and it solved my issue in regards to only populating the first vulnerability for each children.
Now, there's another similar issue this bug252
branch solved, but instead of reproducing it in my unit tests with a dummy database, I found this in a real one.
The problem was the following: Given lots of children of a given component that have the same vulnerability, then the mapper would map all of those (duplicated) vulnerabilities into the first child and not the others.
My guess is these two issues are basically the same.
Basically, a bug is if you have two model slices of the same type in your destination, and one slice is aliased then qrm would not populate destination correctly:
type struct Dest {
....
TList []model.T
....
TList2 []model.T `alias:"t_alias"`
....
}
// if insetead TList2 []model.T `alias:"t_alias"` a new struct is used,
// like TList2 struct{ []model.T `alias:"t_alias"` } then it's not an issue
This usually requires, like in your case, self join and 2 N joins on the same level, which is not a common case, and probably a reason why it hasn't been detected so far.
The problem was the following: Given lots of children of a given component that have the same vulnerability, then the mapper would map all of those (duplicated) vulnerabilities into the first child and not the others.
Could you share your destination? It would be easier to confirm.
Could you share your destination? It would be easier to confirm.
The destination struct is the same as in my previous comment.
My point was, about two (seemingly) different issues that got solved by this fix:
Fixed with Release v2.10.1.
I have the following models:
And this is the struct where I want the data to be mapped to:
Notice the
Children
field refers to itself :component
instead ofmodel.Components
because I want children to have the same data as the parent, in this case, theVulnerabilities
.If I didn't want to populate the children's sub-relations, I could have done something like this:
And then make a query like this:
In order to get the children with its sub-entities, I know that I need to do a sub-query but I'm not sure if the QRM supports this.
I tried something like this:
But I can't make the mapper to fill the structs in the way I need it.