Closed aaaayush-n closed 5 days ago
Sometimes the seems to get stuck... It returns no result and runs for a vey long time, I am running queries using cli through golang os.Exec(cmd)
What queries are you running, and against what project database? How long does the example query take at present, and how long do you want it to take?
crc32>
<format>Web ARChive GZ</format>
<private>true</private>
<file name="GDELT-20240425032120-24509.w
@smowton I am running custom queries on a go project database.I have lot's and lot's of queries running one after the other. Some queries take a couple of seconds and some take a few minutes! I've noticed some queries also get stuck(at least that's what I think is happening) which even after a long time of running don't give any result(ie. query doesn't complete) Ideally I want the queries to be executed as fast as possible because I am running hundreds of queries one after the other using a script. The entire process takes hours sometimes. example:
/**
* @id go/get-location
* @kind problem
* @problem.severity recommendation
*/
import go
class Avoid extends string{
Avoid() {this=["nil","true","false"]}
}
class MyFunc extends Expr{
TypeDecl typeDecl;
MethodSpec methodSpec;
TypeSpec typeSpec;
MethodDecl methodDecl;
StarExpr starExpr;
File file;
ParameterOrResultDecl prDecl;
SelectorExpr selExpr;
FieldDecl fieldDecl;
StructTypeExpr structTypeExpr;
Ident ident;
Ident ident2;
VarDecl varDecl;
FunctionName funcName;
FuncTypeExpr funcTypeExpr;
ConstDecl constDecl;
ValueSpec valueSpec;
MethodDecl getMethodLogic(string functionName,string ifcDirPath){
methodDecl.getName()=functionName
and methodDecl.getFile().toString().replaceAll(methodDecl.getFile().getBaseName().toString(), "").substring(0, methodDecl.getFile().toString().replaceAll(methodDecl.getFile().getBaseName().toString(), "").length()-1)=ifcDirPath
and
result=methodDecl
}
Ident getConstDeclIdent(MethodDecl methodLogic){
ident.getAPrimaryQlClass()="ConstantName"
and not exists(Avoid avoid | ident.toString()=avoid )
and ident.getEnclosingFunction()=methodLogic
and(ident2.getParent()=valueSpec or ident2.getParent()=constDecl)
and
result=ident2
and ident.toString()=ident2.toString()
}
}
from
string functionName
,string ifcDirPath
,string ifcName
,MyFunc func
,MethodDecl methodLogic
,Ident identConstDecl
,string allLocs
,string rcvrType
,Ident identIfcName
,ResultVariableDecl rvd
,FuncDecl funcDecl
,StructLit strLit
,Ident wantIdentName
,IntLit intLit
where functionName="CreateOrder"
and ifcDirPath="/Users/my-username/go/src/github.com/Playground/order-management-service/app/service"
and ifcName="OrderService"
and methodLogic=func.getMethodLogic(functionName,ifcDirPath)
and identIfcName.toString()=ifcName
and identIfcName.getAPrimaryQlClass()="TypeName"
and identIfcName.getParent()=rvd
and identIfcName.getFile().toString().replaceAll(identIfcName.getFile().getBaseName().toString(), "").substring(0, identIfcName.getFile().toString().replaceAll(identIfcName.getFile().getBaseName().toString(), "").length()-1)=ifcDirPath
and funcDecl=identIfcName.getEnclosingFunction()
and strLit.getEnclosingFunction()=funcDecl
and wantIdentName.getParent()=strLit
and rcvrType= methodLogic.getReceiverBaseType().toString()
and rcvrType=wantIdentName.toString()
and identConstDecl=func.getConstDeclIdent(methodLogic)
and if identConstDecl.getParent()=intLit.getParent()
then identConstDecl.getParent()=intLit.getParent()
and allLocs=identConstDecl.getParent().getParent().getLocation().toString()+"#"+identConstDecl.getFile().getPackageName()
else allLocs= identConstDecl.getParent().getLocation().toString()+"#"+identConstDecl.getFile().getPackageName()
select allLocs,""
This query takes input of function name, its interface name and package name and returns all the Declaration of the constants that are used in the function. sample go code:
func (s *OrderServiceImpl) CreateOrder(ctx context.Context, req *proto.CreateOrderRequest) (*proto.CreateOrderResponse, error) {
orderId := 1234
riderAssignResp, err := s.RiderAssignService.AssignRiderToOrderId(ctx, &proto.RiderAssignReq{
OrderId: int64(orderId),
ItemId: req.ItemId,
RiderStatus: proto.RiderStatus_available,
})
if err != nil || riderAssignResp == nil {
return nil, errors.New("rider assign failure")
}
if req.Address == name {
print("req.address==name")
}
return &proto.CreateOrderResponse{
Success: true,
}, nil
}
What it should return is the location of the constants used in this function ie.
const (
RiderStatus_available RiderStatus = 0
RiderStatus_pick_up RiderStatus = 1
RiderStatus_drop RiderStatus = 2
)
and
const name = "name"
FYI in general it is much better to use one query command passing multiple .ql
files than to run them in sequence -- this is because that enables us to share common parts of the needed computation, like the go
QL library. However from what you are describing it sounds like some of your queries have specific problems -- I will take a look at that now.
OK, first point: class fields (e.g. TypeDecl typeDecl;
above) should be thought of like fields in a table.
For example, if I declare these classes:
class CatOrDog extends string { CatOrDog() { this = ["cat", "dog"] } }
class RedOrBlue extends string { RedOrBlue() { this = ["red", "blue"] } }
class TrueOrFalse extends string { TrueOrFalse() { this = ["true", "false"] } }
class HasFields extends TrueOrFalse {
CatOrDog catOrDog;
RedOrBlue redOrBlue;
}
Then the meaning of the HasFields
class is like building the table:
this | catOrDog | redOrBlue
---------------------------
true | cat | red
true | cat | blue
true | dog | red
true | dog | blue
false | cat | red
false | cat | blue
false | dog | red
false | dog | blue
This is simply all possible combinations of this
(which is of type HasFields
, which extends TrueOrFalse
) and the two explicit fields.
Normally we use a characteristic predicate -- which has syntax like a Java constructor -- to restrict which instances of the class (this
) can relate to which field values:
class HasFields extends TrueOrFalse {
CatOrDog catOrDog;
RedOrBlue redOrBlue;
HasFields() { if this = "true" then catOrDog = "cat" else catOrDog = "dog" }
}
That restricts the table as follows:
this | catOrDog | redOrBlue
---------------------------
true | cat | red
true | cat | blue
false | dog | red
false | dog | blue
Notice the redOrBlue
field is still not constrained, so we still get both values. It is quite common for all of a class' fields to be functionally determined by this
-- meaning for any given this
there is at most one field valuation possible:
class HasFields extends TrueOrFalse {
CatOrDog catOrDog;
RedOrBlue redOrBlue;
HasFields() { if this = "true" then (catOrDog = "cat" and redOrBlue = "red") else (catOrDog = "dog" and redOrBlue = "blue") }
}
Resulting in the table:
this | catOrDog | redOrBlue
---------------------------
true | cat | red
false | dog | blue
Then class member predicates often refer to the fields somehow, like
class HasFields extends TrueOrFalse {
...
string getPet() { result = "My state is " + this + " and I have the following pet: " + catOrDog }
}
In the case of your MyFunc
class, we have lots of fields and no characteristic predicate -- that means the class represents the cartesian product (every possible combination) of Expr
, the type of this
, with all those fields, which is a huge table. The class member predicates getMethodLogic
and getConstDeclIdent
also don't refer to this
, and the fields they do refer to only ever occur in one of the member predicates, so they don't actually need to be declared inside a class at all.
We can therefore simplify this quite a lot by moving the predicates outside a class, removing MyFunc
and all of its fields, and using the exists
statement to introduce local temporaries where we need them:
MethodDecl getMethodLogic(string functionName, string ifcDirPath) {
exists(MethodDecl methodDecl |
methodDecl.getName() = functionName and
methodDecl
.getFile()
.toString()
.replaceAll(methodDecl.getFile().getBaseName().toString(), "")
.substring(0,
methodDecl
.getFile()
.toString()
.replaceAll(methodDecl.getFile().getBaseName().toString(), "")
.length() - 1) = ifcDirPath and
result = methodDecl
)
}
Ident getConstDeclIdent(MethodDecl methodLogic) {
exists(Ident ident, Ident ident2, ConstDecl constDecl, ValueSpec valueSpec |
ident.getAPrimaryQlClass() = "ConstantName" and
not exists(Avoid avoid | ident.toString() = avoid) and
ident.getEnclosingFunction() = methodLogic and
(ident2.getParent() = valueSpec or ident2.getParent() = constDecl) and
result = ident2 and
ident.toString() = ident2.toString()
)
}
Notice the field methodDecl
has become a local temporary in getMethodLogic
and the fields ident
, ident2
, constDecl
and valueSpec
have become temporaries in getConstDeclIdent
.
Actually the result of getMethodLogic
is also a MethodDecl
and is exactly the same as methodDecl
, so we can just use result
directly:
MethodDecl getMethodLogic(string functionName, string ifcDirPath) {
result.getName() = functionName and
result
.getFile()
.toString()
.replaceAll(methodDecl.getFile().getBaseName().toString(), "")
.substring(0,
result
.getFile()
.toString()
.replaceAll(methodDecl.getFile().getBaseName().toString(), "")
.length() - 1) = ifcDirPath
}
And finally, the replaceAll
logic is just trying to get the directory name where the method is declared, so we can use getParentContainer
to do this better:
MethodDecl getMethodLogic(string functionName, string ifcDirPath) {
result.getName() = functionName and result.getFile().getParentContainer().toString() = ifcDirPath
}
Now let's look at getConstDeclIdent
. We can make a few improvements:
getAPrimaryQlClass
returning ConstantName
, we can see the only relevant class is the class named ConstantName
. getAPrimaryQlClass
is a debugging feature, so it's better to use the class directly: ident instanceof ConstantName
.not exists(Avoid avoid | ident.toString() = avoid)
is equivalent to just not ident.toString() instanceof Avoid
. Since this is the only use of Avoid
we can inline it, and we should prefer Ident.getName
to Ident.toString
(they have the same value): not ident.getName() = ["true", "false", "nil"]
ident2
is exactly result
, so we can just remove ident2
and use result
directly.constDecl
and valueSpec
are only used once, so we should use instanceof
instead of a field or variable -- in particular it's a problem when a variable is only used in one half of an or
statement; in that case it should moved to a closer-scoped exists
for just their side of the or
, or as in this case eliminated entirely.So here's the improved version:
Ident getConstDeclIdent(MethodDecl methodLogic) {
exists(Ident ident |
ident instanceof ConstantName and
not ident.getName() = ["true", "false", "nil"] and
ident.getEnclosingFunction() = methodLogic and
(result.getParent() instanceof ValueSpec or result.getParent() instanceof ConstDecl) and
ident.getName() = result.getName()
)
}
Finally let's look at the top-level query:
functionName
is only used once, so we can remove it and remove the functionName
parameter from getMethodLogic
.getAPrimaryQlClass() = "TypeName"
can be replaced with an instanceof
check.rvd
is only used once -- again, use instanceof
.ifcName
is unused; remove it.replaceAll
logic again can use getParentContainer
.identConstDecl.getParent() = intLit.getParent()
inside the if
is unnecessaryintLit
is only used once, so again let's use instanceof
.if
statement using some extra from
variables.Here's the complete query after just the small, local cleanups:
/**
* @id go/get-location
* @kind problem
* @problem.severity recommendation
*/
import go
MethodDecl getMethodLogic(string ifcDirPath) {
result.getFile().getParentContainer().toString() = ifcDirPath
}
Ident getConstDeclIdent(MethodDecl methodLogic) {
exists(Ident ident |
ident instanceof ConstantName and
not ident.getName() = ["true", "false", "nil"] and
ident.getEnclosingFunction() = methodLogic and
(result.getParent() instanceof ValueSpec or result.getParent() instanceof ConstDecl) and
ident.getName() = result.getName()
)
}
from
string ifcDirPath, MethodDecl methodLogic, Ident identConstDecl, string allLocs, string rcvrType,
Ident identIfcName, FuncDecl funcDecl, StructLit strLit, Ident wantIdentName,
AstNode identConstDeclParent, AstNode getLocFromNode
where
methodLogic = getMethodLogic(ifcDirPath) and
identIfcName instanceof TypeName and
identIfcName.getParent() instanceof ResultVariableDecl and
identIfcName.getFile().getParentContainer().toString() = ifcDirPath and
funcDecl = identIfcName.getEnclosingFunction() and
strLit.getEnclosingFunction() = funcDecl and
wantIdentName.getParent() = strLit and
rcvrType = methodLogic.getReceiverBaseType().toString() and
rcvrType = wantIdentName.toString() and
identConstDecl = getConstDeclIdent(methodLogic) and
identConstDeclParent = identConstDecl.getParent() and
(
if identConstDeclParent.getAChild() instanceof IntLit
then getLocFromNode = identConstDeclParent.getParent()
else getLocFromNode = identConstDeclParent
) and
allLocs =
getLocFromNode.getLocation().toString() + "#" + identConstDecl.getFile().getPackageName()
select allLocs, "Alert message"
For me this actually completes quite quickly on a large Go database -- by just simplifying away logic, especially the regex-replacement logic, we have helped the CodeQL optimiser to form a better query plan, that is the order that different constraints are evaluated, in order to arrive at an answer in an acceptable time.
However I'm pretty sure this query could be expressed better -- but at the moment I'm struggling to understand what the query intends to do. Could you give a short example of what this query is supposed to understand? Why do we care about things being in the same directory -- do we actually just mean in the same package? What about the struct literal, what is the significance of that?
This issue is stale because it has been open 14 days with no activity. Comment or remove the Stale
label in order to avoid having this issue closed in 7 days.
This issue was closed because it has been inactive for 7 days.
Description of the issue Why does query execution take such a long time? It takes a few seconds on average for some of the complex queries What are the ways I can make my queries run faster?