Open moshec2 opened 2 years ago
How do you define "forever"? 10 minutes? 1 hour? 10 hours? 1 day?
I've been running codeql queries for over 24 hours on huge databases (Mozilla Firefox).
It also looks like there is progress, you first have "iteration 315" and later "iteration 318".
(Of course it might still be that there is a bug, but I'm not familiar with the ModulusAnalysis
class)
The iteration number keeps increasing forever. I executed one (relatively simple) query and it did not finish for more than 3 days. Does it make sense?
Greetings, thanks for reaching out to us with this question. It certainly is possible for queries to take a very long time on large databases - perhaps even days if the database is sufficiently large. Seeing predicates take hundreds of iterations is also not entirely uncommon, especially in some of the more complex parts of the data flow library.
Some common ways of speeding up our evaluation times include increasing the --ram
option passed to CodeQL (more is always better, though definitely leave at least a few GB of your RAM for your operating system and other programs that may be running) and the --threads
that CodeQL will use (here, you'll only see improvements up to around 8 threads and even then only on certain queries/DBs since not all evaluations can be parallelised).
It should not be possible for any CodeQL query to take forever - the CodeQL engine should refuse to evaluate things that won't provably terminate. However, it's possible (although unlikely) that there may be a bug and your query will genuinely never terminate. If the above suggestions don't help your query get a result, I'd be happy to look into this further for you if you could provide us with the database you are using and what query you are trying to evaluate.
This issue is stale because it has been open 14 days with no activity. Comment or remove the Stale
label in order to avoid having this issue closed in 7 days.
This issue was closed because it has been inactive for 7 days.
Hi,
I have noticed that even if I execute the following simple query:
import java
import semmle.code.java.dataflow.DataFlow
from DataFlow::Node n
select n
I get the same behavior. (I did let it run for more than 3 weeks and it did not return)
I am pretty sure it has to be some bug in the DataFlow standard library.
The log snippet at the top does not indicate looping (the varying row counts is an indication that progress is being made). However, certain types of bugs in the QL libraries related to ModulusAnalysis
could conceivably result in an attempt to enumerate all of int
(which would certainly look like an infinite loop). I'd be curious to have a look at the log output in more detail (in particular the parts of the log that presumably repeat, which would be after the above quoted snippet), and to see whether I can reproduce locally. Is the project open source, such that I could try to reproduce locally? If it something inside ModulusAnalysis
that appears to never finish, then the most likely cause is some particularly unfortunate code pattern in a single method that somehow makes the library calculate a huge amount of expression-equals-value-modulo-constant facts. Determining what that code pattern is would be useful such that we potentially could fix the QL to avoid degeneracy.
Unfortunately, this is a closed source project, so I cannot share the source code / database.
I tinkered a bit with the codeql standard library, and I have noticed some interesting behaviors which may help to identify the root cause:
In this qll file: https://github.com/github/codeql/blob/main/java/ql/lib/semmle/code/java/dataflow/internal/DataFlowNodes.qll When I comment out lines 16 and 466 - 472:
// DataFlowImplCommon::forceCachingInSameStage() and
...
/*
private class SummaryPostUpdateNode extends SummaryNode, PostUpdateNode {
private Node pre;
SummaryPostUpdateNode() { FlowSummaryImpl::Private::summaryPostUpdateNode(this, pre) }
override Node getPreUpdateNode() { result = pre }
}
*/
Suddenly the above query (from DataFlow::Node n select n
) returns results!
After that change, When I query for RemoteFlowSource
(which inherits DataFlow::Node
), it gets stuck again. And again, when I comment out some of the classes which extend RemoteFlowSource in https://github.com/github/codeql/blob/main/java/ql/lib/semmle/code/java/dataflow/FlowSources.qll , it suddenly returns results.
Another interesting fact I have noticed is: Without commenting out those parts (when the query does get stuck), it always has 23+ execution stages, and it always gets stuck 1 stage before the last one (22/23, 23/24, ...). When there are less stages (usually 13 after I comment out those lines), it always works.
Is there something else I can test to help you identify the issue? Can I somehow debug what happens (e.g. to make codeql print which code segments are currently getting processed)?
Thanks for your help!
Those facts are unfortunately somewhat arbitrary and don't reveal much - it's only saying that when you don't evaluate the part that's broken then it doesn't appear broken. The log snippet you originally posted is more useful, except the snippet above was only after 115s of eval. We'd need to look at the end of the log from the point where it really starts to "loop". The log can be augmented with additional debug info through the option "Code QL > Running Queries: Debug" in VSCode.
Hi @aschackmull
I have found the root cause of the issue. You can reproduce the issue by building a database containing a single class with the following code:
class Class {
void func() {
while(true) {
int i = i - 1;
}
}
}
A class in my project contains this code pattern.
This class does not actually compile (variable value might not have been initialized
). Despite that, the project successfully builds because this is a legacy class that is not actually in use anymore.
Although the class has failed to compile, CodeQL still indexed it (which I personally think is a good thing), and this specific code pattern caused some loop / recursion (probably in the SSA module, but I am not sure) which just keeps running forever. That's why I was not able to execute any queries which (directly or indirectly) uses the SSA module.
I am not sure if it's considered a bug or not (because the code does not actually compile), but I do think it should be fixed.
Thanks for the minimal test case -- candidate fix: https://github.com/github/codeql/pull/8155
@smowton Wouldn't ignoring those files cause queries to miss out some results sometimes?
The files have compile errors, so I don't think it would be important
This issue is stale because it has been open 14 days with no activity. Comment or remove the Stale
label in order to avoid having this issue closed in 7 days.
This issue is stale because it has been open 14 days with no activity. Comment or remove the Stale
label in order to avoid having this issue closed in 7 days.
I have built a CodeQL database for my Java project. I am able to execute simple queries on it, but when I attempt to execute a DataFlow query, The execution gets stuck in an infinite loop (probably in the
ModulusAnalysis::phiModulusRankStep
method) and never finishes. The log for such a query looks like this:It continues like this forever.