Closed johannesduesing closed 1 week ago
Will look into it. Does it actually work if you load the JDK implementations?
def main(args: Array[String]): Unit = {
val libFiles = Java17Framework.AllClassFiles(Seq(RTJar))
val projectFiles = Java17Framework.AllClassFiles(Seq(new java.io.File("Demo.class")))
val project = Project(projectFiles, libFiles, libraryClassFilesAreInterfacesOnly = false)
val cg = project.get(CHACallGraphKey)
println(cg.reachableMethods().exists(_.method.name == "println"))
}
This will work and produce result true
- however, it also takes way longer. The following will not work:
val project = Project(new java.io.File("Demo.class"), RTJar)
val cg = project.get(CHACallGraphKey)
println(cg.reachableMethods().exists(_.method.name == "println"))
Note that we do model (and I think this should work without the JDK implementations) calls to Thread.start
, which is what you actually invoke when you want to spawn the thread. I don't think you should ever call Thread.run
, the API documentation even says so: https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Thread.html#run()
I don't even know why Thread
implements Runnable
in the first place.
Ah sorry that's my bad, i made a mistake building the illustrative example. In the original code start
is used as intended. I just changed the code under analysis to use start
instead of run
- the same behavior persists. It works only if i fully load the JRE with implementations.
Are you sure it works with the JDK loaded and you don't just get true
because some println method is reachable from somewhere else? From what I saw in trying to reproduce the issue, it seems this is never resolved in the ThreadRelatedCallsAnalysis
because (for reasons that I don't know) the value of the receiver is not precisely known.
You are right, it in fact does not work as expected even when loading the JDK. It does mark the right println
method as reachable, but for the wrong reasons - the list of callers does not contain the expected lambda method, but only internal JDK methods from java.security
or sun.security
. So regardless of the implementations being available or not, this seems to be an actual bug in the Thread Handling for call graphs.
Not sure. I think the ThreadRelatedCallsAnalysis
is implemented sensibly. Question would be why the receiver type here is deemed not precise by the abstract interpretation. And that is because the domain that is picked by default (l0.PrimitiveTACAIDomain
) doesn't manage to get this right. Even l1.DefaultDomainWithCFGAndDefUse
still doesn't work, but with l2.DefaultPerformInvocationsDomainWithCFGAndDefUse
it does:
import org.opalj.ai.Domain
import org.opalj.ai.domain.RecordDefUse
import org.opalj.ai.fpcf.properties.AIDomainFactoryKey
import org.opalj.br.analyses.Project
import org.opalj.tac.cg.CHACallGraphKey
object SimpleCg {
def main(args: Array[String]): Unit = {
val project = Project(new java.io.File("path/to/Demo.class"))
val domain = Class.forName("org.opalj.ai.domain.l2.DefaultPerformInvocationsDomainWithCFGAndDefUse").asInstanceOf[Class[Domain with RecordDefUse]]
project.updateProjectInformationKeyInitializationData(AIDomainFactoryKey) {
case None => Set(domain)
case Some(requirements) => requirements + domain
}
val cg = project.get(CHACallGraphKey)
println(cg.reachableMethods().exists(_.method.name == "println"))
}
}
prints true
for me. Could we improve at least the l1 domain to be better at this? Maybe. But the problem doesn't seem to be the call graph analysis. Or do you see an improvement for the ThreadRelatedCallsAnalysis
that could make it work even in the case of a non-precise receiver value according to the underlying abstract interpretation?
Ah I see, thanks for the explanation! That actually fixes the issue in both my example and the original project i was having trouble analyzing 🥳
I took some time to debug the ThreadRelatedCallsAnalysis
and related implementations, and i really don't know how we would improve that, so i think it is fine as it stands.
To me, it feels a little off that OPAL in its default configuration produces an unsound CG even when using CHA. In my conceptual understanding of CHA - not taking into account any implementation decisions made for OPAL - it would be possible to find this calling relation given the added information that Thread.start
will invoke Runnable.run
, since the Lambda is represented as a synthetic class implements Runnable
. However, i do not see how we would change to current implementation to represent that, as it does rely on AI information on the actual parameter value being passed to the constructor of Thread
, regardless of which CG algorithm has been chosen.
Since there is no actual bug here and i can proceed with just selecting the L2 domain for my own analysis, i'd be fine with closing this issue - thanks for your help @errt !
You are right that with CHA, we would hope for more sound call graphs. It could be done here, e.g., just adding a call to any Runnable.run
method (which is kind of what CHA should do, right?), but the current implementation of the ThreadRelatedCallsAnalysis
doesn't do that in favor of higher precision.
Actually, I think this might be because the analysis predates Unimocg's type iterator interface and should probably be adapted to use that so that the precision/soundness always stays in line with the chosen CG algorithm. I'll keep the issue open until we looked into that.
I stumbled upon what i thought was a bug when building callgraphs for a Java project of mine. Now that i looked into it, i'm not so sure if it is intended behavior or not. It boils down to the following example:
Code to Analyze (
Demo.java
)Analysis Implementation
This analysis will print
false
, i.e. the Call Graph will not find any calls toprintln
. The reason (to me) seems to be, that OPAL without the JRE implementation cannot relateThread.run
toRunnable.run
. Note that this is not a Lambda-Issue, the same thing happens with actual implementations ofRunnable
. Is this intended behavior? And would the only workaround be to load the entire JRE with implementations as library files?