opalj / opal

https://www.opal-project.de
Other
51 stars 27 forks source link

Flushing Type Cache leads to Exceptions in AI #228

Open johannesduesing opened 2 weeks ago

johannesduesing commented 2 weeks ago

I've encountered an issue with the "Flush Type Cache" functionality introduced in #73 - this is closely related to what we found when we initially added that feature (see #75). I'm aware that we likely cannot do much about this, but i wanted to document the issue either way.

Problem I implemented a large-scale analysis that analyzes many projects in a row and flushes caches in-between to avoid filling up the heap space. The analysis runs project.get(ComputeTACAIKey) for every project. Occasionally this fails with the following exception

org.opalj.ai.InterpretationFailedException$$anon$1: InterpretationFailedException(
        domain=org.opalj.ai.domain.l1.DefaultDomainWithCFGAndDefUse@5720c48d with TheMethod(org.apache.commons.logging.LogFactory{ private static boolean implementsLogFactory(java.lang.Class) }),
        ai=class org.opalj.ai.BaseAI$,
        cause=org.opalj.ai.DomainException: unsupported Class { java.lang.Class forName(java.lang.String,boolean,java.lang.ClassLoader) },
        pc=33,
        operands=List(_ <: java.lang.ClassLoader[↦7;refId=104], int = 0, String("org.apache.commons.logging.LogFactory")[@29;refId=124]),
        registers=(0,_ <: java.lang.Class[↦-1;refId=102]),(1,int = 0),(2,_ <: java.lang.ClassLoader[↦7;refId=104]),(3,null)
)

The reason is that the method under analysis calls Class.forName(String, boolean, ClassLoader), and the following match statement in org.opalj.ai.domain.l1.ClassValues fails although it does have a specific case for this particular method descriptor.

methodDescriptor match {
    case `forName_String`                           => simpleClassForNameCall(pc, value)
    case `forName_String_boolean_ClassLoader`       => simpleClassForNameCall(pc, value)
    case `forName_String_Class`                     => simpleClassForNameCall(pc, value)
    case `forName_String_boolean_ClassLoader_Class` => simpleClassForNameCall(pc, value)
    case _ =>
        throw new DomainException(
            s"unsupported Class { ${methodDescriptor.toJava("forName")} }"
        )
}

This is turn is caused by the same issue we had with the test executions in #73 - ObjectType("java/lang/ClassLoader") is created once (see code below) and always used for comparisons in the above match-statement. However, after flushing the type cache, the next project under analysis will create a new ObjectType instance for the same FQN, yielding a different ID. Since comparison is ID-based, OPAL will conclude that the two instances represent different types and not match the appropriate case.

private object ClassValues  {
[..]
  final val forName_String_boolean_ClassLoader = {
      MethodDescriptor(
          ArraySeq(ObjectType.String, BooleanType, ObjectType("java/lang/ClassLoader")),
          ObjectType.Class
      )
  }
[..]
}

Fixing this Issue I'm not sure we want or can do something about this. The ultimate reason is that ObjectType("java/lang/Classloader") is not one of the predefined OTs (like ObjectType.String for example) and will not be kept when flushing the cache. The only way of dealing with this would be to collect all OTs that OPAL uses internally and that are not predefined (i.e., created via ObjectType("<FQN>")), and make them a predefined constant. A quick search yields over 100 occurrances of that constructor being used in OPAL, making this a major change that is likely not worth it. For me, re-running the failed projects one at a time solves the issue (since there is no cache-flushing involved), however this does not really scale and defeats the purpose of flushing the caches.

errt commented 2 weeks ago

I don't think the potential solution is as bad as you think it is. Yes, there are many uses of that constructor, but it seems that a) many are parts of tests, demos, or Hermes, thus not relevant for the case of flushing caches, and b) many use object types that also exist as predefined constants (which is fine, because they should just be resolved to the same predefined object, but of course on could exchange the (more expensive) method call with the constant). java.lang.ClassLoader definitely seems like a type we could use as a predefined constant anyway.

johannesduesing commented 2 weeks ago

Alright, i've your not completely against that i'll have a more detailled look into those occurrances and post a PR 👍