scala / bug

Scala 2 bug reports only. Please, no questions — proper bug reports only.
https://scala-lang.org
230 stars 21 forks source link

mark in the bytecode forwarders to Java interface default methods #12752

Closed unkarjedy closed 1 year ago

unkarjedy commented 1 year ago

Scala compiler handles Java default methods differently from Java. Scala generates delegating methods that just invoke super.foo() while Java does not.

AFAIU this is done to support multiple inheritence in Scala. Though I don't know whether it's indeed required for java 8 interfaces with default methods, or it is a rudiment from old times.

Example:

public interface MyJavaInterface {
    String foo1();
    default String foo2() { return "2"; }
    default String foo3() { return "3"; }
}
class MyScalaClass extends MyJavaInterface {
  override def foo1(): String = "11"
  override def foo2(): String = super.foo2()
}

Notice that foo3 is not overridden in MyScalaClass. If you look into bytecode you will see that foo3 method was added by the compiler:

public class MyScalaClass implements MyJavaInterface {

  @Lscala/reflect/ScalaSignature;(bytes="\u0006\u0005!2A\u0001B\u0003\u0001\u0011!)Q\u0003\u0001C\u0001-!)\u0001\u0004\u0001C!3!)q\u0005\u0001C!3\u0009aQ*_*dC2\u000c7\u0009\\1tg*\u0009a!A\u0004=K6\u0004H/\u001f \u0004\u0001M\u0019\u0001!C\u0009\u0011\u0005)yQ\"A\u0006\u000b\u00051i\u0011\u0001\u00027b]\u001eT\u0011AD\u0001\u0005U\u00064\u0018-\u0003\u0002\u0011\u0017\u00091qJ\u00196fGR\u0004\"AE\n\u000e\u0003\u0015I!\u0001F\u0003\u0003\u001f5K(*\u0019<b\u0013:$XM\u001d4bG\u0016\u000ca\u0001P5oSRtD#A\u000c\u0011\u0005I\u0001\u0011\u0001\u00024p_F\"\u0012A\u0007\u0009\u00037\u0011r!\u0001\u0008\u0012\u0011\u0005u\u0001S\"\u0001\u0010\u000b\u0005}9\u0011A\u0002\u001fs_>$hHC\u0001\"\u0003\u0015\u00198-\u00197b\u0013\u0009\u0019\u0003%\u0001\u0004Qe\u0016$WMZ\u0005\u0003K\u0019\u0012aa\u0015;sS:<'BA\u0012!\u0003\u00111wn\u001c\u001a")

  ATTRIBUTE ScalaSig : unknown

  ATTRIBUTE ScalaInlineInfo : unknown

  // access flags 0x1
  public foo3()Ljava/lang/String;
   ...

  // access flags 0x1
  public foo1()Ljava/lang/String;
   ...

  // access flags 0x1
  public foo2()Ljava/lang/String;
   ...
}

Compare with java:

public class MyJavaClass implements MyJavaInterface{
    @Override public String foo1() { return null; }
    @Override public String foo2() { return MyJavaInterface.super.foo2(); }
}
// class version 55.0 (55)
// access flags 0x21
public class MyJavaClass implements MyJavaInterface {

  // compiled from: MyJavaClass.java

  // access flags 0x1
  public <init>()V
   ...

  // access flags 0x1
  public foo1()Ljava/lang/String;
   ...

  // access flags 0x1
  public foo2()Ljava/lang/String;
   ...
}

Feature

It would be great if there was a way to determine from the bytecode that method foo3 was automatically generated by the Scala compiler. Some meta information in any form would be helpful. It would be enough if this meta-data was added only with an extra compile flag, for a start.

Experiments

I checked whether foo3 contain flag ACC_SYNTHETIC in the bytecode and it does not. foo2 and foo3 have the same set of flags

Background

We use Scala 2.13 to develop Scala Plugin for IntelliJ. In the IntelliJ platform, there are some automated tests that verify proper Platform API usage. For example, they check that some platform methods marked as @ApiStatus.NonExtendable are not overridden by plugins. Or, methods that are marked @ApiStatus.OverrideOnly are not called by plugins directly. Both of these checks fail with some Scala Plugin code because delegating methods do both: they override super method and call super method. If there were some markers that allowed us to detect auto-generated methods we could patch platform tests to ignore them.

Example 1

Notice that SbtCompilationWatcher$.anon$1.beforeModuleRemoved(...) is reported because under the hood it invokes beforeModuleRemoved method. Even though in the source code we do not do this. image image

Example 2

Notice that we don't override com.intellij.openapi.actionSystem.DataContext#getData(com.intellij.openapi.actionSystem.DataKey<T>) explicitely, but still it's reported image image

lrytz commented 1 year ago

There's a long history to why the forwarders exist, see https://github.com/scala/scala/pull/5429 for example.

Changing the flags and adding ACC_SYNTHETIC would not affect JVM semantics, but it's not without risks, it could break other tools such as javac.

Would it be an option to identify the forwarders based on their bytecode instructions? The body is ALOAD 0; INVOKESPECIAL ...; RETURN if the parent is a Java interface and ALOAD 0; INVOKESTATIC ..$ ...; RETURN if the parent is a Scala trait.

unkarjedy commented 1 year ago

Would it be an option to identify the forwarders based on their bytecode instructions? The body is ALOAD 0; INVOKESPECIAL ...; RETURN if the parent is a Java interface and ALOAD 0; INVOKESTATIC ..$ ...; RETURN if the parent is a Scala trait.

That would be workaround-ish but could work. I will ask how difficult it would be to patch our byte-code-based tests to detect this pattern.

The body is ALOAD 0; INVOKESPECIAL ...; RETURN if the parent is a Java interface and ALOAD 0; INVOKESTATIC ..$ ...; RETURN if the parent is a Scala trait.

To be more precise, not just return but one of the following: return areturn dreturn freturn ireturn lreturn

unkarjedy commented 1 year ago

An alternative solution from @sjrd is to consider using tasty-query. It can read Scala 2 Pickle information which presumably contains the information we are interested in.

Usage example: https://github.com/scalacenter/scala-debug-adapter/blob/main/modules/scala-3-step-[…]la/debugadapter/internal/stepfilter/ScalaStepFilterBridge.scala

Some preliminary test:

import tastyquery.Contexts.Context
import tastyquery.jdk.ClasspathLoaders
import tastyquery.{Contexts, Symbols}

import java.nio.file.Path

@main
def main(): Unit = {
  val classpathList: Seq[Path] = Seq(
    Path.of("...\\untitled2\\target\\scala-2.13\\classes")
  )
  val classpath = ClasspathLoaders.read(classpathList.toList)

  given contexts: Context = Contexts.init(classpath)

  val myScalaClassSymbol: Symbols.ClassSymbol = contexts.findTopLevelClass("MyScalaClass")
  val declarations = myScalaClassSymbol.declarations
  declarations.foreach(println)
}

outputs:

symbol[MyScalaClass.fooWithoutDefaultString]
symbol[MyScalaClass.fooWithoutDefaultInt]
symbol[MyScalaClass.<init>]

Which doesn't contain any methods in format fooDefault*

lrytz commented 1 year ago

I don't think relying on the pickle information will work, because that information is collected soon after type checking, but the forwarders are generated only very late in the mixin phase...

sjrd commented 1 year ago

The trick is to consider that if you can't find a method in the pickle, it's compiler-generated ;)

unkarjedy commented 1 year ago

but the forwarders are generated only very late in the mixin phase...

Meaning that they will not be available in declarations? (as in the example above) Isn't this information enough to check whether some method is defined in the class or not?

lrytz commented 1 year ago

OK, that can work :-) Note that there are other such compiler-generated methods, e.g. bridges.

It might be difficult to associate a classfile names with the corresponding classes in the pickle (nested classes, anonymous classes).

unkarjedy commented 1 year ago

Thanks for both proposals. Though the solution with analyzing method body feels more workaround it might be simpler to implement and more practical. We will discuss both solutions with the plugin verifier developers.

lrytz commented 1 year ago

I'm closing this ticket as it's not a bug, but you can still comment if there are follow-ups, or we can move the discussion elsewhere.

adpi2 commented 1 year ago

It might be difficult to associate a classfile names with the corresponding classes in the pickle (nested classes, anonymous classes).

This is implemented here. It supports nested classes but not anonymous or local ones and that I agree would be hard to solve (not impossible though).

Note that there are other such compiler-generated methods, e.g. bridges.

Here is the full list of compiler-generated or synthetic methods that are skipped by the scala-debug-adapter:

Either we decide to skip them here or in the TASTy-based Scala3StepFilter. The tests are in StepFIlterTests. We also have a Scala2StepFilter that is based on the IntelliJ Scala 2 unpickler, but I like the TASTy version better, it is simpler, more robust and supports Scala 2.13.

All of this to say that maybe we should extract parts of Scala3StepFilter into a library that can find a Scala symbol from its Java bytecode signature. This is something that we can discuss during the Tooling Summit. @unkarjedy

This could also have more use cases such as pretty printing of stack traces.

lrytz commented 1 year ago

While we're at it, another source of mismatches between classfiles and the symbol table is methods that are renamed to be made public

class C { def f = C.g }
object C { private def g = 1 }

Class C$ has method C$$g.