Open judovana opened 2 years ago
If it indeed was an inner class, I'd say A.B.java
is a more accurate representation. A$B
is actually different from A.B
in class file attributes, that the relation for A.B
is recorded in InnerClasses
attribute while that for A$B
is not.
hmm. Then it is already wrongly substituting $ by dot. In the binaries, I see only $ signs in inner classes. No dots nowhere. If dot would be correct, then name clash of pkg1.pkg2.clazz2 x pkg1.clazz1. innerclazz1 if pkg2 name would be same as clazz2 name and clazz2 name would be same as innerclazz1 (where actualy x$y is perfectly valid class name and you can compile inner classes as standalone chunks if necessary, but ypu must not replace $ by dot.)
-- Mgr. Jiri Vanek @.***
---------- Původní e-mail ---------- Od: liach @.> Komu: leibnitz27/cfr @.> Datum: 11. 4. 2022 2:05:30 Předmět: Re: [leibnitz27/cfr] inner classes should be separated by "$" not by "." dot when decompiled alone (#263) "
If it indeed was an inner class, I'd say A.B.java is a more accurate representation. A$B is actually different from A.B in class file attributes, that the relation for A.B is recorded in InnerClasses attribute while that for A$B is not.
— Reply to this email directly, view it on GitHub (https://github.com/leibnitz27/cfr/issues/263#issuecomment-1094417388), or unsubscribe (https://github.com/notifications/unsubscribe-auth/AAWFCS3K3ETWZZRZT4K3OB3VENUEHANCNFSM5BZ6OAVA) . You are receiving this because you authored the thread. Message ID: < @.***> "
If CFR decompiles the inner class only, it creates it as
A.B.java
. Which is wrong, and can not be compiled. If CFR goes correctly, and stores them asA$B.java
, then this inner class can be in many cases compiled, and directly used.
While it may be true that the decompiled inner classes are more likely to be compilable following the A$B
convention doing so breaks the inner-outer relationship of B
with A
. Recompiling the class alone would not emit NestHost
or an InnerClasses
attribute, which would break access in some cases.
If you want to recompile decompiled code (already a messy task) you do not want to do it this way.
If it indeed was an inner class, I'd say
A.B.java
is a more accurate representation.
If you're decompiling an application with inner classes, you should not generate any individual files for inner classes at all. They should be output as part of the outer class. In these cases A.B
is the appropriate source level construct you want to use so that your decompilation is as accurate as possible.
If you are explicitly decompiling the inner class by itself without any outer class context it should be reasonable to assume that source-level constructs are not the target for what is deemed "accurate". Without any context you also cannot de-sugar inner class patterns used to reference the outer class (and references used by the outer to access private members in the inner class). Since there isn't sufficient data to fully reconstruct the source-level constructs across the entire inner class its best to keep things uniform in representation. In this case, including the A$B
naming convention along.
A$B
is actually different fromA.B
in class file attributes, that the relation forA.B
is recorded inInnerClasses
attribute while that forA$B
is not.
What are you talking about? (This sounds ambiguous / needs elaboration)
Source:
class A {
public static void main(String[] args) {
System.out.println("Hello " + new B().get());
}
private static class B {
private String get() { return "text"; }
}
}
Binary:
Plop a class in a hex editor and look at the values its pointing to. There are 4 pieces to an entry in the InnerClass
attribute.
title | description |
---|---|
Inner class info | A CP_CLASS reference inner class (self class). |
Outer class info | A CP_CLASS reference to the outer class. |
Inner name | A CP_UTF8 reference to the name of the class (Akin to Class#getSimpleName ), or 0 if the inner class is anonymous. |
Inner access | Flags used by the compiler for access information, assuming source code is not available |
There is no .
value anywhere in any of the references. The .
character is used at the source level, not the bytecode level.
Well written Matt. TY!
My point is that both A$B.java
and nested/inner class B
in A.java
compile to A$B.class
. The feature that distinguish them is that in the nested/inner class version, there is in InnerClasses
an entry describing A$B
as a nested/inner class of A
, while such entry does not exist for the A$B.java
version. Per JVMS
In addition, the
constant_pool
table of every nested class and nested interface must refer to its enclosing class, so altogether, every nested class and nested interface will haveInnerClasses
information for each enclosing class and for each of its own nested classes and interfaces.
Imo if we do decompile it to A$B.java
, we need a comment to clarify that this is intended to be an inner class but the outer class could not be found.
That sounds like a correct thing to do.
My point is that both
A$B.java
and nested/inner classB
inA.java
compile toA$B.class
. The feature that distinguishes them isInnerClasses
which describes the relationship of innerB
to outerA
inside the classA$B
. No such entry exists for theA$B.java
version
I fail to see how this is relevant. I understand the behavior of the InnerClasses
attribute but we're talking about decompiling just a single inner class without the context of the outer class. At this point it is already impossible to recover that relationship due to the fact the outer class is missing. This is a failure of the "decompile recompile" approach. You cannot in this state ever regenerate a proper InnerClasses
attribute.
The argument for this naming convention comes down to "what is the best for the situation if we are missing context?". It is my opinion, which anyone is free to disagree with (more perspectives leads to better discussion and thus a better solution), that in this specific situation A$B
is the better name. It matches the true name of the class. Again, should you attempt to decompile the contents of a Java application, you should not ever have any separate files for inner classes.
Imo if we do decompile it to
A$B.java
, we need a comment to clarify that this is intended to be an inner class but the outer class could not be found.
:100:
The warning should be in any decompiled inner class regardless of the file name if there is missing context.
Yes. Your solution of using A$B.java
as file name and add comment warning should be how such standalone inner class files be handled.
for the case of decompiling an isolated inner/nested class, i think there's been a CFR switch from immemorial times that disables nested class resugaring. so CFR has always been able to do what you want.
the feature of reverting to this mode for a single class that is missing its parents (at least one of them i suppose?) might not be desired or carry is own weight. when a class is missing some but not all of its parents, some resugaring might still be working and be useful. in this context, partial disablement of resugaring to decompile to something like A.B$C
starts to look unnecessarily complex and not really useful.
but the real complexity increase might come into play -IDK, it's an hypotheses- when decompiling other classes. a decision whether to resugar the binary name A$B
will have to be deferred until A and B and their relationship has been analyzed. if that information dependency already exists for other reasons, then yes, this feature is simple to implement. otherwise the inclusion of this dependency can go as far as requiring a new pass (very unlikely), but the point is that the cost of the feature might outweigh its questionable value. and there are no free rides or infinite resources: in general, for each feature implemented others will be left out.
When CFR decompiles inner classes, it stores them with dot as delimiter.
Are compiled as
A.class and A$B.class
and as those they are also seen by JVM. If CFR decompiles the inner class only, it creates it as A.B.java. Which is wrong, and can not be compiled. If CFR goes correctly, and stores them as A$B.java, then this inner class can be in many cases compiled, and directly used.Aligned with naming, the used name should be kept. So instead of current
class A.B {...
theclass A$B {...
should be used