Closed bigfatbrowncat closed 10 years ago
isAnonymousClass: should be pretty easy. Just see if the name ends with a "$" followed by one or more digits.
isLocalClass: as above, except the name will end in "$" followed by one or more digits, followed by a Java identifier
getGenericInterfaces: to get this, you'll need to check if Class.vmClass.addendum.signature is non-null. If it is, it will be a byte array containing a null-terminated string -- you can make it a String with e.g. new String(signature, 0, signature.length - 1). You can parse that to generate a Type array. It will take some work, but the raw signature is already there.
followed by a Java identifier
Is there any way to check a valid identifier? Or I should implement it by myself? I think actually, that I shouldn't check for a valid identifier. I should just search for any non-digit after some digits after $ sign. Am I right?
You can use Character.isJavaIdentifierStart and Character.isJavaIdentifierPart: http://docs.oracle.com/cd/E19798-01/821-1841/bnbuk/index.html
It looks like Character.isJavaIdentifierStart and Character.isJavaIdentifierPart are absent yet too :)
Well, it will be present in the Android build, but you're right that it won't be available in the Avian build unless we add it. They're pretty simple, though, so they should be easy to add.
It looks like Character.isJavaIdentifierStart and Character.isJavaIdentifierPart are absent
It should not be too hard to implement them by extending the CharacterMatcher
class. The only thing unclear to me is how to do that without having to make CharacterMatcher public (it is not part of the official class path library, after all).
We could move that class to the avian package and make it public, which is what we've done with other non-standard classes.
We could move that class to the avian package and make it public, which is what we've done with other non-standard classes.
That is a good idea! AFAICT isJavaIdentifierStart()
would correspond to new CharacterMatcher("[A-Za-z_]")
and isJavaIdentifierPart()
to new CharacterMatcher("[A-Za-z_0-9]");
.
On Wed, 1 Oct 2014, dscho wrote:
We could move that class to the avian package and make it public, which is what we've done with other non-standard classes.
That is a good idea! AFAICT
isJavaIdentifierStart()
would correspond tonew CharacterMatcher("[A-Za-z_]")
andisJavaIdentifierPart()
tonew CharacterMatcher("[A-Za-z_0-9]");
.
Actually, it's more complicated than I thought:
http://docs.oracle.com/javase/7/docs/api/java/lang/Character.html#isJavaIdentifierPart(char) http://stackoverflow.com/questions/11774099/legal-identifiers-in-java
Any Unicode letter, currency symbol, or "connector" can be used.
Yeah - unicode characters complicate things, like Joel said.
Also, unless I'm missing something, it seems rather wasteful to me to implement isJavaIdentifier*
as deferring to a very general class. I'd be surprised if matching against a fixed set of character ranges (in hard-coded fashion) wasn't at least several times faster than using CharacterMatcher.
Oh, yes, I forgot Unicode characters...
I think you make the problem too complicated here. As I understand, our task isn't to validate the character. There are only 3 possible cases here:
[anything][last $][any-non-digit][anything] - that corresponds to inner class
[anything][last $][digits] - that corresponds to anonymous class
[anything][last $][digits][any-non-digit][anything] - that corresponds to a local class
We don't actually care what does [any-non-digit] part contain. It could be _ or [a-z] or [а-я] or anything else. The validation isn't our task.
Am I right?
Peeking at the openjdk code, it looks like they don't do any name parsing.
Rather:
On Wed, 1 Oct 2014, Joshua Warner wrote:
Peeking at the openjdk code, it looks like they don't do any name parsing.
Rather:
- They check whether getSimpleName returns an empty string (which, apparently it's supposed to do for anonymous classes)
But getSimpleName does do name parsing.
On Wed, 1 Oct 2014, Ilya Mizus wrote:
I think you make the problem too complicated here. As I understand, our task isn't to validate the character. There are only 3 possible cases here:
[anything][last $][digits] - that corresponds to isAnonymousClass() [anything][last $][digits][any-non-digit][anything] - that corresponds to isLocalClass()
We don't actually care what does [any-non-digit] part contain. It could be _ or [a-z] or [а-я](hey, I'm Russian ;) or anything else. The validation isn't our task.
Am I right?
Yeah, that makes sense.
Please take a look at my solution and tell me if you like it, or not:
private enum ClassType { Global, Member, Local, Anonymous }
private ClassType classType = null;
private void calculateClassType() {
char[] nameChars = getName().toCharArray();
// Searching for the last $
int lastDollarPos = -1;
for (int i = nameChars.length - 1; i >= 0; i--) {
if (nameChars[i] == '$') {
lastDollarPos = i;
break;
}
}
if (lastDollarPos > -1) {
int i = lastDollarPos + 1;
// Assuming i < nameChars.length cause the class name shouldn't end with '$'
if (nameChars[i] < '0' || nameChars[i] > '9') {
classType = ClassType.Member; // Not a digit after the last '$'
return;
}
i++;
while (i < nameChars.length) {
if (nameChars[i] < '0' || nameChars[i] > '9') {
classType = ClassType.Local; // Not-a-digit character after some digits after the last '$'
return;
}
i++;
}
classType = ClassType.Anonymous; // Only digits after the last '$'
return;
} else {
classType = ClassType.Global; // No '$' in class name means a global class
return;
}
}
public boolean isAnonymousClass () {
if (classType == null) calculateClassType();
return classType == ClassType.Anonymous;
}
public boolean isLocalClass () {
if (classType == null) calculateClassType();
return classType == ClassType.Local;
}
public boolean isMemberClass () {
if (classType == null) calculateClassType();
return classType == ClassType.Member;
}
I really like the enum - but I would much rather see this implemented (partially) in terms of getEnclosingClass() and getEnclosingMethod(), rather than parsing out the string. In fact, you can determine everything but Local vs Anonymous just checking whether those are the enclosing class/method are null.
Also, finding the '$' could be elided using getEnclosingClass().getName().length()
.
Lastly, I'm not sure caching it would be worth-while. These seem like methods that will only be called a small handful of times - and you're burning 8 bytes / class + dcache pollution effects.
I agree that 8 bytes per class is not a necessary overhead. But... do you think that getEnclosingClass().getName().length()
would be faster?
Obviously the only correct answer is "benchmark it".
And that's a good point, I'm not sure. The 2+ dependent pointer loads are not going to be cheap. On the other hand, neither is creating the new char array (the allocation, copying, and the garbage generated). On the third hand, using charAt() (which it seems avian doesn't have, and would be great to add) might incur some nasty branch mispredicts.
And to be clear, I'm not suggesting that you actually have to benchmark it. Just that, you're right, it might not be any faster. Point being, what you posted is fine (provided it matches the openjdk behavior).
On Wed, 1 Oct 2014, Joshua Warner wrote:
And to be clear, I'm not suggesting that you actually have to benchmark it. Just that, you're right, it might not be any faster. Point being, what you posted is fine (provided it matches the openjdk behavior).
Ditto. The proposed implementation, plus a few tests to prove it works, is fine by me.
It is a bit awkward to comment a diff that is not part of a commit (let alone a pull request), but I'll do it nevertheless:
enum
keys is to have them all upper-case.classType
with null
(it is initialized with null
implicitly unless you initialize it with something else).calculateClassType()
method returned the class type. That would avoid the repeated code pattern in the three callers. /**
* Determines the class type.
*
* There are four class types: global (no dollar sign), anonymous (only digits after the dollar sign),
* local (starts with digits after the dollar, ends in class name) and member (does not start with digits
* after the dollar sign).
*
* @return the class type
*/
private ClassType calculateClassType() {
if (classType != null) return classType;
classType = ClassType.GLOBAL;
final String name = getName();
// Find the last dollar, as classes can be nested
int dollar = name.lastIndexOf('$');
if (dollar < 0) return classType;
// Find the first non-digit after the dollar, if any
final char[] chars = name.toCharArray();
int skipDigits;
for (skipDigits = dollar + 1; skipDigits < chars.length; skipDigits++) {
if (chars[skipDigits] < '0' || chars[skipDigits] > '9') break;
}
if (skipDigits == chars.length) {
classType = ClassType.ANONYMOUS;
} else if (skipDigits > dollar + 1) {
classType = ClassType.MEMBER;
} else {
classType = ClassType.LOCAL;
}
return classType;
}
ArrayOutOfBoundsException
in the original implementation, hard to spot, too. In the rewritten version, I would claim, it is much easier to see that there is no chance of such an exception. And while it is true that javac
will not produce a class name that ends in a dollar, it is definitely allowed, as this little BeanShell snippet using Javassist shows:import javassist.ClassPool;
pool = ClassPool.getDefault();
c = pool.makeClass("ABC$");
print(c.toClass());
Oh, and of course I missed the comment about spending 8 bytes on this... I agree that it is fast enough to calculate, and users who really need recurring access to the calculated values should cache them themselves.
Please just replace the line if (classType == null) return classType;
with ClassType classType = ClassType.GLOBAL
and remove the now-obsolete line classType = ClassType.GLOBAL;
.
Ok, thank you, @dscho. By the way, we decided to remove classType
variable out of Class
class. Your version is pretty clear. I'll use it if you don't mind. And we could avoid the exception by change only one line:
if (dollar < 0 || dollar == name.length() - 1) return classType;
So we assume any class ending with '$' to be 'GLOBAL'. At least it would be valid. Do you agree? I'll make a pull request, of course, as soon as I get rid of all the missing functions... Or if I decide it's hopeless to make GSON work for now.
getGenericInterfaces: to get this, you'll need to check if Class.vmClass.addendum.signature is non-null.
Now about the Class.getGenericInterfaces()
function:
I'm prining:
System.out.println(vmClass.addendum.signature);
for CCC.class
, where:
interface III {}
class CCC implements III {
}
and it prints null
. What should I do?
Your version is pretty clear. I'll use it if you don't mind.
By all means!
And we could avoid the exception by change [...]
The new version should not throw that exception at all... The loop after the dollar would terminate right away because of the condition, without accessing any out-of-bounds index.
So we assume any class ending with '$' to be 'GLOBAL'.
Actually, I assumed they should be interpreted as ANONYMOUS
, hence my code picking that...
Now about the Class.getGenericInterfaces() function:
Try
interface III<T> {}
class CCC implements III<Object> {
}
@bigfatbrowncat to clarify: the signature in the addendum is present only if there are really generics involved. Otherwise, you will have to fall back to the non-generic method.
In other words, you will have to copy-edit Class#getInterfaces()
to special-case array items when there are generic signatures.
I forgot the exact details, but I did something similar in commit 2d8362297537414de11ea9fe3d76ee01f97c107a to implement Field#getGenericType()
. Hopefully you can reuse some of that!
On Thu, 2 Oct 2014, Ilya Mizus wrote:
Now about the
Class.getGenericInterfaces()
I'm prining:System.out.println(vmClass.addendum.signature);
for a class:
interface III {} class CCC implements III { }
and it prints
null
. What should I do?
Just call getInterfaces if the signature is null.
Okay. I found out my mistake with parameters. Now I have a question. As a result of Class.getGenericInterfaces()
call I should return an array of java.lang.reflect.ParameterizedType
interface implementations. Do we have a proper implementation to use? (JDK returns sun.reflect.generics.reflectiveObjects.ParameterizedTypeImpl
objects)
On Thu, 2 Oct 2014, Ilya Mizus wrote:
Okay. I found out my mistake with parameters. Now I have a question. As a result of
Class.getGenericInterfaces()
call I should return an array ofjava.lang.reflect.ParameterizedType
interface implementations. Do we have a proper implementation to use?
Not yet, sorry. You could write one with a constructor that takes a signature and a ClassLoader and parses that signature, loading the classes it refers to from the ClassLoader.
Okay. Where should I put the new class? avian
package? Maybe it is better to create avian.reflect
?
On Thu, 2 Oct 2014, Ilya Mizus wrote:
Okay. Where should I put the new class?
avian
package? Maybe it is better to createavian.reflect
?
Either of those is fine with me.
Am I correct that sun.reflect.generics.reflectiveObjects.TypeVariableImpl
should be implemented as well?
On Thu, 2 Oct 2014, Ilya Mizus wrote:
Am I correct that
sun.reflect.generics.reflectiveObjects.TypeVariableImpl
should be implemented as well?
I haven't heard of that class before. How did you find it? Why do you think you'll need it?
Whew... that wouldn't be a piece of cake, actually. It looks like I've stepped into a deep and dangerous Java reflection mechanisms :)
I've made a small testing facility to find out all the common situations that we could achieve here. Look. 1 First of all - some classes for testing:
package parrot.server;
class Main {
...
class CC {
}
interface III<A> {}
interface UUU<A> {}
interface ZZZ<A extends CC> {}
class CCC<AA, AAA extends CC> implements III<Object>, Cloneable, UUU<AA>, ZZZ<AAA> {
}
}
2 Then the testing code:
Type[] tt = CCC.class.getGenericInterfaces();
for (int i = 0; i < tt.length; i++) {
if (tt[i] instanceof ParameterizedType) {
ParameterizedType pt = (ParameterizedType)tt[i];
System.out.print("[pt: " + pt.toString() + "; ");
System.out.print("owner: " + ((Class)pt.getOwnerType()).getName() + "; ");
System.out.print("raw: " + ((Class)pt.getRawType()).getName() + "<");
Type[] ata = pt.getActualTypeArguments();
for (int j = 0; j < ata.length; j++) {
if (ata[j] instanceof TypeVariable<?>) {
TypeVariable<?> tv = (TypeVariable<?>)ata[j];
System.out.print(tv.getClass() + ":");
System.out.print(tv.getGenericDeclaration() + ":");
System.out.print(tv.getBounds().length + ", ");
} else {
System.out.print(ata[j].getClass() + ", ");
}
}
System.out.println(">");
} else {
Class cls = (Class)tt[i];
System.out.print("{cls: " + cls.getName() + "}\n");
}
}
3 Testing code output
[pt: parrot.server.Main.parrot.server.Main$III<java.lang.Object>; owner: parrot.server.Main; raw: parrot.server.Main$III<class java.lang.Class, >
{cls: java.lang.Cloneable}
[pt: parrot.server.Main.parrot.server.Main$UUU<AA>; owner: parrot.server.Main; raw: parrot.server.Main$UUU<class sun.reflect.generics.reflectiveObjects.TypeVariableImpl:class parrot.server.Main$CCC:1, >
[pt: parrot.server.Main.parrot.server.Main$ZZZ<AAA>; owner: parrot.server.Main; raw: parrot.server.Main$ZZZ<class sun.reflect.generics.reflectiveObjects.TypeVariableImpl:class parrot.server.Main$CCC:1, >
4 Avian signature (converted to string):
<AA:Ljava/lang/Object;AAA:Lparrot/server/Main$CC;>Ljava/lang/Object;Lparrot/server/Main$III<Ljava/lang/Object;>;Ljava/lang/Cloneable;Lparrot/server/Main$UUU<TAA;>;Lparrot/server/Main$ZZZ<TAAA;>;
Look at TAAA
in the end of the signature. It turns into TypeVariable
On Thu, 2 Oct 2014, Ilya Mizus wrote:
TypeVariable<?> tv = (TypeVariable<?>)ata[j]; System.out.print(tv.getClass() + ":");
Calling getClass() on a TypeVariable instance isn't very useful. Here are methods that might be useful, though:
http://docs.oracle.com/javase/7/docs/api/java/lang/reflect/TypeVariable.html
We haven't added those methods to Avian's version of TypeVariable yet, because they weren't needed until now. If Guava needs them, maybe we should add them.
But, yeah, when you start mixing generics and reflection, things get really complicated :)
So the task is: parse signature and make a structure of ParameterizedType, TypeVariable and Class objects... Cool. I'll try to do that...
Well, we kind of have a ParameterizedType implementation: https://github.com/ReadyTalk/avian/blob/master/classpath/java/lang/reflect/SignatureParser.java#L103
parse signature and make a structure of ParameterizedType, TypeVariable and Class objects
See https://github.com/ReadyTalk/avian/blob/master/classpath/java/lang/reflect/SignatureParser.java
See https://github.com/ReadyTalk/avian/blob/master/classpath/java/lang/reflect/SignatureParser.java
To be precise: the parseType()
method already does a lot of that work. It needs to be refactored, of course, to support getGenericInterfaces()
but that is quite doable. The first thing would be to decouple the handling of what is inside <...>
because the current code can only handle generic signatures of fields (which always have an explicit base type, but the base type of a class is implicit).
I just realized that I completely missed to document in the commit 2d8362297537414de11ea9fe3d76ee01f97c107a where I found the relevant information: it was section 4.3 of the Java Virtual Machine Specification (you will find section 4.3.4 especially helpful).
BTW I will be offline for three days; please do not mistake this as a lack of interest from my side (but also: don't let this hold up any progress, although I am of course happy to invest some time to help resolve this issue once I return online)!
Thank you, @dscho. I'm not sure I could solve that without your help. This looks very complicated. I'll tell you when I have some progress.
Please take a look at commits https://github.com/bigfatbrowncat/avian-pack.avian/commit/67cafb118cfeb0416ae795533f5675eda263e388 and https://github.com/bigfatbrowncat/avian-pack.avian/commit/270bbc66f9aa1152907889bf733a2927277a6da5. Both of them are part of pull request https://github.com/ReadyTalk/avian/pull/343
I've made support for everything excepting TypeVariable
-s so far. I had to make SignatureParser
public (maybe we have to move it to avian
package) and improve it a bit (to handle owner class for some case)
Please criticize my code and then I'll make TypeVariable
parsing working.
I've fixed test with a small change in https://github.com/bigfatbrowncat/avian-pack.avian/commit/50d32d39e70a0dd2a911d4bc86ff7ae247fcd691.
We've implemented getGenericInterfaces()
completely (I hope so). Now even type variables work.
But we have a problem with the simple getInterfaces()
function which is a no-template version of getGenericInterfaces()
(and we fall back to it if addendum.signature == null
).
Example:
interface I1 {}
interface I2 extends I1 {}
class NoGen implements I2 {
}
...
Class[] ifaces = NoGen.class.getInterfaces();
for (Class c : ifaces) {
System.out.println(c);
}
JDK result:
interface reflect.test.Main$I2
Avian (my latest commit https://github.com/bigfatbrowncat/avian-pack.avian/commit/c178ffd26d5829dcbc4e1c3a9120585e889d3381):
interface reflect.test.Main$I1
interface reflect.test.Main$I2
The problem looks obvious: JDK lists only the directly implemented interfaces, but Avian lists all of them. I don't know how to fix this behavior, please, help me with your advice.
And another (aesthetic) problem: I added TypeVariableImpl
class and a very useful collectTypeVariables()
function directly to the Class
class. Maybe they should form some utility class and appear somewhere else (like SignatureParser
does).
Okay, I've fixed the build :) I moved the class and the function into SignatureParser
and changed the parse
signature. Now Field
's getGenericType()
function should support type variables as well. The new commit is: https://github.com/bigfatbrowncat/avian-pack.avian/commit/1534a30d2b8eb64838a3aece0d5c8dccb85f20be
On 10/5/14, Ilya Mizus notifications@github.com wrote:
The problem looks obvious: JDK lists only the directly implemented interfaces, but Avian lists all of them. I don't know how to fix this behavior, please, help me with your advice.
That's a bug. I'll fix it when I get a chance.
It would be nice cause GSON doesn't want to work cause of it :)
As I could see, getInterfaces()
function takes the data form the interfaceTable
that contains all the implemented interfaces. But it should take it from the class declaration instead.
My question: how to take it from the declaration if addendum.signature == null
?
We (me and @JustAMan) have tried using GSON (very popular Java JSON serializing/deserializing library) with Avian + Android classpath, and it turns out that
Class
misses several methods that are needed by GSON. Also note thatClass.java
is always taken from Avian classpath, so this issue is relevant for all Avian versions (i.e. both Avian's own classpath and Android classpath).Namely:
are missing or throwing
UnsupportedOperationException
.We cannot progress further in finding what GSON needs except those three as we cannot stub
getGenericInterfaces
with something meaningful (stubbing it with empty list results inArrayIndexOutOfBoundsException
somewhere in the code path so GSON needs that method to provide correct results).It would be wonderful if you could share your thoughts on where we could get the data needed for those methods if you cannot provide a quick fix yourself.