Rationalize char and array in ABI

aionnetwork / AVM

Enabling Java code to run in a blockchain environment

https://theoan.com/

MIT License

49 stars 25 forks source link

Rationalize char and array in ABI #329

Closed jeff-aion closed 5 years ago

jeff-aion commented 5 years ago

The ABI specification (https://github.com/aionnetwork/AVM/wiki/ABI-Specification) currently has some special-cases around arrays, which we should probably remove or explain in greater detail (Address and String do not support the same array arity as the primitive types).

Additionally, the char type should be specified as the big-endian encoding of the 2-byte Java char primitive type but the current implementation relies on a UTF-8 encoding of the String created from that single char. This complicates the encoding, since it is variable-length, but is also not correct since the range of UTF-8 and UCS-2 are different (meaning a malformed encoding could be valid UTF-8 but not part of UCS-2).

jeff-aion commented 5 years ago

Looking further into adding more tests around the existing ABI to prove this change will be correct, I found that the ABIDecoder currently uses GeneratedClassesFactory to get access to the generated array classes, implemented in the core package. The mechanism that GeneratedClassesFactory uses to do this is brittle: it knows the naming convention that the array generator uses. This kind of "secret handshake" should not be shared across the core/rt boundary.

My plan is to replace GeneratedClassesFactory with a high-level interface which the NodeEnvironment will populate into the ABIDecoder, on startup (instead of passing the special class loader through to it). The implementation will either move to core as this factory or will be added directly to a class which already has this array class naming convention knowledge.

jeff-aion commented 5 years ago

On a similar note, we probably also need to create some kind of type name mapper interface so that we can move the PackageConstants to core (it is more of an implementation detail than aspect of the API) and replace any package name assumptions with high-level mapping operations.

I suppose that this item is becoming more about how to decide whether a class should go into the core or rt package, which is worth formalizing, now that this is moving from prototype into public testing.

My current guideline is that the rt package is for things directly called by user code which can be aware of implementation details but would ideally remain ignorant to those details, defining high-level interfaces for any other information that they require. This would mean that the only things in the RT package would be user-callable classes and the interfaces used by those classes.

jeff-aion commented 5 years ago

The work originally being done under this item has now been subsumed by the more ambitious #330.