Closed by jstarry 1 year ago
@armaniferrante could you share the status of the IDL you've developed as part of Anchor and whether you think it would be a good solution to this issue?
The Anchor IDL is defined by the JSON serialization of the struct here. Some examples can be found here. It's currently used to generate clients with @project-serum/anchor.
It could definitely be used for this issue. To start, one might want to trim down the IDL linked above (which has some Anchor-specific concepts like events, state, and errors) and begin with the basics, e.g., something like:
    pub struct Idl {
        pub version: String,
        pub name: String,
        pub instructions: Vec<IdlIx>,
        pub accounts: Vec<IdlTypeDef>,
        pub types: Vec<IdlTypeDef>,
    }
Other than bikeshedding the nitty-gritty details of the exact JSON format, I think the main thing missing for this issue would be a serialization format field, because currently the IDL assumes everything is borsh serialized. Depending on what current programs on Solana are doing, we might want to allow other formats like bincode. The main challenge that comes to mind is what to do about programs with custom serialization, like the Serum DEX.
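A rough sketch of what the trimmed-down IDL could look like with a serialization format field added. Everything beyond the struct above is an assumption for illustration: the `IdlSerialization` enum, its variant names, the `serialization` field name, the `can_decode_generically` helper, and the stubbed-out `IdlIx` / `IdlTypeDef` types are hypothetical and not part of the current Anchor IDL.

```rust
/// How instruction and account data is encoded.
/// Hypothetical enum: variant names are assumptions, not an existing format.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum IdlSerialization {
    Borsh,
    Bincode,
    /// Program-specific layout; a client would need an out-of-band
    /// decoder identified by this label.
    Custom(String),
}

impl Default for IdlSerialization {
    /// Borsh is the default because the current IDL assumes it everywhere.
    fn default() -> Self {
        IdlSerialization::Borsh
    }
}

/// Stub standing in for the real instruction definition.
#[derive(Debug, Clone, Default)]
pub struct IdlIx {
    pub name: String,
}

/// Stub standing in for the real type definition.
#[derive(Debug, Clone, Default)]
pub struct IdlTypeDef {
    pub name: String,
}

#[derive(Debug, Clone, Default)]
pub struct Idl {
    pub version: String,
    pub name: String,
    /// Assumed new field: tells clients which decoder to use.
    pub serialization: IdlSerialization,
    pub instructions: Vec<IdlIx>,
    pub accounts: Vec<IdlTypeDef>,
    pub types: Vec<IdlTypeDef>,
}

impl Idl {
    /// True when a generic client (e.g. an explorer) could decode data
    /// from the IDL alone, without a program-specific decoder.
    pub fn can_decode_generically(&self) -> bool {
        !matches!(self.serialization, IdlSerialization::Custom(_))
    }
}
```

Defaulting to borsh would keep existing IDLs valid without changes, while the `Custom` variant at least lets a client detect that it cannot decode a program generically instead of misreading the bytes.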
Additionally, for this issue it's important that IDLs live at some deterministic address on chain, so that apps like the explorer can query the IDL with nothing but the program ID and make sense of the instruction data. Anchor uses a PDA with fixed seeds for this (the macro codegen bakes some extra instructions into the program to do this), but this of course doesn't work for non-Anchor programs, so there probably needs to be some type of associated IDL program for this, as I've briefly discussed with @bartosz-lipinski. (Fwiw I think my ideal solution would be to bake the IDL into the bytecode itself, using something like a custom section in .wasm files. But I'm not sure if that's possible with BPF, and so the associated IDL program is probably the most realistic solution. Edit: I agree with @jstarry's comment below that baking the IDL into the program is a bad idea, as it would make deserializing transactions sent to old versions of the program more difficult.)
Thanks @armaniferrante for the details!
+1 to a serialization format field
Starting with borsh support sounds like a good start. Not sure of the best route for bincode / custom deserialization. Could we leverage wasm for this?
I like the idea of a deterministic place for IDLs! If we included it in the bytecode, then we might lose the ability to deserialize historical transactions whenever a program is upgraded. Because of upgrades, I think the seed should include both the program id as well as the program's last upgraded slot.
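To make the seed idea concrete, here is a minimal sketch of seeds that combine the program id with the last upgraded slot. The `b"idl"` prefix, the function names, and the assumption that a client already knows the list of upgrade slots are all hypothetical. In a real client the seeds would be handed to something like `Pubkey::find_program_address` together with the id of the associated IDL program; that step is left out so the sketch stays dependency-free.

```rust
/// Hypothetical fixed prefix for IDL accounts; the literal is an assumption.
const IDL_SEED_PREFIX: &[u8] = b"idl";

/// Seeds identifying the IDL for one specific deployment of a program.
/// Including the slot gives every upgrade its own IDL address, so older
/// IDLs stay available for decoding historical transactions.
pub fn idl_seeds(program_id: &[u8; 32], last_upgraded_slot: u64) -> [Vec<u8>; 3] {
    [
        IDL_SEED_PREFIX.to_vec(),
        program_id.to_vec(),
        last_upgraded_slot.to_le_bytes().to_vec(),
    ]
}

/// Pick the deployment whose IDL applies to a transaction at `tx_slot`:
/// the latest upgrade at or before that slot. Returns `None` if the
/// transaction predates every known deployment.
pub fn upgrade_slot_for(upgrade_slots: &[u64], tx_slot: u64) -> Option<u64> {
    upgrade_slots.iter().copied().filter(|&s| s <= tx_slot).max()
}
```

One open question this leaves is where the list of upgrade slots comes from; an explorer would need some way to enumerate a program's past deployments before it can pick the right IDL.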
Tracking this issue. @bartosz-lipinski and I have also been discussing an associated IDL / ABI-type program. This would help our Explorer efforts immensely. We've also considered including some additional metadata, e.g. project name, GitHub URL.
Some additional details that may or may not be relevant.
sha256(rust-ix-struct-ident)[..8] || borsh(rust-ix-struct)
Sighash is important to be able to support features like program interfaces (example definition and impl), where a program wants to call another program without assuming anything other than that it implements the interface. This isn't possible with enum-based dispatch, since there may be a collision in the enum variant discriminator. Anchor addresses this with sighash by namespacing methods, e.g., by prefacing the #[interface] trait name in the sha256 pre-image.
I was playing around with a crude implementation of this feature for Anchor IDLs. It doesn't take into consideration non-Anchor IDLs, upgraded programs, etc.
It just does a basic check to see whether there's an Anchor IDL for the account's owner and, if so, decodes and displays the info in a tab in the details section.
Curious what you all think of this as a first step?
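For reference, the sighash layout mentioned earlier in the thread (an 8-byte hash prefix followed by the serialized arguments) and the kind of lookup an explorer could run against an IDL can be sketched roughly as follows. This is not Anchor's implementation: the `namespace:name` pre-image format, the function names, and the namespace strings are assumptions, and the hash is passed in as a parameter so the sketch has no dependencies. A real client would plug in SHA-256 (for example `Sha256::digest` from the `sha2` crate); `stand_in_hash` below is a toy, non-cryptographic placeholder used only to exercise the code.

```rust
/// 32-byte digest function. A real client would plug in SHA-256 here.
pub type HashFn = fn(&[u8]) -> [u8; 32];

/// First 8 bytes of hash("<namespace>:<name>").
/// The pre-image format is an assumption made for this sketch.
pub fn sighash(hash: HashFn, namespace: &str, name: &str) -> [u8; 8] {
    let preimage = format!("{}:{}", namespace, name);
    let mut out = [0u8; 8];
    out.copy_from_slice(&hash(preimage.as_bytes())[..8]);
    out
}

/// Split raw instruction data into (instruction name, serialized args)
/// by matching its first 8 bytes against the sighash of each name
/// listed in the IDL. Returns `None` for short or unrecognized data.
pub fn decode_ix<'a, 'b>(
    hash: HashFn,
    namespace: &str,
    ix_names: &[&'a str],
    data: &'b [u8],
) -> Option<(&'a str, &'b [u8])> {
    if data.len() < 8 {
        return None;
    }
    let (disc, args) = data.split_at(8);
    ix_names
        .iter()
        .copied()
        .find(|name| &sighash(hash, namespace, name)[..] == disc)
        .map(|name| (name, args))
}

/// Toy FNV-1a based placeholder so the sketch runs without external
/// crates. NOT SHA-256 and not suitable for real discriminators.
pub fn stand_in_hash(data: &[u8]) -> [u8; 32] {
    let mut h: u64 = 0xcbf29ce484222325; // FNV-1a offset basis
    for &b in data {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3); // FNV prime
    }
    let mut out = [0u8; 32];
    out[..8].copy_from_slice(&h.to_be_bytes());
    out
}
```

Because the namespace is part of the pre-image, two interfaces that both define a method with the same name get different discriminators, which is the property enum-based dispatch can't provide. The bytes returned as args would then be decoded according to the IDL's argument types.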
Existing Anchor IDLs are used to deserialize account data in the explorer (as of #23972 and #24239).
Would love to revive this conversation and set out some concrete steps to go further.
Low hanging fruit:
@jstarry @armaniferrante @oJshua @bartosz-lipinski
Problem
Account and instruction deserialization and labelling is done manually and does not scale to all community-created programs.
Proposed Solution