coral-xyz / anchor

⚓ Solana Sealevel Framework
https://anchor-lang.com
Apache License 2.0
3.36k stars 1.25k forks source link

Reduce stack usage of `try_accounts` #3060

Open LucasSte opened 3 days ago

LucasSte commented 3 days ago

I have been investigating complaints about invalid stack accesses in the compiler toolchain that happen with Anchor contracts. Most problems occur with function try_accounts, whose constraints expansion eats up a lot of stack space.

https://github.com/coral-xyz/anchor/blob/5f1f72c511102ad49b0f4b9361efd714b6187e13/lang/syn/src/codegen/accounts/try_accounts.rs#L121

I also have found some items that you may be aware of:

  1. The compiler has been missing stack errors. This is why sometimes you don't see a build error, but your contract fails during tests. This is going to be fixed in the next compiler release.
  2. As for the problems access violations may cause, we are considering not emitting binaries for contracts that access invalid memory. This is going to be for future releases, because tools would need time to adapt.
  3. There will be a refactor of the error message, making it clearer and easier to debug.

The try_accounts function is the only one that gets into the contract binary, so it advisable to break it in small methods (I had a look myself but found it very intricate).

Functions linearize and parse_account_field also have compiler errors, but they are not put into the contract binary. Perhaps, we are building them for Solana unnecessarily. We could also decrease their stack usage, but this case is simpler than try_accounts and should be less priority.

https://github.com/coral-xyz/anchor/blob/ca7fcee6b8269b732b66536f72ff3fb48cf1b5f9/lang/syn/src/codegen/accounts/constraints.rs#L71

https://github.com/coral-xyz/anchor/blob/ca7fcee6b8269b732b66536f72ff3fb48cf1b5f9/lang/syn/src/parser/accounts/mod.rs#L293

acheroncrypto commented 3 days ago

Thank you for creating this issue and giving a thorough explanation.

  1. The compiler has been missing stack errors. This is way sometimes you don't see a build error, but your contract fails during tests. This is going to be fixed in the next compiler release.

Awesome! This has been very inconsistent in my experience.

2. As for the problems access violations may cause, we are considering not emitting binaries for contracts that access invalid memory. This is going to be for future releases, because tools would need time to adapt.

Are you able to tell whether that code path is going to be used? It currently shows the stack errors even if that code path is unreachable the compiled program.

The try_accounts function is the only one that gets into the contract binary, so it advisable to break it in small methods (I had a look myself but found it very intricate).

Yeah, the biggest offender is the init constraint as mentined in https://github.com/coral-xyz/anchor/issues/2920. I find splitting a function into multiple functions (or even using a closure like in https://github.com/coral-xyz/anchor/pull/2939) to reduce stack usage weird. Do you think it would eventually be possible for the compiler to do this optimization automatically?

Functions linearize and parse_account_field also have compiler errors, but they are not put into the contract binary. Perhaps, we are building them for Solana unnecessarily. We could also decrease their stack usage, but this case is simpler than try_accounts and should be less priority.

https://github.com/coral-xyz/anchor/blob/ca7fcee6b8269b732b66536f72ff3fb48cf1b5f9/lang/syn/src/codegen/accounts/constraints.rs#L71

https://github.com/coral-xyz/anchor/blob/ca7fcee6b8269b732b66536f72ff3fb48cf1b5f9/lang/syn/src/parser/accounts/mod.rs#L293

What's the point of reducing their stack usage if they are only being used in compile-time? As you pointed out, they don't exist in programs, and they are only being used inside macros.

LucasSte commented 3 days ago

Are you able to tell whether that code path is going to be used? It currently shows the stack errors even if that code path is unreachable the compiled program.

Rust should tell you that you have unreachable code. I am not sure if it works with macros, though. This is a good case to investigate. I'd be happy if you can provide a reproducible example.

Do you think it would eventually be possible for the compiler to do this optimization automatically?

We may look into promoting items to the heap, but this would overshadow the programming language. Rust already let developers choose where to allocate (e.g. by using Box).

What's the point of reducing their stack usage if they are only being used in compile-time?

If we end up blocking compilation of functions that access memory outside their frames, this case would show up because it is going through the Solana backend. Maybe we can just investigate why it is being compiler for Solana then.

LucasSte commented 3 days ago

What's the point of reducing their stack usage if they are only being used in compile-time? As you pointed out, they don't exist in programs, and they are only being used inside macros.

I just removed some functions from the Solana target and these extra errors go away. There are still other parts being built for Solana. Hopefully, the fix is very simple.

https://github.com/coral-xyz/anchor/pull/3062

acheroncrypto commented 3 days ago

Rust should tell you that you have unreachable code. I am not sure if it works with macros, though. This is a good case to investigate. I'd be happy if you can provide a reproducible example.

It's not general unreachable code, but rather unreachable code from one's program. For example, if you were to add anchor-syn to your dependency list, you'd get the stack errors even though that code never executes (like in https://github.com/coral-xyz/anchor/pull/3062).

We may look into promoting items to the heap, but this would overshadow the programming language. Rust already let developers choose where to allocate (e.g. by using Box).

Isn't the optimization here (or in https://github.com/coral-xyz/anchor/pull/2939) only for the stack? We're not really trying to improve stack usage by moving things to heap, just using a different stack frame instead. It seems to me that the compiler has enough information about the potential stack usage and can do optimizations in a more general way.