Open · Arrowana opened this issue 2 years ago
You can use the compute budget program to request more heap memory; I think you can request around double the default.
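Something like this on the client side, using solana-sdk's ComputeBudgetInstruction::request_heap_frame (the helper wrapper is illustrative; the runtime requires a multiple of 1024 and caps the request, 256 KiB in recent versions):

use solana_sdk::{compute_budget::ComputeBudgetInstruction, instruction::Instruction};

/// Illustrative helper: prepend a heap-frame request so `program_ix`
/// runs with a larger heap than the 32 KiB default.
fn with_larger_heap(program_ix: Instruction) -> Vec<Instruction> {
    // Size must be a multiple of 1024; the runtime enforces an upper bound.
    let request_heap = ComputeBudgetInstruction::request_heap_frame(256 * 1024);
    vec![request_heap, program_ix]
}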
--
I think the biggest issue comes from the way vectors are used. If you look at the implementations of to_account_infos, etc., they all simply initialize the vector with zero capacity using Vec::new(). If you then push 12 accounts onto this vector, you end up reallocating memory several times along the way. You can drastically reduce allocations by using Vec::with_capacity(est_num_accounts) instead, and because the vector never reallocates, you don't waste the memory that each outgrown buffer leaves behind.
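A minimal sketch of the difference (34-byte arrays stand in for AccountMeta, which is 34 bytes):

fn main() {
    const EST_NUM_ACCOUNTS: usize = 12;

    // Vec::new(): the buffer grows on demand, and under a bump allocator
    // every outgrown buffer is simply leaked, never reclaimed.
    let mut grown: Vec<[u8; 34]> = Vec::new();
    for i in 0..EST_NUM_ACCOUNTS {
        grown.push([i as u8; 34]);
    }

    // Vec::with_capacity(): one allocation up front, no reallocation,
    // so nothing is left behind the bump pointer.
    let mut sized: Vec<[u8; 34]> = Vec::with_capacity(EST_NUM_ACCOUNTS);
    for i in 0..EST_NUM_ACCOUNTS {
        sized.push([i as u8; 34]);
    }
    assert_eq!(sized.capacity(), EST_NUM_ACCOUNTS);
}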
I am aware of both of those solutions, but I don't think they are great in general.
One solution that I'm investigating for the new set of constraints is switching account_info to being accessed as a zero-copy struct instead of copying it to the heap. That should solve quite a bit of that overhead, right?
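Roughly, the idea would be something like this (hypothetical, not Anchor's actual generated code): the accounts struct holds references into the slice the runtime already passed to the entrypoint, instead of cloning each AccountInfo into a fresh heap Vec.

use solana_program::account_info::AccountInfo;

// Hypothetical zero-copy accounts struct: borrows from the entrypoint's
// slice, so validating and accessing accounts allocates nothing.
struct Transfer<'a, 'info> {
    from: &'a AccountInfo<'info>,
    to: &'a AccountInfo<'info>,
}

impl<'a, 'info> Transfer<'a, 'info> {
    fn try_accounts(accounts: &'a [AccountInfo<'info>]) -> Option<Self> {
        Some(Self {
            from: accounts.first()?,
            to: accounts.get(1)?,
        })
    }
}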
OK, so to_account_metas and to_account_infos don't contain remaining_accounts from what I can see. But the reason they use the pattern of extending a vec (starting from vec![]) is that Anchor supports nested derive(Accounts) structs, so you can't know up front whether an account field is actually another derive(Accounts) struct containing more than one account.
If the reallocation of the vecs really is the major issue here, then it's possible we could add a len attribute or something similar to the account types. This would allow allocating the correct vector length up front, as sketched below (although maybe optional accounts mess with this 🤷).
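A sketch of how that could look (all names hypothetical): each generated struct exposes a compile-time account count, and nested structs add theirs in, so the flattening code can size its Vec once.

// Hypothetical: every #[derive(Accounts)] struct reports how many
// accounts it flattens to, known at compile time.
trait AccountsLen {
    const LEN: usize;
}

struct TokenPair; // stands in for a nested derive(Accounts) struct
impl AccountsLen for TokenPair {
    const LEN: usize = 2; // e.g. mint + vault
}

struct Swap {
    _pair: TokenPair, // nested accounts struct
}
impl AccountsLen for Swap {
    // 1 direct account (payer) + whatever the nested struct contributes
    const LEN: usize = 1 + TokenPair::LEN;
}

// to_account_metas/to_account_infos could then start from
// Vec::with_capacity(Swap::LEN) instead of Vec::new().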
Was there a clean solution to this, @Arrowana?
No full solution yet; there are some major savings possible by carefully creating every vec with Vec::with_capacity.
Also, while that gobbles quite some heap, the other area for improvement is the usage of BorshSerialize::try_to_vec. It is used in data(&self) -> Vec<u8> for ix data for CPI, for events, for the new event_cpi... It calls Vec::with_capacity(1024), so instantly, no matter the size of what is serialized, up to 1024 - 8 bytes are allocated for nothing.
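Paraphrasing what borsh's default implementation does (from memory of borsh 0.9/0.10, not verbatim), the problem is visible in a few lines:

use borsh::BorshSerialize;

// Sketch of borsh's default try_to_vec: a fixed 1024-byte buffer is
// allocated no matter how small the payload is, and under a bump
// allocator the unused tail is never reclaimed.
fn try_to_vec_sketch<T: BorshSerialize>(value: &T) -> std::io::Result<Vec<u8>> {
    let mut result = Vec::with_capacity(1024); // DEFAULT_SERIALIZER_CAPACITY
    value.serialize(&mut result)?;
    Ok(result) // an 8-byte payload still paid for 1024 bytes
}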
I believe a rework is needed to potentially leverage the existing InitSpace proc macro, so that we can have automatic max sizing and each implementation can become something like
pub trait InstructionData: Discriminator + AnchorSerialize + Space {
    fn data(&self) -> Vec<u8> {
        let mut d = Vec::with_capacity(8 + Self::INIT_SPACE);
        d.extend(Self::discriminator());
        self.serialize(&mut d).expect("Should always serialize");
        d
    }
}
#[automatically_derived]
impl anchor_lang::Space for #name {
    const INIT_SPACE: usize = 1 + #max;
}
Or the auto sizing could be made optional.
I hadn't thought of that, but it could be a good solution. Also, implementing the Iterator trait instead of using vectors could be a good approach.
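Something like this toy sketch (string placeholders instead of AccountInfo): the parent chains the nested struct's iterator onto its own, so accounts are yielded lazily and no intermediate Vec is built at all.

// Nested struct yields its accounts lazily.
fn nested_accounts() -> impl Iterator<Item = &'static str> {
    ["mint", "vault"].into_iter()
}

// Parent chains its own accounts with the nested ones; still no Vec.
fn parent_accounts() -> impl Iterator<Item = &'static str> {
    ["payer"].into_iter().chain(nested_accounts())
}

// A caller that really needs an owned buffer can collect into an
// exactly-sized Vec; everyone else just iterates, allocation-free.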
I'm on board with decreasing the usage of BorshSerialize::try_to_vec on the program side, but from what I can see, InstructionData is only used from the client side, so it shouldn't cause any extra heap allocations on the program side.
No, it does. It is also used when building the CPI ixs and when emitting events, so this is happening on the program side. I wasn't clear enough: all serialization steps are affected, except serializing directly into an account's data.
I think I wasn't clear enough either. I'm on board with optimizing the usage of BorshSerialize::try_to_vec on the program side (where it is used in CPI, events and return data), but the example code you've given is about InstructionData, which only seems to be used from the client side. Could you share a usage of InstructionData from the program side, in case I'm missing something here?
Oops, correct, it isn't. So not InstructionData, but the other ones.
Problem
With transactions v2 and address lookup tables, it is now possible to do a lot more in a single program call and a lot more overall. In fact, Jupiter is already capable of consuming up to 64 accounts (the current account lock max) for routed swaps through multiple pools.
A new issue that arises is that the default heap is 32 * 1024 bytes and the default allocator is a bump allocator that never deallocates. As a result we start to see cases where we run out of heap.
https://github.com/solana-labs/solana/blob/d9b0fc0e3eec67dfe4a97d9298b15969b2804fab/sdk/program/src/entrypoint.rs#L101-L132
Example: trying to find out where the heap goes
Using the default BumpAllocator and logging the pointer position, I produced this log for a single CPI: one CPI to a Mercurial pool (using the to_account_infos and to_account_metas pattern, as the CpiContext does), with 12 accounts:
AccountMeta => 12 * 34 = 408 bytes
AccountInfo => 12 * ? = ?
Diffing the pointer position at each allocation step shows 2056 bytes of heap gone for that simple step.
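Here is a sketch of the instrumentation I mean, adapted from the BumpAllocator in the entrypoint.rs linked below (the logging and the LoggingBumpAllocator name are illustrative; sol_log_64 is used because it does not itself allocate):

use std::alloc::{GlobalAlloc, Layout};
use solana_program::entrypoint::{HEAP_LENGTH, HEAP_START_ADDRESS};
use solana_program::log::sol_log_64;

struct LoggingBumpAllocator;

unsafe impl GlobalAlloc for LoggingBumpAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // As in the runtime's BumpAllocator, the first word of the heap
        // stores the bump cursor, which moves downward from the top.
        let pos_ptr = HEAP_START_ADDRESS as usize as *mut usize;
        let mut pos = *pos_ptr;
        if pos == 0 {
            pos = HEAP_START_ADDRESS as usize + HEAP_LENGTH;
        }
        pos = pos.saturating_sub(layout.size());
        pos &= !(layout.align().wrapping_sub(1));
        if pos < HEAP_START_ADDRESS as usize + core::mem::size_of::<usize>() {
            return core::ptr::null_mut(); // out of heap
        }
        *pos_ptr = pos;
        // Log the new cursor and the requested size; diffing consecutive
        // cursor values gives the per-allocation numbers quoted above.
        sol_log_64(0, 0, 0, pos as u64, layout.size() as u64);
        pos as *mut u8
    }

    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {
        // Bump allocator: never frees anything.
    }
}

// Registered in the program with:
// #[global_allocator]
// static ALLOCATOR: LoggingBumpAllocator = LoggingBumpAllocator;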
I can produce a minimal repo to reproduce this if people are interested, but using the link from entrypoint.rs is pretty much what I did.
In comparison, a vanilla token transfer call (https://github.com/solana-labs/solana-program-library/blob/d02d49e2671c6af06633d21ecdb8970423880c75/token-swap/program/src/processor.rs#L187) allocates only the AccountMetas, in a Vec for the Instruction accounts field.
Possible solutions
Side effect