Open tdelabro opened 11 months ago
I would reccomand we introduce a new wrapper struct: SegmentOffset(u64) that will be used instead of usize in all Relocatable, ap and fp. This will require impl a lot of conversion methods on it, but it will give us the level of security we need for such a core component of our infrastructure.
I don't really understand this part. Why not just make the offset u64
? If it's just to avoid as
, maybe there's a clippy lint for that?
Re: execution on different architectures: AFAICT, in the cases we could observe overflows in the offsets execution would fail due to exceeding the addressable space anyway, so usize
should work well for any program that can in fact run correctly in smaller architectures. Not that it wouldn't be more correct to use explicit sizes anyway (I also have a more compact u64
representation for Relocatable
in mind for traces).
Why not just make the offset u64?
It would work, but next time you want to change, lets say to u32, it will be hell again. With the correct encapsulation it is easier to change in the future. I would at least use an alias type, for code readability, rather than u64. There could also be some operation that make sense if we talk about u64 but don't if we consider memory offset. So, we could want to impl different, more specific, behaviour for the type that represent an segment offset.
execution on different architectures: AFAICT, in the cases we could observe overflows in the offsets execution would fail due to exceeding the addressable space anyway, so usize should work well for any program that can in fact run correctly in smaller architectures.
I think you are correct on this one. But it mean that it work "by luck" and it will be okay until it isn't. So yes I would prefer the more restrictive architecture.
Also some traits like parity-scale-codec::{Encode, Decode} are not implemented on usize. So getting rid of it would be more handy..
Same goes with isize
used as segment_index
in Relocatable
Sounds reasonable.
@Oppen @igaray any new on this? Is it planned? It will become a requirement for Madara soon as it can break consensus super easily
@Oppen @igaray 👋 Could be nice to fix that before the 1.0.0 release, as it is a big breaking change
Hello!
I understand the need to have deterministic execution which doesn't change between different architectures but I don't think changing the type of the address' offsets to u64 will fully solve this.
The memory structure of the VM is made up of vectors that each represent a memory segment, the maximum capacity of a memory segment is given by usize::MAX
, therefore depending on the architecture on which the VM is run. Even if addresses are changed to represent offsets up to u64::MAX, the memory they address is still constrained to usize::MAX. A program which fails to run on a 32bit architecture due to exceeding the addressable memory will still fail after this change and it will still run correctly on a 64bit architecture
the maximum capacity of a memory segment is given by usize::MAX
Yeah that sound like a bad design. It should be a u64 or a u128 or whatever. But it should be defined by the starknet spec, not by the machine the program runs on. Otherwise the virtual machine is not that virtual, it end up being tied up to the actual hardware.
When are you planning to solve all of this @fmoletta? As I say, it could be great to have it before 1.0.0 release
the maximum capacity of a memory segment is given by usize::MAX
Yeah that sound like a bad design. It should be a u64 or a u128 or whatever. But it should be defined by the starknet spec, not by the machine the program runs on. Otherwise the virtual machine is not that virtual, it end up being tied up to the actual hardware.
When are you planning to solve all of this @fmoletta? As I say, it could be great to have it before 1.0.0 release
What I meant by that line is that the maximum capacity of a vector in rust is usize::MAX, therefore the maximum capacity for a memory segment is usize::MAX. It's not something we can change without changing the whole structure of the memory
Well, that is a bad design. The max size of a segment should be decided by the VM specs. Not by some implementation details which makes it depend on the hardware it is running on.
Once you decide that the max size of a segment is x (maybe 2^251-1, or something else), you implement it in a way that makes it respect the specs, regardless of the hardware. Not the other way around.
Regardless, this is a very specific edge case that most likely will never happen in an actual run. Removing usize from everywhere else is still something relevant
Problem
The VM execution should be deterministic regardless of if it is executed on a 16, 32 or 64 bit processor. Having
usize
being used in order to define some of the core VM types (segment offset, ap, fp) is problematic because it could lead to different execution on different machines. Which, in a decentralized blockchain architecture will lead to hard forks.Recomandation
1) Get rid of all the
usize
in the cairo-vm stuct. I think it's still okay to useusize
when we call language method likelen()
but we should convert those as soon as possible to types that have a fixed behaviour regardless of the machine.2) Ban the use of
as
.as
silence the value overflows. This is not something we want at all in a blockchain context. We should use the very explicittry_from/into
instead and handle errors accordingly.Impl
I would reccomand we introduce a new wrapper struct:
SegmentOffset(u64)
that will be used instead ofusize
in allRelocatable
,ap
andfp
. This will require impl a lot of conversion methods on it, but it will give us the level of security we need for such a core component of our infrastructure.