Remove use of `usize` from the VM core struct

lambdaclass / cairo-vm

cairo-vm is a Rust implementation of the Cairo VM. Cairo (CPU Algebraic Intermediate Representation) is a programming language for writing provable programs, where one party can prove to another that a certain computation was executed correctly without the need for this party to re-execute the same program.

https://lambdaclass.github.io/cairo-vm

Apache License 2.0

518 stars 148 forks source link

Remove use of `usize` from the VM core struct #1498

Open tdelabro opened 11 months ago

tdelabro commented 11 months ago

Problem

The VM execution should be deterministic regardless of if it is executed on a 16, 32 or 64 bit processor. Having usize being used in order to define some of the core VM types (segment offset, ap, fp) is problematic because it could lead to different execution on different machines. Which, in a decentralized blockchain architecture will lead to hard forks.

Recomandation

1) Get rid of all the usize in the cairo-vm stuct. I think it's still okay to use usize when we call language method like len() but we should convert those as soon as possible to types that have a fixed behaviour regardless of the machine.

2) Ban the use of as. as silence the value overflows. This is not something we want at all in a blockchain context. We should use the very explicit try_from/into instead and handle errors accordingly.

Impl

I would reccomand we introduce a new wrapper struct: SegmentOffset(u64) that will be used instead of usize in all Relocatable, ap and fp. This will require impl a lot of conversion methods on it, but it will give us the level of security we need for such a core component of our infrastructure.

Oppen commented 11 months ago

I would reccomand we introduce a new wrapper struct: SegmentOffset(u64) that will be used instead of usize in all Relocatable, ap and fp. This will require impl a lot of conversion methods on it, but it will give us the level of security we need for such a core component of our infrastructure.

I don't really understand this part. Why not just make the offset u64? If it's just to avoid as, maybe there's a clippy lint for that?

Oppen commented 11 months ago

Re: execution on different architectures: AFAICT, in the cases we could observe overflows in the offsets execution would fail due to exceeding the addressable space anyway, so usize should work well for any program that can in fact run correctly in smaller architectures. Not that it wouldn't be more correct to use explicit sizes anyway (I also have a more compact u64 representation for Relocatable in mind for traces).

tdelabro commented 11 months ago

Why not just make the offset u64?

It would work, but next time you want to change, lets say to u32, it will be hell again. With the correct encapsulation it is easier to change in the future. I would at least use an alias type, for code readability, rather than u64. There could also be some operation that make sense if we talk about u64 but don't if we consider memory offset. So, we could want to impl different, more specific, behaviour for the type that represent an segment offset.

execution on different architectures: AFAICT, in the cases we could observe overflows in the offsets execution would fail due to exceeding the addressable space anyway, so usize should work well for any program that can in fact run correctly in smaller architectures.

I think you are correct on this one. But it mean that it work "by luck" and it will be okay until it isn't. So yes I would prefer the more restrictive architecture.

Also some traits like parity-scale-codec::{Encode, Decode} are not implemented on usize. So getting rid of it would be more handy..

tdelabro commented 11 months ago

Same goes with isize used as segment_index in Relocatable

Oppen commented 11 months ago

Sounds reasonable.

tdelabro commented 9 months ago

@Oppen @igaray any new on this? Is it planned? It will become a requirement for Madara soon as it can break consensus super easily

tdelabro commented 8 months ago

@Oppen @igaray 👋 Could be nice to fix that before the 1.0.0 release, as it is a big breaking change

fmoletta commented 8 months ago

Hello! I understand the need to have deterministic execution which doesn't change between different architectures but I don't think changing the type of the address' offsets to u64 will fully solve this. The memory structure of the VM is made up of vectors that each represent a memory segment, the maximum capacity of a memory segment is given by usize::MAX, therefore depending on the architecture on which the VM is run. Even if addresses are changed to represent offsets up to u64::MAX, the memory they address is still constrained to usize::MAX. A program which fails to run on a 32bit architecture due to exceeding the addressable memory will still fail after this change and it will still run correctly on a 64bit architecture

tdelabro commented 8 months ago

the maximum capacity of a memory segment is given by usize::MAX

Yeah that sound like a bad design. It should be a u64 or a u128 or whatever. But it should be defined by the starknet spec, not by the machine the program runs on. Otherwise the virtual machine is not that virtual, it end up being tied up to the actual hardware.

When are you planning to solve all of this @fmoletta? As I say, it could be great to have it before 1.0.0 release

fmoletta commented 8 months ago

the maximum capacity of a memory segment is given by usize::MAX

Yeah that sound like a bad design. It should be a u64 or a u128 or whatever. But it should be defined by the starknet spec, not by the machine the program runs on. Otherwise the virtual machine is not that virtual, it end up being tied up to the actual hardware.

When are you planning to solve all of this @fmoletta? As I say, it could be great to have it before 1.0.0 release

What I meant by that line is that the maximum capacity of a vector in rust is usize::MAX, therefore the maximum capacity for a memory segment is usize::MAX. It's not something we can change without changing the whole structure of the memory

tdelabro commented 8 months ago

Well, that is a bad design. The max size of a segment should be decided by the VM specs. Not by some implementation details which makes it depend on the hardware it is running on.

Once you decide that the max size of a segment is x (maybe 2^251-1, or something else), you implement it in a way that makes it respect the specs, regardless of the hardware. Not the other way around.

tdelabro commented 8 months ago

Regardless, this is a very specific edge case that most likely will never happen in an actual run. Removing usize from everywhere else is still something relevant