Differences in decoded memory traces in the integration tests

MaksymMalicki commented 1 month ago

The problem: Currently, in the integration tests we are comparing decoded memory and execution traces of Nethermind Cairo-VM with the Starkware Cairo-VM implemented in Python. When we try to run the following Cairo0 programs as examples in Nethermind VM and Starkware VM with layout small, we can notice the differences in decoded memory traces, which are prompted by the integration tests.

%builtins range_check
from starkware.cairo.common.math import assert_le_felt

func main{range_check_ptr: felt}() {
    assert_le_felt(1, 2);
    assert_le_felt(-2, -1);
    assert_le_felt(2, -1);
    return ();
}

%builtins ecdsa
from starkware.cairo.common.cairo_builtins import SignatureBuiltin
from starkware.cairo.common.signature import verify_ecdsa_signature

func main{ecdsa_ptr: SignatureBuiltin*}() {
    verify_ecdsa_signature(
        2718,
        1735102664668487605176656616876767369909409133946409161569774794110049207117,
        3086480810278599376317923499561306189851900463386393948998357832163236918254,
        598673427589502599949712887611119751108407514580626464031881322743364689811,
    );
    return ();
}

After conducting some digging in the code of both VMs, I noticed, that the segments offsets, which are significant in memory relocation differ between the VMs. When running the following program:

%builtins range_check
from starkware.cairo.common.find_element import find_element
from starkware.cairo.common.alloc import alloc

func main{range_check_ptr}() -> () {
    alloc_locals;
    let (local array_ptr: felt*) = alloc();
    assert array_ptr[0] = 1;

    let (element_ptr: felt*) = find_element(
        array_ptr=array_ptr, elm_size=1, n_elms=1, key=1
    );
    assert [element_ptr] = 1;

    return ();
}

We will notice, that the segments offsets for relocation look like this:

Nethermind VM: [1 65 108 110 111]
Python VM: {0: 1, 1: 65, 2: 108, 3: 108, 4: 1644, 5: 2156, 6: 2172}

Eventhough the sizes of the segments look like this:

Nethermind VM (with builtin runner names):

segment 0 64 no builtin
segment 1 43 no builtin
segment 2 2 range_check
segment 3 1 no builtin

Python VM: (0, 64), (1, 43), (2, 0), (3, 0), (4, 2), (5, 0), (6, 1)], builtin runners keys: {'output_builtin', 'pedersen_builtin', 'range_check_builtin', 'ecdsa_builtin'}

Since the addresses during relocation are transformed into felts, by adding the address offset to the segment base (offset), this should indeed cause the differences in the values of relocated addresses. The differences in the relocated addresses values look like this:

index	python vm	nethermind vm
66	(67, 1644)	(67, 108)
69	(70, 2172)	(70, 110)
72	(73, 2172)	(73, 110)
74	(75, 1644)	(75, 108)
75	(76, 2172)	(76, 110)
82	(83, 1644)	(83, 108)
87	(88, 1644)	(88, 108)
91	(92, 1645)	(92, 109)
96	(97, 1645)	(97, 109)
100	(101, 1646)	(101, 110)
102	(103, 2172)	(103, 110)
103	(104, 1646)	(104, 110)
104	(105, 2172)	(105, 110)
106	(107, 1646)	(107, 110)
107	(1644, 0)	(108, 110)
108	(1645, 0)	(109, 110)
109	(2172, 1)	(110, 110)

During decoding the memory in Nethermind VM, we take the highest memory address as length of the memory. As one can see in the example, the highest memory addresses are not the same, thus the effect of decoded memory filled with nills. From this short analysis, the difference in the memory traces is caused by two problems:

In our integration tests we support only two builtins layouts, "small" (pedersen, range_check, ecdsa) and "plain" (default, no builtins) for running the python-vm. Starkware Cairo VM supports different layouts, with different builtins sets. They are all specified here. As presented in the examples before, all the segments for the builtins, specified in the Python VM layout get allocated, and their segments offsets are calculated as well, even when those segments are empty. In Nethermind VM we include the builtins specified in the builtins field of the compiled program json. This might cause differences in the memory traces, as some segments from given layout might not be present in the builtins field, which will affect the segments offsets. To conclude with a question, should we support layout flags as well?
In Python VM the proposed solution for calculating the final segments offsets for relocation is Finalization. This method produces the relocation offsets for program segment, execution segments and builtins segments. The builtins segments offsets are calculated using this method, which utilize informations about builtin like: ratio (ratio between number of steps and the number of builtins instances), instances_per_component (the number of invocations being handled in each call to the corresponding component), as well as information about the runtime (current step). Currently in Nethermind VM we use the actual sizes of the segments during relocation, for builtins segments as well. This difference in size of the memory traces and the values of relocated addresses is caused mostly by that.

MaksymMalicki commented 1 month ago

Hi @Osatuyi, this issue is currently just a writeup and requires further discussion.

rodrigo-pino commented 1 month ago

Hey @MaksymMalicki thanks for gathering all these information! I have some questions, hope you can answer.

End run What it does - runs, until sufficient number of vm steps have been executed, to allocate needed number of memory cells for the builtins segments.

I am not sure I understand, running steps for the sake of running steps won't allocate extra slots in the builtins as far as I know. Once you hit the end of program running extra steps won't generate any changes on the pc. Also, the runner already runs to the power of 2 when in proof mode.

Helper methods that need to be implemented: check_used_cells check_range_check_usage check_memory_usage check_diluted_check_usage

I am uneasy that you know with such clarity what methods are already needed before writing a solution. Are you by default saying that we should use the Python implementation? If it is the best way to do it for our project, then great, but I am unsure that you've already went through that line of thought.

I think before saying what's missing, we should now why is it missing. Why is it required by the VM, and then how do we solve it. I believe there are subtle but important differences there.

Read return values

Same thing here. We dont need "read return values", I have no idea what does that means. From what I gather by reading the definition, we need to include in the memory trace a pointer to each builtin segment last accessed/allocated value. I dunno this explanation is correct, but at least there was an attempt to understand the VM..

If you don't understand what's missing then it makes sense to just use the python solution and not thinking too much about it. If that's the goal of this VM, then great, but then again, all VMs already do that... .

They need to contain this following information which is currently not present: ratio, cells per instance, instances per component. Maybe turning the BuiltinRunner from interface to datastruct would help.

What is ration, cells per instance and instance per component? Could you expand on what each of those fields represent.

I will stop signalling other stuff. I agree with the diagnosis, I disagree with everything else. All the solutions suggested seems to follow the thought of "We will be implementing it like this, because someone 3 years ago implemented it like this". Not taking into account the current context, the difference between languages and the difference in architecture that both projects have.

Again, that's not a bad approach but I think it is not the correct one, unless the goal is making a mediocre product. And again, if that's the best solution let's implement it that way, but first let's prove that's the best solution and if not then let's create it ourselves. I think that's the right approach.

MaksymMalicki commented 1 month ago

Hi @rodrigo-pino, thanks for the feedback. I rewrote the issue context, focusing more on the diagnosis of the problem and what causes the differences. I agree that the previous version was very suggestive and written with the approach influenced by the existing solutions.

NethermindEth / cairo-vm-go

Differences in decoded memory traces in the integration tests #422