noir-lang / noir

Noir is a domain specific language for zero knowledge proofs
https://noir-lang.org
Apache License 2.0
821 stars 177 forks source link

incorrect inference from conditional assertion #5344

Open nventuro opened 4 days ago

nventuro commented 4 days ago

Aim

I run into an extremely confusing scenario in which the presence of assert_eq statements seems to put the program in an inconsistent state in which variables are equal to multiple values at the same time, or have different values depending on how they are inspected.

Expected Behavior

The following example may seem a bit convoluted, but it originates from a legitimate use-case in the Aztec codebase, and it seems all of the elements present here are required to trigger the bug.

trait ToField {
    fn to_field(self) -> Field;
}

impl ToField for bool { fn to_field(self) -> Field { self as Field } }

unconstrained fn get_unconstrained_option() -> Option<u32> {
    Option::some(13)
}

unconstrained fn should_i_assert() -> bool {
    false
}

fn get_magical_boolean() -> bool {
    let option = get_unconstrained_option();

    let pre_assert = option.is_some().to_field();

    if should_i_assert() {
        // Note that `should_i_assert` is unconstrained, so Noir should not be able to infer
        // any behavior from the contents of this block. In this case it is actually false, so the
        // assertion below is not even executed (if it did it'd fail since the values are not equal).

        assert_eq(option, Option::some(42)); /// <- this seems to be the trigger for the bug
    }

    // In my testing, the `option` value exhibits weird behavior from this point on, as if it had been mutated
    let post_assert = option.is_some().to_field();

    // The following expression should be true, but I can get it to evaluate to false depending on how I call it
    pre_assert == post_assert
}

#[test]
fn test() {
    let magic = get_magical_boolean();

    // One of these asserts should fail, but both pass.
    assert_eq(magic, true);
    assert_eq(magic, false);
}

As mentioned in the comments, nargo test passes, so magic is both true and false.

Bug

It took me a fair bit of effort to realize that this was a Noir issue, and then to further narrow down the issue to the code sample above. This was quite hard to analyse as it seems the results I get from testing depend on how I inspect the values. I'll share here some of my findings, hoping they might be useful to whomever looks into this.

The trigger for the weird behavior seems to be the assert_eq statement. I don't think this is a red-herring: I did not find a single instance of strange behavior that did not include the assertion.

It looks like the presence of assert_eq somehow confuses Noir when it comes to what the value of the thing being asserted is. Note that in my example option is a some value, I then do assert_eq to a different some value, and yet I later get that is_some is not the same before and after the assert_eq statement. In the example above I only produce a magic boolean that is both true and false, but in testing with extended infrastructure I got Noir to emit oracle calls in which pre_assert is 1 (as expected) and post_assert is 0. This is why I had to use to_field: my logging function only receives fields.

Making the unconstrained functions constrained causes for the issue to not occur, so this must be somehow related.

I also noticed that performing boolean comparison instead of using to_field seemingly causes the issue to go way, but I'm not sure this is entirely true. I only called to_field in order to get logging working (and to produce this failing example), but in my original code this was not present.

Project Impact

Blocker

Nargo Version

nargo version = 0.30.0 noirc version = 0.30.0+5c3772f09e812a05882039ad888089c238f14001 (git version hash: 5c3772f09e812a05882039ad888089c238f14001, is dirty: false)

TomAFrench commented 4 days ago

Spoke in DMs. This is a duplicate of #5202 but with a much nicer reproduction :medal_sports:

nventuro commented 3 days ago

Given that we now have a smaller failing example, perhaps this could be used to look into why the optimization disabled in https://github.com/noir-lang/noir/pull/5240 was faulty so that it can be fixed instead of removed? That'd also increase our confidence in the fix if we understood exactly why things broke.