0xPolygonMiden / miden-vm

STARK-based virtual machine
MIT License
612 stars 150 forks source link

u32and should accept err #1300

Open hackaugusto opened 3 months ago

hackaugusto commented 3 months ago

Similar to https://github.com/0xPolygonMiden/miden-vm/issues/1264 , but for the instructions u32and. The VM halts if the operands are not u32, and a error code would be benefitial.

bitwalker commented 3 months ago

There are actually quite a few instructions that can trap for various reasons, and have this issue. One can work around this specific case by using u32assert.err=ERR on the operands at least, but that's definitely a pain.

I think the larger issue of being able to pinpoint where an error/assert was triggered will need to be solved with debug info. Without debug info, even when you can specify an error code, it is still not particularly helpful in quickly finding the cause. With debug info, we can at the very least provide the location in the original source code, and in cases where we have sources available, we can even render the source snippet that is relevant. More importantly, it would work well across all instructions, whether the error is explicitly triggered with an assert, or implicitly triggered by the semantics of a specific instruction.

Maybe we should add a tracking issue for that, where we can work out the details, and link all these issues to that one?

hackaugusto commented 3 months ago

@bitwalker sounds good to me 👍

bobbinth commented 3 months ago

One note: adding error codes to instructions which also support immediate values may be a bit tricky (or at least may require modifying syntax a bit). So, this may conflict a bit with #1301.

bitwalker commented 3 months ago

@bobbinth I think it could be an either or situation:

u32assert.err=ERR
u32and.0xFF

OR

u32and.err=ERR

In other words, validating the sole operand for the immediate variant is trivially done with u32assert.err=ERR, so we probably don't need to support both at the same time.

bobbinth commented 3 months ago

We could do the same for both operands:

u32assert2.err=ERR
u32and

And this will actually be faster than using u32assert as u32assert is just syntactic sugar on top of u32assert2 (so, u32assert2 takes 1 cycle while u32assert takes 3 cycles).

bitwalker commented 3 months ago

Ah right, always forget about u32assert2!

Fumuran commented 1 month ago

@bobbinth @bitwalker So what is our approach? Do we stick with an u32assert2 and bitwise ops without immediate values? Or though it is preferable it will be still good to have immediate values for bitwise operations?

bitwalker commented 1 month ago

IMO if one has to choose, it makes sense to support immediates first, error codes second, since one can use assertions before an op to obtain useful errors and therefore get the best of both worlds. But I think a compromise here is to simple make it an either/or situation, i.e. you can either use immediate operands, or specify an error code, but you can't do both. This lets you choose an option that is most appropriate for the situation.

I do think we could support both at the same time, by requiring the .err=CODE to always precede all immediate operands, so you'd have something like u32and.err=EINVAL.0xFF - it's a bit syntactically noisy though, so I wonder if anyone would prefer writing that versus the slightly more verbose, but less dense version with two instructions.

Thoughts?

Fumuran commented 1 month ago

I agree, I think that an either/or compromise is a good solution for now indeed.

Supporting them both at the same time looks a little overloaded for me, but I can't come up with a better solution. Also I think that an ordering of the immediate value and an error code can be any: AFAIK lalrpop allows to parse them in any order. Motivation for this is that user probably won't remember the correct ordering, so it will be more user-friendly to support both. But it will make parsing more complex, so I don't have a strong preference here.

bobbinth commented 1 month ago

Here are my thoughts:

  1. I don't think we need to support syntax where we specify both an immediate value and an error code.
  2. We could support either/or (i.e., either immediate value or error code), but I'm also not sure the extra complexity is worth the benefit we are getting.
  3. We could also just keep things as they are. This does not add any more complexity to the parser/assembler, and does not make things any more complicated for the user while adding only 1 extra cycle. Specifically:
u32assert2.err=SOME_ERROR
u32and

Is not all that much worse than:

u32and.err=SOME_ERROR

And actually makes the condition for the error a bit more explicit.

Basically, unless someone has a strong preference for the second option above, I'd rather just close this issue.