Provide overflow-detecting arithmetic instructions?

One of the biggest shortcomings of RISC-V is the lack of overflow-checked arithmetic, which is essential for reliable software and is frequently used in Rust, Python and other languages. It can of course be emulated with sequences of multiple instructions, but that is quite inefficient, since overflow detection is desirable on pretty much all arithmetic instructions.
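To make the cost of emulation concrete, here is a minimal C sketch of the usual software checks. The signed version follows the add/slti/slt/bne pattern that the RISC-V ISA manual suggests for general signed addition (the sum is less than one operand exactly when the other operand is negative); the unsigned version is the add/bltu pattern. The function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Signed 32-bit add with overflow detection, without compiler builtins.
 * Overflow happened exactly when "b is negative" disagrees with
 * "the wrapped sum is less than a". */
static bool add_overflows_i32(int32_t a, int32_t b, int32_t *sum)
{
    /* Add in unsigned arithmetic so that wraparound is well defined. */
    uint32_t wrapped = (uint32_t)a + (uint32_t)b;

    /* Converting back is implementation-defined before C23, but it wraps
     * on the mainstream two's-complement compilers. */
    *sum = (int32_t)wrapped;

    return (b < 0) != (*sum < a);
}

/* Unsigned 32-bit add: the sum wrapped iff it is smaller than an operand. */
static bool add_overflows_u32(uint32_t a, uint32_t b, uint32_t *sum)
{
    *sum = a + b;
    return *sum < a;
}
```

Each checked signed addition therefore costs an add, two compares and a branch, which is the overhead the proposed instructions would aim to remove.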

The bitmanip extension could be a vehicle to add these instructions, since it already seems to carry several "exotic" arithmetic variants.

A potential design would involve adding variants of arithmetic instructions, with the following added fields:

1 bit: detect signed overflow, or a quotient that overflows INT_MAX (INT_MIN / -1) in the case of division
1 bit: detect unsigned overflow, or division by zero in the case of division
1-3 bits: action on overflow/non-overflow
- Call the custom saved handler set in the overflow3 CSR (modules would be expected to save and restore the handler when entering their code and when calling other modules)
- Set pc += 2 on non-overflow (lets an arbitrary compressed instruction be supplied as the overflow action)
- Call the OS-provided handler set in the overflow1 CSR
- Set pc += 4 on non-overflow (lets an arbitrary 32-bit instruction be supplied as the overflow action)
- Call the application-provided handler set in the overflow2 CSR (only supposed to be used by the main executable)
- Set pc += 8 on non-overflow (lets two arbitrary instructions be supplied as the overflow action)
- Call the temporary handler set in the overflow4 CSR (this would be freely overwritable by any non-OS code)
- Increment the fixed overflow_count CSR on overflow (this CSR would be freely overwritable by anyone)
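As a rough illustration of how the two detect bits and the action field might interact, here is a small software model in C. Everything in it is hypothetical: the `hart_model` struct, the `CHECK_*` bits and the two modelled actions are stand-ins invented for this sketch, not a proposed encoding. The pc-skipping modes are not modelled, since they only make sense at the instruction-stream level.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the two "detect" bits of the instruction. */
enum { CHECK_SIGNED = 1u << 0, CHECK_UNSIGNED = 1u << 1 };

/* Hypothetical subset of the action field. */
typedef enum { ACTION_COUNT, ACTION_CALL_HANDLER } overflow_action;

/* Stand-in for the per-mode CSR state described above. */
struct hart_model {
    uint64_t overflow_count;               /* models the overflow_count CSR */
    void (*handler)(struct hart_model *);  /* models one overflowN CSR      */
    unsigned handler_calls;                /* bookkeeping for the example   */
};

/* Example handler: just records that it was invoked. */
static void count_handler_call(struct hart_model *h)
{
    h->handler_calls++;
}

/* Model of a checked 32-bit add: the wrapped sum is always produced, and
 * the action fires only if an enabled kind of overflow occurred. */
static uint32_t checked_add32(struct hart_model *h, uint32_t a, uint32_t b,
                              unsigned checks, overflow_action act)
{
    uint32_t sum = a + b;

    /* Unsigned overflow: the sum wrapped below an operand. */
    bool unsigned_ovf = sum < a;

    /* Signed overflow: operands share a sign that the result lacks. */
    bool signed_ovf = ((~(a ^ b) & (a ^ sum)) >> 31) & 1u;

    bool hit = ((checks & CHECK_SIGNED) && signed_ovf) ||
               ((checks & CHECK_UNSIGNED) && unsigned_ovf);

    if (hit) {
        if (act == ACTION_COUNT)
            h->overflow_count++;
        else if (h->handler)
            h->handler(h);
    }
    return sum;
}
```

For example, adding 1 to 0x7fffffff with only `CHECK_SIGNED` enabled triggers the action, while the same addition with only `CHECK_UNSIGNED` enabled does not.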

There would be one set of these CSRs for each privilege mode (user, supervisor, etc.), and each set would be writable in its own mode. The temporary/application/library/OS distinction would be merely an ABI convention, and the "OS" would usually be the libc or dynamic loader on POSIX systems.

If enough coding space is available (e.g. if 48-bit instructions are used), then the design can be amended like this:

Provide a single pc += X mode with an immediate shifted left by 1
Collapse the CSR modes into one with a CSR number field
Provide an extra "jump to a handler held in a GPR" mode, with a register number field
Provide an extra "increment a chosen GPR" mode, with a register number field
Provide an extra "set a flag bit in a GPR" mode, with register number and flag number fields

A similar feature could be provided for floating-point instructions too, with a bit for each of the conditions of the IEEE exceptions:

Invalid operation, NaN generated
Division of nonzero by zero
Exponent overflow, infinity or max generated
Exponent underflow, zero or subnormal generated
Inexact result

Separate CSRs could be provided for floating point operations. If there is enough coding space, underflow and/or inexact result could be given an additional set of action fields.

The OS-provided overflow handler should raise the overflow as an exception using the platform's exception-raising mechanism (e.g. DWARF EH or SEH).

A more complicated design is possible, but this might be enough.

Most programming languages that need overflow checking raise an exception when an overflow happens.

One method for checking overflow that could work for those languages would be to use saturating-arithmetic instructions that set a cumulative status flag in a CSR whenever the value saturates (i.e. whenever it would have overflowed). You would clear the flag before the instructions that could set it (write x0 to the CSR), and test the flag (read the CSR into a GPR, then branch conditionally) before any result could cause a side effect (conditional branch, store, etc.). This is similar to how QNaNs are propagated by arithmetic instructions but raise exceptions in the instructions that have side effects. A proper hardware implementation like this for integers is patented, however, and couldn't be added to RISC-V until that patent expires in 2035.
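The clear/compute/test pattern can be sketched in plain C as follows. This is only a software model under stated assumptions: `sat_flag` is a hypothetical stand-in for the cumulative CSR bit, and `sat_add_i32` models what a saturating add instruction would do in hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the cumulative (sticky) saturation CSR bit. */
static bool sat_flag;

/* Model of a saturating signed add that sets the sticky flag on clamp. */
static int32_t sat_add_i32(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a + (int64_t)b;
    if (wide > INT32_MAX) { sat_flag = true; return INT32_MAX; }
    if (wide < INT32_MIN) { sat_flag = true; return INT32_MIN; }
    return (int32_t)wide;
}

/* Sum an array, refusing to store a result if anything saturated. */
static bool checked_sum(const int32_t *v, int n, int32_t *out)
{
    sat_flag = false;                 /* clear first (models csrw ..., x0) */

    int32_t acc = 0;
    for (int i = 0; i < n; i++)
        acc = sat_add_i32(acc, v[i]); /* no check between operations */

    if (sat_flag)                     /* one test before the side effect */
        return false;

    *out = acc;                       /* the store happens only if clean */
    return true;
}
```

Because the flag is sticky, a later operand that brings the accumulator back into range does not hide an earlier overflow; the single test at the end still catches it.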

There are many signed and unsigned scalar add, subtract, absolute value, multiply and shift instructions in the P-proposal. It also includes a CSR with the cumulative overflow flag. However, there have been discussions to reduce the extension to its SIMD core, omitting these.

Instructions for saturating signed and unsigned addition and subtraction are already in the Vector extension, but they are limited:

They saturate only at the current SEW. If you want to use smaller widths, I think you'd have to shift the inputs left beforehand and then shift the results right afterwards.
There is only a single bit in a CSR, for all vector lanes. It would have been more useful to get a predicate vector that could have been smeared (vmsbf.m) and then used for masked write-back/store. (Compare to how a compiler would vectorise conditional return/break/continue).
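Both points can be illustrated with a scalar C model. The first function sketches the shift trick for a narrower width: an 8-bit saturating add built from a 32-bit one by moving the operands into the top byte. The second models the "set before first" behaviour of vmsbf.m on a per-lane overflow predicate, which is what would let a masked store keep only the lanes before the first overflow. The function names are invented for this sketch, and multiplication and division by 2^24 stand in for the left shift and arithmetic right shift so that the C stays fully defined for negative values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of a 32-bit saturating signed add (the "current SEW" operation). */
static int32_t sat_add_i32(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a + (int64_t)b;
    if (wide > INT32_MAX) return INT32_MAX;
    if (wide < INT32_MIN) return INT32_MIN;
    return (int32_t)wide;
}

/* 8-bit saturating add via the 32-bit one: scale the inputs up by 2^24
 * (a left shift by 24 in hardware), add, then scale back down
 * (an arithmetic right shift by 24 in hardware). */
static int8_t sat_add8_via32(int8_t a, int8_t b)
{
    const int32_t scale = 1 << 24;
    int32_t wide = sat_add_i32((int32_t)a * scale, (int32_t)b * scale);
    return (int8_t)(wide / scale);
}

/* Model of vmsbf.m: out[i] is true for every lane strictly before the
 * first lane whose input bit is set, and false from that lane onwards. */
static void set_before_first(const bool *in, bool *out, int n)
{
    bool seen = false;
    for (int i = 0; i < n; i++) {
        seen = seen || in[i];
        out[i] = !seen;
    }
}
```

With a per-lane overflow predicate as `in`, the resulting `out` mask selects exactly the lanes that are safe to write back, much like a scalar loop that breaks at the first overflow.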
