add Branch Bit Set Instruction to Fast Interrupt Repertoire

Critical code sections in interrupt routines operate under various constraints to provide optimal response and throughput:

fewest saved registers (to stack and xscratches)
fewest active registers (to reduce preemption overhead)
fewest instructions

Testing bits in such an environment is expensive in both register usage and instruction count.

The rotate instruction from the bit manipulation TG could be used as a non-destructive way to move the desired bit into the sign position and then branch on a signed comparison with zero. But it requires two instructions to return the bits to normal, which may require disabling interrupts for the duration.

A single Branch Bit Set avoids these overheads.
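To make the comparison concrete, here is a minimal C sketch of the two approaches for RV32. The helper names (rol32, ror32, bit_set_via_rotate, bit_set_direct) are hypothetical and only model the register effects; they are not part of any proposal text.

```c
#include <stdint.h>

/* Rotate helpers standing in for the bit manipulation rotate instructions. */
static uint32_t rol32(uint32_t x, unsigned n)
{
    n &= 31u;
    return n ? (x << n) | (x >> (32u - n)) : x;
}

static uint32_t ror32(uint32_t x, unsigned n)
{
    n &= 31u;
    return n ? (x >> n) | (x << (32u - n)) : x;
}

/* Rotate approach: move the chosen bit to the sign position, test the sign
 * (what a signed branch against zero would see), then rotate back.
 * Between the two rotates the register holds a scrambled value. */
static int bit_set_via_rotate(uint32_t *reg, unsigned bit)
{
    unsigned n = 31u - (bit & 31u);
    *reg = rol32(*reg, n);                    /* bit now in position 31 */
    int taken = (*reg & 0x80000000u) != 0;    /* sign bit set */
    *reg = ror32(*reg, n);                    /* restore original value */
    return taken;
}

/* Proposed Branch Bit Set condition: the register is never modified. */
static int bit_set_direct(uint32_t reg, unsigned bit)
{
    return (int)((reg >> (bit & 31u)) & 1u);
}
```

The rotate version leaves the register in a modified state between the first rotate and the restore, which is the window the text above is concerned about; the direct test has no such window.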

The suggested formulation uses a brownfield minor opcode within the branch opcode, with these fields:

a positive 8-bit offset from the pc in 16-bit parcels, held in bits 8-11 and 25-28; this would provide a forward PC-relative branch range of 512 bytes (offsets 0 to 510)
rs1 (5 bits, any of the 31 registers) as the register to be tested, and
a 5-bit immediate in the rs2 field to select the bit to test (zero is the least significant bit, counting consecutively from there)

This would use the same decoding as the branch instructions:

opcode=1100011
funct3=011
rs2 field holds the bit-selector immediate
the remaining bits (31-29 and 7) are zero
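The field layout above can be sketched as a small C encoder. This assumes the low four offset bits go in bits 8-11 and the high four in bits 25-28, matching where the standard branch format places imm[4:1] and imm[8:5]; the function names bbs_encode and bbs_offset are hypothetical.

```c
#include <stdint.h>

/* Hypothetical encoder for the proposed Branch Bit Set instruction.
 * byte_offset is a forward offset in bytes (even, 0..510). */
static uint32_t bbs_encode(unsigned rs1, unsigned bit, unsigned byte_offset)
{
    unsigned hw = (byte_offset >> 1) & 0xFFu;    /* offset in 16-bit units */
    return 0x63u                                 /* opcode 1100011 */
         | ((hw & 0xFu) << 8)                    /* bits 11:8, low offset nibble */
         | (0x3u << 12)                          /* funct3 = 011 */
         | ((rs1 & 0x1Fu) << 15)                 /* register to test */
         | ((bit & 0x1Fu) << 20)                 /* bit selector in the rs2 field */
         | (((hw >> 4) & 0xFu) << 25);           /* bits 28:25, high offset nibble */
    /* bits 31:29 and bit 7 are left zero, as the proposal requires */
}

/* Recover the forward byte offset from an encoded instruction. */
static unsigned bbs_offset(uint32_t insn)
{
    unsigned hw = ((insn >> 8) & 0xFu) | (((insn >> 25) & 0xFu) << 4);
    return hw << 1;
}
```

For example, testing bit 0 of x10 with a branch 8 bytes forward would encode as 0x00053463 under these assumptions.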

In RV32 the value 31 is redundant with a signed comparison against zero, but it is more useful in RV64. For RV64 we could consider a 6-bit variant, perhaps incorporating bit 7.

An 8-bit offset in 16-bit parcels is sufficient for most fast interrupt handler code, which is compact by nature. All 31 registers may not be needed. sp and ra (x2 and x1) are heavily used, but so is the compressed register set (x8 through x15), as the use of compressed instructions helps with code locality relative to cache size and cache line.

The lower bits are in general more valuable to test because: 1) the low bit in code addresses can be used as a flag, ignored by xret and jumps (jalr a possible exception). 2) Most CSRs map their significant bits to the low bits, partly so that the 5-bit immediate set/clear instructions are useful. 3) The most significant bit is already directly testable with the signed branch. 4) Rarely are more than a few bits needed in interrupt handlers, as the complexity of the code is limited; the number of branch variations doubles with each additional bit.

Because this encoding stipulates that the remaining bits are zero, they are effectively reserved, potentially for multi-bit test variants.

There is a tipping point at which vectoring on the "embedded" value is more effective than single-bit checks (e.g. branch tables).

Thus, testing sets of bits may be more valuable than being able to test them all. However, it is valuable to have better decoding than branching on the low bits ANDed with an immediate test field, i.e. testing that any or all of the bits are set is a low-frequency check for control transfer. Testing for all but one bit set is similarly of low value.
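As a rough illustration of the tipping point between single-bit checks and vectoring, the C sketch below dispatches on two low status bits both ways. The handlers h0 to h3 and the two-bit status field are hypothetical stand-ins. With k bits, the bit-test tree needs up to k branches per path, while the table needs one indexed jump regardless of k.

```c
#include <stdint.h>

typedef int (*handler_fn)(void);

/* Hypothetical handlers, one per combination of the two low bits. */
static int h0(void) { return 0; }
static int h1(void) { return 1; }
static int h2(void) { return 2; }
static int h3(void) { return 3; }

static const handler_fn table[4] = { h0, h1, h2, h3 };

/* Vectoring: index a branch table with the embedded value. */
static int dispatch_by_table(uint32_t status)
{
    return table[status & 3u]();
}

/* Single-bit checks: each "if" models one Branch Bit Set. */
static int dispatch_by_bit_tests(uint32_t status)
{
    if (status & 2u) {
        if (status & 1u)
            return h3();
        return h2();
    }
    if (status & 1u)
        return h1();
    return h0();
}
```

For one or two bits the bit-test tree is likely cheaper, since it avoids the table load and indirect jump; beyond that the table tends to win, though the exact crossover depends on the implementation.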

David-Horner / text-format

add Branch Bit Set Instruction to Fast Interrupt Repertoire #14