tmk / tmk_keyboard

Keyboard firmwares for Atmel AVR and Cortex-M

IBMPC: Read pin state earlier with Naked ISR #717

Open tmk opened 2 years ago

tmk commented 2 years ago

For the IBM XT keyboard we have to read the data line as early as possible, within 5us.

https://github.com/tmk/tmk_keyboard/wiki/IBM-PC-XT-Keyboard-Protocol#note-for-start0 https://github.com/tmk/tmk_keyboard/wiki/IBM-PC-XT-Keyboard-Protocol#isr-prologue

With a normal ISR, the prologue (pushing registers) is unavoidable and can consume tens of clock cycles before the data pin is read. We have to read the pin before the prologue, especially when the ISR uses many registers and the prologue is long.

To circumvent the ISR prologue we can use a naked ISR, for which the compiler generates no prologue. But in a naked ISR there is no general-purpose register (r0-r31) in which the pin state can be stored safely.

One possible workaround is to store the data pin state (PIND) temporarily in an unused IO register (0x01) instead of a general-purpose register. The ATmega32U2/U4 has no PORTA, but 0x01 (DDRA) seems to be usable for this purpose. DDRE is another candidate.

// IO address to store pin state temporarily (DDRA: does not exist on 32u2 and 32u4)
#define STORED_PIN    _SFR_IO8(0x01)

ISR(IBMPC_INT_VECT, ISR_NAKED)
{
    asm volatile (
        "push   r0"                     "\n\t"
        "in     r0,     %[pin]"         "\n\t"
        "out    %[sto], r0"             "\n\t"
        "pop    r0"                     "\n\t"
        "rjmp   ibmpc_isr"              "\n\t"
        :
        : [pin] "I" (_SFR_IO_ADDR(PIND)),
          [sto] "I" (_SFR_IO_ADDR(STORED_PIN))
    );
}

// define normal ISR
extern "C" void ibmpc_isr(void) __attribute__ ((signal,__INTR_ATTRS));
void ibmpc_isr(void)
{
    ...
    if (STORED_PIN & data_mask) ...
    ...
}

This naked ISR can read the pin in 10 cycles: 5 (interrupt) + 3 (jmp in vector table) + 2 (push r0). One cycle is 62.5ns at 16MHz, so 62.5ns * 10 = 625ns.

ATmega32U4 datasheet:

4.8.1 Interrupt Response Time

The interrupt execution response for all the enabled AVR interrupts is five clock cycles minimum. After five clock cycles the program vector address for the actual interrupt handling routine is executed. During these five clock cycle period, the Program Counter is pushed onto the Stack. The vector is normally a jump to the interrupt routine, and this jump takes three clock cycles. If an interrupt occurs during execution of a multi-cycle instruction, this instruction is completed before the interrupt is served. If an interrupt occurs when the MCU is in sleep mode, the interrupt execution response time is increased by five clock cycles. This increase comes in addition to the start-up time from the selected sleep mode.

A return from an interrupt handling routine takes five clock cycles. During these five clock cycles, the Program Counter (three bytes) is popped back from the Stack, the Stack Pointer is incremented by three, and the I-bit in SREG is set.

https://github.com/tmk/tmk_keyboard/tree/ibmpc_naked_isr_io_reg

tmk commented 2 years ago

Another workaround would be a fixed register. https://gcc.gnu.org/wiki/avr-gcc#Fixed_Registers https://gcc.gnu.org/onlinedocs/gcc/Global-Register-Variables.html#Global-Register-Variables

GCC option -ffixed-<reg> is required to define a fixed register. https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html

Note that the option doesn't affect precompiled libraries, which may already use the fixed register. In fact avr-libc can use all registers; for example, r2 is referenced in some functions that we don't need to link at this time, such as qsort, longjmp, setjmp, strftime and vfprintf. This will cause problems if those functions are used. In that case we have to save and restore the register.

// register to store pin state temporarily
volatile register uint8_t STORED_PIN    asm("r2");

ISR(IBMPC_INT_VECT, ISR_NAKED)
{
    asm volatile (
        "in     r2,     %[pin]"         "\n\t"
        "rjmp   ibmpc_isr"              "\n\t"
        :
        : [pin] "I" (_SFR_IO_ADDR(PIND))
    );
}

// define normal ISR
extern "C" void ibmpc_isr(void) __attribute__ ((signal,__INTR_ATTRS));
void ibmpc_isr(void)
{
    ...
    if (STORED_PIN & data_mask) ...
    ...
}

Makefile:

EXTRAFLAGS ?= -ffixed-r2

This naked ISR can read the pin in 8 cycles: 5 (interrupt) + 3 (jmp in vector table). 62.5ns * 8 = 500ns

https://github.com/tmk/tmk_keyboard/tree/ibmpc_naked_isr_fixed_reg