qmk / qmk_firmware

Open-source keyboard firmware for Atmel AVR and Arm USB families
https://qmk.fm
GNU General Public License v2.0
17.99k stars 38.65k forks source link

[Bug] SPLIT_KEYBOARD doesn't work with Teensy LC and CH_DBG_SYSTEM_STATE_CHECK == TRUE #14821

Open bofh69 opened 2 years ago

bofh69 commented 2 years ago

Describe the Bug

With CH_DBG_SYSTEM_STATE_CHECK defined as TRUE, SPLIT_KEYBOARD doesn't work with Teensy LC with a setup derived from the handwired/onekey/teensy_lc example.

The system seems to work fine fine if I turn off CH_DBG_SYSTEM_STATE_CHECK

I can see in dmesg on my ubuntu box that the usb device is detected, but five seconds later it says "can't add hid device: -32" and it doesn't work.

It seems osalSysLock is somehow called twice before the unlock call is called.

I've copied handwired/onekey/teensy_lc and turned it into a split keyboard setup with two rows, one column. (One row per keyboard half).

I've added:

    SerialConfig s0cfg = {230400};
    void usart_init(void)
    {
        sdStart(&SD1, &s0cfg);
    }

to my keyboard.c file.

By toggling the LED I've managed to track down the hang to when usart_clear is called, it never gets past the osalSysLock call (if I follow all the calls correctly, it is the _dbg_check_lock(); that doesn't return). If I turn off CH_DBG_SYSTEM_STATE_CHECK, then it seems to work, I can at least input both of my keys.

I've also seen that the Hardfault-, MemFault- or BusFault_-Handlers are not called.

My config.h contains:

#define PRODUCT Los Teclados
#define MATRIX_ROWS  2
#define MATRIX_COLS  1
#define MATRIX_COL_PINS { D0 }
#ifdef SPLIT_KEYBOARD
#define MATRIX_ROW_PINS { C3 }
#else
#define MATRIX_ROW_PINS { C3, C4 }
#endif
#define DIODE_DIRECTION COL2ROW
#define UNUSED_PINS
#define SPLIT_HAND_PIN C0
#define SERIAL_USART_FULL_DUPLEX
#define SERIAL_USART_TX_PIN B16
#define SERIAL_USART_RX_PIN B17
#define SELECT_SOFT_SERIAL_SPEED 1
#define SERIAL_USART_DRIVER SD1
#define SERIAL_USART_TX_PAL_MODE 3
#define SERIAL_USART_RX_PAL_MODE 3
#define SERIAL_USART_TIMEOUT 20
#define SERIAL_USB_BUFFERS_SIZE 128 // I32 with the original stack sizes
#define SERIAL_BUFFERS_SIZE 128 // I32 with the original stack sizes.
#define SERIAL_USART_CONFIG { SERIAL_USART_SPEED }

rules.mk:

BOOTMAGIC_ENABLE = no       # Enable Bootmagic Lite
MOUSEKEY_ENABLE = no        # Mouse keys
EXTRAKEY_ENABLE = yes       # Audio control and System control
CONSOLE_ENABLE = yes        # Console for debug
COMMAND_ENABLE = no         # Commands for debug and configuration
SLEEP_LED_ENABLE = no       # Breathing sleep LED during USB suspend
NKRO_ENABLE = yes           # USB Nkey Rollover
BACKLIGHT_ENABLE = no       # Enable keyboard backlight functionality
RGBLIGHT_ENABLE = no        # Enable keyboard RGB underglow
AUDIO_ENABLE = no           # Audio output
SPLIT_KEYBOARD = yes        # Split common
DEFAULT_FOLDER = los_teclados/teensy_lc
SERIAL_DRIVER = usart
LAYOUTS = ortho_2x1

hal_config.h:

#define HAL_USE_SERIAL TRUE

mcu_conf.h:

#define KL2x_MCUCONF
#define KINETIS_MCG_MODE            KINETIS_MCG_MODE_PEE
#define KINETIS_PLLCLK_FREQUENCY    96000000UL
#define KINETIS_SYSCLK_FREQUENCY    48000000UL
#define KINETIS_SERIAL_USE_UART0              TRUE
#define KINETIS_USB_USE_USB0                  TRUE
#define KINETIS_USB_USB0_IRQ_PRIORITY         2
#define KINETIS_I2C_USE_I2C0                FALSE
#define KINETIS_I2C_I2C0_PRIORITY           4

chconf.h is identical to handwired/onekey/teensy_lc's.

the other rules.mk file is:

MCU = MKL26Z64
USE_CHIBIOS_CONTRIB = yes
OPT_DEFS += -DCORTEX_ENABLE_WFI_IDLE=TRUE

I've also created a new ortho_1x2 layout with two rows, one column.

I've decreased the stack use by some threads (the same error happens with smaller USB/SERIAL buffers and the original sizes of the stacks too): THD_WORKING_AREA(waSlaveThread, 512); THD_WORKING_AREA(serialThreadStack, 512); USE_PROCESS_STACKSIZE = 0x700

System Information

Additional Context

bofh69 commented 2 years ago

Digging more I found this:

The TRANSACTION_HANDLER_MASTER/TRANSACTION_HANDLER_SLAVE macros uses ATOMIC_BLOCK_FORCEON when calling their handler functions. ATOMIC_BLOCK_FORCEON uses chSysLock() to implement the locking, thus leading to the second locking in the usart_clear code.

Just removing osalSysLock() from usart_clear() and adding it around the calls to usart_clear() in SlaveThread isn't enough though to fix the original problem. The code hangs somewhere else instead.

I think this needs to be investigated on a devboard with a debug probe so one can get proper stack traces.

The proper solution is probably to use reentrant locks instead. People always make these misstakes with non-reentrant locks. That is probably a harder fix though.