raspberrypi / linux

Kernel source tree for Raspberry Pi-provided kernel builds. Issues unrelated to the linux kernel should be posted on the community forum at https://forums.raspberrypi.com/
Other
11.05k stars 4.96k forks source link

RPI5 UART not working after reboot/ upgrade of Kernel #6374

Closed idulkoan closed 3 days ago

idulkoan commented 4 days ago

Describe the bug

After updating my Raspberry Pi 5 to kernel version 6.6.51-v8-16k+ #1 SMP PREEMPT, the UART communication between the Pi and the BigTreeTech SKR Mini E3 V2 board over GPIO14 (TX) and GPIO15 (RX) pins (pins 08 and 10) has become unreliable. It generally does not connect at all. The printer cannot connect via UART, leading to the error mcu 'mcu': Unable to connect. A UART loopback test using a Python script consistently fails, indicating an underlying communication issue with the UART.

Linux Ender5Pro 6.6.51-v8-16k+ #1 SMP PREEMPT Wed Sep 18 16:24:48 UTC 2024 aarch64 GNU/Linux

###### Identifying MCU possibly connected via UART ...

 ● MCU #1: /dev/ttyS0
 ● MCU #2: /dev/ttyAMA0
mcu 'mcu': Unable to connect
Once the underlying issue is corrected, use the
"FIRMWARE_RESTART" command to reset the firmware, reload the
config, and restart the host software.

Steps to reproduce the behaviour

Interestingly, after using sudo rpi-update pulls/6371 and restarting, I notice that with this version, If I let it sit for a long time, I eventually do get text on the printer screen randomly and have had a very brief window where I was able to start the homing process, but then it just dies completely. Its very random, and has only happened like 2-3 times in my testing.

I have created this python script:

pi@Ender5Pro:~/Downloads $ cat uart_test.py
import serial
import time

# Configure the serial port
port = '/dev/ttyAMA0'
baudrate = 250000
timeout = 1  # seconds

# Initialize the serial connection
ser = serial.Serial(port, baudrate, timeout=timeout)

# Function to write and read data
def test_uart():
    test_message = "Hello, UART Loopback!"
    ser.write(test_message.encode())  # Send data
    time.sleep(1)  # Wait for a short moment
    ser.flushInput()  # Clear the input buffer
    response = ser.read(len(test_message)).decode()  # Read the response

    if response == test_message:
        print("UART loopback test passed!")
    else:
        print("UART loopback test failed.")
        print(f"Sent: {test_message}")
        print(f"Received: {response}")

# Run the test
if __name__ == "__main__":
    test_uart()

# Close the serial port
ser.close()

To help test UART loopback after shorting pins 8 and 10 (TX RX) , But never had a successful result from the test everytime I have tried.

I have also tried

sudo screen /dev/ttyAMA0 250000
sudo picocom -b 250000 /dev/ttyAMA0
sudo minicom -b 250000 -o -D /dev/ttyAMA0

But have been unable to connect or get loopback, which indicates that the RPI5 is the issue here , maybe due to kernel upgrade. I have not noted what my previous kernel was.

Device (s)

Raspberry Pi 5

System

pi@Ender5Pro:~/Downloads $ cat /etc/rpi-issue
Raspberry Pi reference 2023-12-05
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, 70cd6f2a1e34d07f5cba7047aea5b92457372e05, stage4
pi@Ender5Pro:~/Downloads $ vcgencmd version
2024/09/10 14:40:30
Copyright (c) 2012 Broadcom
version 5be4f304 (release) (embedded)
pi@Ender5Pro:~/Downloads $ uname -a
Linux Ender5Pro 6.6.51-v8-16k+ #1 SMP PREEMPT Wed Sep 18 16:24:48 UTC 2024 aarch64 GNU/Linux

Logs

No response

Additional context

No response

popcornmix commented 4 days ago

Duplicate of https://github.com/raspberrypi/linux/issues/6365

The fix is here: https://github.com/raspberrypi/linux/pull/6371

and is available in rpi-update kernel.

idulkoan commented 4 days ago

As mentioned, sudo rpi-update pulls/6371 Did not fix the issue for me

pelwell commented 3 days ago

There is a different potential fix for the same problem that you can try:

sudo rpi-update pulls/6377
idulkoan commented 3 days ago

First of all, thanks for your time and response. Curious as to what has changed in the recent updates.

I have updated and rebooted:

sudo rpi-update pulls/6377
sudo reboot

For the first boot, briefly the printer was working (Text on LCD screen, just about able to click home on the fluidd dashboard) Then, again, serial connection lost. Unable to reconnect, even after rebooting

Here are the moonraker logs:

2024-09-19 21:50:48,107 [extensions.py:start_unix_server()] - Creating Unix Domain Socket at '/home/pi/printer_data/comms/moonraker.sock'
2024-09-19 21:50:48,108 [base_deploy.py:log_info()] - PackageKit: Next refresh in: 6 Days, 14 Hours, 21 Minutes, 1 Second
2024-09-19 21:50:48,108 [base_deploy.py:log_info()] - Git Repo moonraker: Next refresh in: 6 Days, 14 Hours, 21 Minutes, 1 Second
2024-09-19 21:50:48,108 [base_deploy.py:log_info()] - Git Repo klipper: Next refresh in: 6 Days, 14 Hours, 21 Minutes, 2 Seconds
2024-09-19 21:50:48,108 [base_deploy.py:log_info()] - Web Client fluidd: Next refresh in: 6 Days, 14 Hours, 21 Minutes, 2 Seconds
2024-09-19 21:50:48,108 [base_deploy.py:log_info()] - Git Repo crowsnest: Next refresh in: 6 Days, 14 Hours, 21 Minutes, 2 Seconds
2024-09-19 21:50:48,108 [base_deploy.py:log_info()] - Git Repo pgcode: Next refresh in: 6 Days, 14 Hours, 21 Minutes, 3 Seconds
2024-09-19 21:50:48,108 [base_deploy.py:log_info()] - Web Client mainsail: Next refresh in: 6 Days, 14 Hours, 21 Minutes, 3 Seconds
2024-09-19 21:50:48,108 [base_deploy.py:log_info()] - Git Repo mobileraker: Next refresh in: 6 Days, 14 Hours, 21 Minutes, 3 Seconds
2024-09-19 21:50:48,108 [base_deploy.py:log_info()] - Git Repo timelapse: Next refresh in: 6 Days, 14 Hours, 21 Minutes, 4 Seconds
2024-09-19 21:50:48,108 [base_deploy.py:log_info()] - Git Repo Klipper-Adaptive-Meshing-Purging: Next refresh in: 6 Days, 14 Hours, 21 Minutes, 4 Seconds
2024-09-19 21:50:48,108 [base_deploy.py:log_info()] - Git Repo octoeverywhere: Next refresh in: 6 Days, 14 Hours, 21 Minutes, 5 Seconds
2024-09-19 21:50:48,108 [base_deploy.py:log_info()] - Git Repo moonraker-obico: Next refresh in: 6 Days, 14 Hours, 21 Minutes, 5 Seconds
2024-09-19 21:50:48,108 [server.py:start_server()] - Starting Moonraker on (0.0.0.0, 7125), Hostname: Ender5Pro
2024-09-19 21:50:48,108 [application.py:listen()] - SSL Certificate/Key not configured, aborting HTTPS Server startup
2024-09-19 21:50:48,359 [klippy_connection.py:_do_connect()] - Klippy Connection Established
2024-09-19 21:50:48,384 [server.py:add_log_rollover_item()] -
Systemd unit klipper.service:
unit_name: klipper.service
is_default: True
manager: systemd
Properties:
**Requires=['-.mount', 'system.slice', 'sysinit.target']
**After=['sysinit.target', 'systemd-journald.socket', 'basic.target', '-.mount', 'system.slice', 'network-online.target', 'klipper-mcu.service']
**SupplementaryGroups=[]
**EnvironmentFiles=/home/pi/printer_data/systemd/klipper.env
**ExecStart=/home/pi/klippy-env/bin/python $KLIPPER_ARGS
**WorkingDirectory=/home/pi/klipper
**FragmentPath=/etc/systemd/system/klipper.service
**Description=Klipper 3D Printer Firmware SV1
**User=pi
2024-09-19 21:50:48,494 [authorization.py:_check_trusted_connection()] - Trusted Connection Detected, IP: [REDACTED]
2024-09-19 21:50:48,494 [application.py:log_request()] - 101 GET /websocket ([REDACTED]) [_TRUSTED_USER_] 1.18ms
2024-09-19 21:50:48,495 [websockets.py:open()] - Websocket Opened: ID: 140735969902800, Proxied: True, User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36, Host Name: [REDACTED]
2024-09-19 21:50:48,500 [websockets.py:_handle_identify()] - Websocket 140735969902800 Client Identified - Name: OctoEverywhere-BrowserAgent, Version: 1.0.0, Type: web
2024-09-19 21:50:48,581 [file_manager.py:complete_file_write()] - Invalid file write event: ID
2024-09-19 21:50:48,636 [server.py:add_log_rollover_item()] - Klipper Version: v0.12.0-302-g87ac6936
2024-09-19 21:50:49,381 [authorization.py:_check_trusted_connection()] - Trusted Connection Detected, IP: 127.0.0.1
2024-09-19 21:50:49,381 [application.py:log_request()] - 101 GET /websocket (127.0.0.1) [_TRUSTED_USER_] 1.05ms
2024-09-19 21:50:49,381 [websockets.py:open()] - Websocket Opened: ID: 140735971077328, Proxied: False, User Agent: , Host Name: 0.0.0.0
2024-09-19 21:50:49,385 [websockets.py:_handle_identify()] - Websocket 140735971077328 Client Identified - Name: OctoEverywhere, Version: 3.6.1, Type: agent
2024-09-19 21:50:50,730 [server.py:add_log_rollover_item()] - Connected to Spoolman Spool Manager
2024-09-19 21:50:50,766 [spoolman.py:_check_spool_deleted()] - Found Spool ID 17 on spoolman instance
2024-09-19 21:50:54,125 [application.py:log_request()] - 101 GET /websocket?token=[REDACTED] (127.0.0.1) [_TRUSTED_USER_] 1.48ms
2024-09-19 21:50:54,125 [websockets.py:open()] - Websocket Opened: ID: 140735969889552, Proxied: False, User Agent: , Host Name: 127.0.0.1
2024-09-19 21:50:54,965 [file_manager.py:get_file_list()] - Updating File List <gcodes>...
2024-09-19 21:50:55,680 [klippy_connection.py:_request_initial_subscriptions()] - Webhooks Subscribed
2024-09-19 21:50:55,681 [klippy_connection.py:_request_initial_subscriptions()] - GCode Output Subscribed
2024-09-19 21:50:55,930 [job_state.py:_handle_started()] - Job state initialized: standby
2024-09-19 21:50:56,184 [klippy_connection.py:_check_ready()] - Klippy ready
2024-09-19 21:50:56,682 [data_store.py:_init_sensors()] - Configuring available sensors: ['temperature_sensor raspberry_pi', 'temperature_sensor mcu_temp', 'heater_bed', 'extruder']
2024-09-19 21:51:03,593 [application.py:log_request()] - 101 GET /websocket?token=[REDACTED] ([REDACTED]) [_TRUSTED_USER_] 0.56ms
2024-09-19 21:51:03,594 [websockets.py:open()] - Websocket Opened: ID: 140735972819984, Proxied: False, User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36, Host Name: [REDACTED]
2024-09-19 21:51:03,601 [websockets.py:_handle_identify()] - Websocket 140735972819984 Client Identified - Name: Fluidd, Version: 1.19.0, Type: web
2024-09-19 21:51:15,380 [application.py:log_request()] - 404 GET /server/database/item?namespace=mainsail&key=presets (127.0.0.1) 20.64ms
2024-09-19 21:51:15,380 [application.py:log_request()] - 101 GET /websocket?token=[REDACTED] (127.0.0.1) [_TRUSTED_USER_] 1.69ms
2024-09-19 21:51:15,380 [websockets.py:open()] - Websocket Opened: ID: 140735971621808, Proxied: False, User Agent: , Host Name: 127.0.0.1
2024-09-19 21:51:15,382 [websockets.py:_handle_identify()] - Websocket 140735971621808 Client Identified - Name: Moonraker-Companion, Version: 1.4.2, Type: agent
2024-09-19 21:51:15,555 [websockets.py:_handle_close()] - Websocket Closed: ID: 140735971621808, Code: 1000, Reason: success
2024-09-19 21:55:47,690 [device_power.py:power_device()] - Powering off device [REDACTED] due to klippy shutdown
2024-09-19 21:55:47,792 [device_power.py:power_device()] - Powering off device [REDACTED] due to klippy shutdown

We can see the following behaviour:

Services Starting and Monitoring:

Klipper is started on an Ender 5 Pro printer.
Moonraker is started, listening on port 7125 (Creating Unix Domain Socket and Starting Moonraker).
Various repositories like moonraker, klipper, and others are being monitored for updates and will refresh in about 6 days.

Websocket Connections: Multiple websocket connections are being opened and managed. Some are trusted connections from local IPs, identified as browsers or agents (e.g., OctoEverywhere and Fluidd). These connections allow real-time interaction with the printer's status, including monitoring print jobs, temperatures, etc.

Klipper Connection Established: The Klipper firmware connected successfully (Klippy Connection Established), and the system is initialized, subscribing to necessary services and ready to begin printing (Klippy ready).

Power and Shutdown Handling: At the end of the log, Klipper shuts down (Klippy has shutdown), triggering the shutdown of connected devices, such as the printer and LEDs (Powering off device due to Klippy shutdown). We can ignore the powering off of devices, but Klipper was not meant to shut down/disconnect.

I will now attempt to short TX and RX to run pi-specific diagnostics, such as loopback tests, but I suspect the issue still remains.

idulkoan commented 3 days ago

Interesting results on the loopback test (see Script in original post/ticket details)

pi@Ender5Pro:~/Downloads $ sudo python uart_test.py
UART loopback test failed.
Sent: Hello, UART Loopback!
Received:
Debian GNU/Linux 12
pi@Ender5Pro:~/Downloads $ sudo python uart_test.py
UART loopback test failed.
Sent: Hello, UART Loopback!
Received:
pi@Ender5Pro:~/Downloads $ sudo python uart_test.py
UART loopback test failed.
Sent: Hello, UART Loopback!
Received:

(Yes, it received/read all those carriage returns)

pi@Ender5Pro:~/Downloads $ sudo python uart_test.py
UART loopback test failed.
Sent: Hello, UART Loopback!
Received:
UART loopback test failed.
Sent: Hello, UART Loopback!
Received:
Login incorrect
Ende
idulkoan commented 3 days ago

Revised Script:

import serial
import time

# Configure the serial port
port = '/dev/ttyAMA0'
baudrate = 250000
timeout = 2  # Increased timeout to 2 seconds for slower communication

# Initialize the serial connection
ser = serial.Serial(port, baudrate, timeout=timeout)

# Function to write and read data
def test_uart():
    test_message = "Hello, UART Loopback!"

    # Check if the port is open, if not, open it
    if not ser.is_open:
        ser.open()

    try:
        ser.write(test_message.encode())  # Send data
        time.sleep(0.5)  # Shorter wait time, adjust if needed
        response = ser.read(len(test_message)).decode()  # Read the response

        if response == test_message:
            print("UART loopback test passed!")
        else:
            print("UART loopback test failed.")
            print(f"Sent: {test_message}")
            print(f"Received: {response}")
    except serial.SerialException as e:
        print(f"Error during UART communication: {e}")

# Run the test
if __name__ == "__main__":
    test_uart()

# Close the serial port
ser.close()

I seem to now be getting mixed loopback passes and failures. Could be that my script isn't entirely reliable here

Any suggestions to further test the reliability and functionality of my TX/RX Pins/comms?

Here is another approach:

#pyserial.py
import serial
import time

# Configure the serial port
port = '/dev/ttyAMA0'  # or '/dev/serial0' on some systems
baudrate = 250000
timeout = 1  # seconds

# Initialize the serial connection
with serial.Serial(port, baudrate, timeout=timeout) as ser:

    # Function to write and read data
    def test_uart():
        test_message = "Testing UART Loopback"
        ser.write(test_message.encode())  # Send data
        time.sleep(1)  # Wait for a short moment

        response = ser.read(len(test_message)).decode()  # Read the response

        if response == test_message:
            print("UART loopback test passed!")
        else:
            print("UART loopback test failed.")
            print(f"Sent: {test_message}")
            print(f"Received: {response}")

    # Run the test
    test_uart()

Results:

pi@Ender5Pro:~/Downloads $ sudo python pyserial.py
UART loopback test passed!
pi@Ender5Pro:~/Downloads $ sudo python pyserial.py
UART loopback test passed!
pi@Ender5Pro:~/Downloads $ sudo python pyserial.py
UART loopback test passed!
pi@Ender5Pro:~/Downloads $ sudo python pyserial.py
UART loopback test failed.
Sent: Testing UART Loopback
Received:

Ender5Pro login:
pi@Ender5Pro:~/Downloads $ sudo python pyserial.py
UART loopback test failed.
Sent: Testing UART Loopback
Received:
pi@Ender5Pro:~/Downloads $ sudo python pyserial.py
UART loopback test failed.
Sent: Testing UART Loopback
Received:
pi@Ender5Pro:~/Downloads $ sudo python pyserial.py
UART loopback test passed!
pi@Ender5Pro:~/Downloads $ sudo python pyserial.py
UART loopback test failed.
Sent: Testing UART Loopback
Received:

Results are not consistent..

idulkoan commented 3 days ago

I believe it is fixed now! My issue was split into 2:

UART not working - this was indeed fixed by: sudo rpi-update pulls/6377 Thanks @pelwell !

From my previous tests we can see UART loopback failing and succeeding. As suspected, the failures can be due to the python script itself. The fact that I was getting succeeded loopbacks after the update proved that the functionality was there.

Permission issues: So I had to double-check my symbolic links and .. well klippy.log wasn't where I expected it.

I cleared out my logs directory rm -rf ~/printer_data/logs

Re-created my symbolic link: ln -s ~/klipper_logs ~/printer_data/logs

pi@Ender5Pro:~/printer_data/logs $ ls -l ~/printer_data/logs
lrwxrwxrwx 1 pi pi 21 Sep 19 23:23 /home/pi/printer_data/logs -> /home/pi/klipper_logs

And cleared the Klipper log:

sudo systemctl stop klipper
>klippy.log
sudo systemctl start klipper

Found this: tail -f klippy.log

mcu 'mcu': Unable to open serial port: [Errno 13] could not open port /dev/ttyAMA0: [Errno 13] Permission denied: '/dev/ttyAMA0'
webhooks client 140733662810000: New connection
webhooks client 140733662810000: Client info {'program': 'Moonraker', 'version': 'v0.9.3-0-g71f9e67'}
mcu 'mcu': Unable to open serial port: [Errno 13] could not open port /dev/ttyAMA0: [Errno 13] Permission denied: '/dev/ttyAMA0'
mcu 'mcu': Unable to open serial port: [Errno 13] could not open port /dev/ttyAMA0: [Errno 13] Permission denied: '/dev/ttyAMA0'

mcu 'mcu': Unable to open serial port: [Errno 13] could not open port /dev/ttyAMA0: [Errno 13] Permission denied: '/dev/ttyAMA0'
mcu 'mcu': Unable to open serial port: [Errno 13] could not open port /dev/ttyAMA0: [Errno 13] Permission denied: '/dev/ttyAMA0'

So we're now down to a permissions issue.

pi@Ender5Pro:~/printer_data/logs $ ls -l /dev/ttyAMA0
crw--w---- 1 root tty 204, 64 Sep 19 23:25 /dev/ttyAMA0

Group should be dialout, not tty

sudo chown root:dialout /dev/ttyAMA0
sudo chmod 660 /dev/ttyAMA0
sudo usermod -aG dialout pi
sudo reboot

The permissions did not stick.

pi@Ender5Pro:~/printer_data/logs $ groups pi
pi : pi adm tty dialout cdrom sudo audio video plugdev games users input render netdev spi i2c gpio lpadmin docker network

pi@Ender5Pro:~/printer_data/logs $ ls -l /dev/ttyAMA0
crw--w---- 1 root tty 204, 64 Sep 19 23:28 /dev/ttyAMA0

Found this post: Reddit - Klipper UART

And Ran the following:

sudo systemctl stop [serial-getty@ttyAMA0.service
sudo systemctl disable serial-getty@ttyAMA0.service

I had to do this because this process was blocking the port.

Repeated the steps:

sudo chown root:dialout /dev/ttyAMA0
sudo chmod 660 /dev/ttyAMA0
sudo usermod -aG dialout pi
sudo reboot

This time the group stuck:

pi@Ender5Pro:~/printer_data/logs $ ls -l /dev/ttyAMA0
crw-rw---- 1 root dialout 204, 64 Sep 19 23:52 /dev/ttyAMA0

In Klippy.log:

Loaded MCU 'mcu' 119 commands (v0.12.0-302-g87ac6936 / gcc: (15:12.2.rel1-1) 12.2.1 20221205 binutils: (2.40-2+18+b1) 2.40)
MCU 'mcu' config: ADC_MAX=4095 BUS_PINS_i2c1=PB6,PB7 BUS_PINS_i2c1a=PB8,PB9 BUS_PINS_i2c2=PB10,PB11 BUS_PINS_spi1=PA6,PA7,PA5 BUS_PINS_spi1a=PB4,PB5,PB3 BUS_PINS_spi2=PB14,PB15,PB13 BUS_PINS_spi3=PB4,PB5,PB3 CLOCK_FREQ=72000000 INITIAL_PINS=!PA14 MCU=stm32f103xe PWM_MAX=255 RECEIVE_WINDOW=192 RESERVE_PINS_serial=PA10,PA9 SERIAL_BAUD=250000 STATS_SUMSQ_BASE=256 STEPPER_BOTH_EDGE=1
mcu_temperature 'mcu' nominal base=357.558140 slope=-767.441860
Sending MCU 'mcu' printer configuration...
Configured MCU 'mcu' (1024 moves)

Everything is now working!