hardbyte / python-can

The can package provides controller area network support for Python developers
https://python-can.readthedocs.io
GNU Lesser General Public License v3.0
1.28k stars 597 forks source link

SocketCan eventually ran out of memory #1567

Open jizhoul opened 1 year ago

jizhoul commented 1 year ago

Describe the bug

When running socketcan to send message for more than 48 hours, it eventually would crash with "Cannot allocate memory" error message

To Reproduce

from signal import signal, SIGINT
from sys import exit
import can
import serial
import time
import os
import subprocess
os.system('/usr/local/bin/startSMAGsUsbCan.sh can5')
#
# when board configured as socketcan, bring up link first:
# sudo ip link set can0 up type can bitrate 500000
#
#

def getSignedNumber(number, bitLength):
    mask = (2 ** bitLength) - 1
    if number & (1 << (bitLength - 1)):
        return number | ~mask
    else:
        return number & mask

def getbytes(integer):
    return divmod(integer, 0x100)

def handler(signal_received, frame):
    # Handle any cleanup here
    print('SIGINT or CTRL-C detected. Exiting gracefully')
    exit(0)

if __name__ == "__main__":
  # Tell Python to run the handler() function when SIGINT is recieved
  signal(SIGINT, handler)

  print('Running. Press CTRL-C to exit.')

  print ("Can bus init")
  canBusTTY = "/dev/ttyACM0"
  canBus = can.interface.Bus(bustype='socketcan', channel="can5", bitrate=500000, txqueuelen=10000)
  canBus.shutdown()
  print ("Can bus init done")

  while True:
    os.system('/usr/local/bin/startSMAGsUsbCan.sh can5')
    canBus = can.interface.Bus(bustype='socketcan', channel="can5", bitrate=500000, txqueuelen=10000)
    time.sleep(2)
    # Do nothing and hog CPU forever until SIGINT received.

    SoC_HD = 99.5
    Soc = int(SoC_HD)
    Req_Charge_HD = 100
    Req_Charge_A = 100.0
    Req_Discharge_HD = 30
    Req_Discharge_A = 200.0
    Max_V_HD = 56
    Max_V = 60.0
    Min_V = 46.0 

   #breakup some of the values for CAN packing
    SoC_HD = int(SoC_HD*100)
    SoC_HD_H, SoC_HD_L = getbytes(SoC_HD)
    Req_Charge_HD = int(Req_Charge_A*10)
    Req_Charge_H, Req_Charge_L = getbytes(Req_Charge_HD)
    Req_Discharge_HD = int(Req_Discharge_A*10)
    Req_Discharge_H, Req_Discharge_L = getbytes(Req_Discharge_HD)
    Max_V_HD = int(Max_V*10)
    Max_V_H, Max_V_L = getbytes(Max_V_HD)
    Min_V_HD = int(Min_V*10)
    Min_V_H, Min_V_L = getbytes(Min_V_HD)

    msg = can.Message(arbitration_id=0x351, 
      data=[Max_V_L, Max_V_H, Req_Charge_L, Req_Charge_H, Req_Discharge_L, Req_Discharge_H, Min_V_L, Min_V_H],
      is_extended_id=False)
    msg2 = can.Message(arbitration_id=0x355,
      data=[Soc, 0x00, 0x64, 0x0, SoC_HD_L, SoC_HD_H],
      is_extended_id=False)
    msg3 = can.Message(arbitration_id=0x356,
      data=[0x00, 0x00, 0x00, 0x0, 0xf0, 0x00],
      is_extended_id=False)
    msg4 = can.Message(arbitration_id=0x35a,
      data=[0x00, 0x00, 0x00, 0x0, 0x00, 0x00, 0x00, 0x00],
      is_extended_id=False)
    msg5 = can.Message(arbitration_id=0x35e,
      data=[0x42, 0x41, 0x54, 0x52, 0x49, 0x55, 0x4d, 0x20],
      is_extended_id=False)
    msg6 = can.Message(arbitration_id=0x35f,
      data=[0x03, 0x04, 0x0a, 0x04, 0x76, 0x02, 0x00, 0x00],
      is_extended_id=False)

    try:
        canBus.send(msg)
        time.sleep(.100)

        canBus.send(msg2)
        time.sleep(.100)

        canBus.send(msg3)
        time.sleep(.100)

        canBus.send(msg4)
        time.sleep(.100)

        canBus.send(msg5)
        time.sleep(.100)

        canBus.send(msg6)
        time.sleep(.100)

        print("Sent 6 frames on {}".format(canBus.channel_info))
        canBus.shutdown()

    except (can.CanError) as e:
      print("CAN BUS Transmit error (is controller missing?): %s" % e.message)

    pass

the shell script: /usr/local/bin/startSMAGsUsbCan.sh

#!/bin/sh

# called from udev ADD action to bring up the related link
#  
#
# $1 is the device (e.g., can10)
#
# these actions are only required prior to v2.90~18

if [ ! -f "/etc/venus/newUdevRules" ] ; then
    # bring the link up
    /sbin/ip link set $1 down
    /sbin/ip link set $1 up type can bitrate 500000
    /sbin/ip link set $1 txqueuelen 1000

    # restart services so they see this new interface
    svc -t /service/can-bus-bms.$1
    svc -t /service/dbus-systemcalc-py
    svc -t /service/dbus-adc
    svc -t /service/dbus-modbus-client
    svc -t /service/netmon
fi

Expected behavior

I expect the loop to keep running and sending messages

Additional context

OS and version: Venus OS v2.93 Python version: python3.8.13 python-can version: 4.1.0 python-can interface/s (if applicable):

Traceback and logs
zariiii9003 commented 1 year ago

Did you try taking tracemalloc snapshots? Maybe that could help finding the problem

AlexanderRavenheart commented 11 months ago

Hello @jizhoul ,

I don't think this is a bug in Socketcan.

The memory issues I noticed in your script:

Even if the data of a Message were to change, you can just assign new data bytes to the Message object instance, no need to discard the old one and create a completely new one each time the data changes.

In the refactored script I listed below I made the following memory improvements:

Other minor improvements:

Try this script:

from __future__ import annotations
from sys import exit
from can.interfaces.socketcan import SocketcanBus
from can import Message, CanError
from time import sleep
from subprocess import check_call, CalledProcessError

#
# when board configured as socketcan, bring up link first:
# sudo ip link set can0 up type can bitrate 500000
#
#

def get_signed_number(number: int, bit_length: int) -> int:
    mask = (2 ** bit_length) - 1
    if number & (1 << (bit_length - 1)):
        return number | ~mask
    else:
        return number & mask

def get_bytes(value: int) -> tuple[int, int]:
    return divmod(value, 0x100)

def prepare_messages() -> list[Message]:
    """Prepare the CAN messages"""
    SoC_HD = 99.5
    Soc = int(SoC_HD)
    Req_Charge_A = 100.0
    Req_Discharge_A = 200.0
    Max_V = 60.0
    Min_V = 46.0

    # breakup some of the values for CAN packing
    SoC_HD = int(SoC_HD * 100)
    SoC_HD_H, SoC_HD_L = get_bytes(SoC_HD)
    Req_Charge_HD = int(Req_Charge_A * 10)
    Req_Charge_H, Req_Charge_L = get_bytes(Req_Charge_HD)
    Req_Discharge_HD = int(Req_Discharge_A * 10)
    Req_Discharge_H, Req_Discharge_L = get_bytes(Req_Discharge_HD)
    Max_V_HD = int(Max_V * 10)
    Max_V_H, Max_V_L = get_bytes(Max_V_HD)
    Min_V_HD = int(Min_V * 10)
    Min_V_H, Min_V_L = get_bytes(Min_V_HD)

    return [
        Message(arbitration_id=0x351,
                data=[Max_V_L, Max_V_H, Req_Charge_L, Req_Charge_H, Req_Discharge_L, Req_Discharge_H, Min_V_L,
                      Min_V_H],
                is_extended_id=False),
        Message(arbitration_id=0x355,
                data=[Soc, 0x00, 0x64, 0x0, SoC_HD_L, SoC_HD_H],
                is_extended_id=False),
        Message(arbitration_id=0x356,
                data=[0x00, 0x00, 0x00, 0x0, 0xf0, 0x00],
                is_extended_id=False),
        Message(arbitration_id=0x35a,
                data=[0x00, 0x00, 0x00, 0x0, 0x00, 0x00, 0x00, 0x00],
                is_extended_id=False),
        Message(arbitration_id=0x35e,
                data=[0x42, 0x41, 0x54, 0x52, 0x49, 0x55, 0x4d, 0x20],
                is_extended_id=False),
        Message(arbitration_id=0x35f,
                data=[0x03, 0x04, 0x0a, 0x04, 0x76, 0x02, 0x00, 0x00],
                is_extended_id=False)
    ]

if __name__ == "__main__":
    # The bash script shouldn't really be called from the python script, especially if it requires sudo rights, since
    # that means the python script needs to be started with sudo, and that is pointless escalation of rights for
    # just sending some CAN messages repeatedly.
    print("Configuring can interface: can5")
    try:
        check_call(['/usr/local/bin/startSMAGsUsbCan.sh', 'can5'])
    except CalledProcessError as cpe:
        print("Failed to configure CAN interface, script exited with code: ", cpe.returncode)
        exit(cpe.returncode)

    print("Preparing messages...")
    messages = prepare_messages()
    count = len(messages)

    # Using with will ensure the bus is properly closed when it is no longer needed
    with SocketcanBus(channel="can5", bitrate=500000, txqueuelen=10000) as can_bus:
        display_message = f"Sent {count} frames on {can_bus.channel_info}"
        print('Running. Press CTRL-C to exit.')
        while True:
            try:
                print("Waiting for 2 seconds...")
                sleep(2)
                print("Sending messages...")
                for message in messages:
                    can_bus.send(message)
                    sleep(0.1)
                print(display_message)
            except KeyboardInterrupt:
                print('SIGINT or CTRL-C detected. Exiting gracefully')
                break
            except CanError as error:
                print(f"CAN BUS Transmit error (is controller missing?): {error}")
hartkopp commented 11 months ago

I also wonder why the txqueuelen is set to 10000 ?!? Usually the default is 10.

When the transmission rate of CAN frames is much higher than the available bandwidth the tx queue is filled up over time. But as @AlexanderRavenheart pointed out the the shell script: /usr/local/bin/startSMAGsUsbCan.sh is called every time in the loop. And shutting down the interface flushes the tx queue.