analogdevicesinc / libiio

A cross platform library for interfacing with local and remote Linux IIO devices
http://analogdevicesinc.github.io/libiio/
GNU Lesser General Public License v2.1

Segfault when passing a DAC channel to a function as argument #1168

Closed: ilario closed this issue 1 week ago

ilario commented 6 months ago

I am controlling an AD5370 evaluation board from Python 3.9.2 using IIO, on an Olimex A64-OLinuXino-2Ge8G-IND single-board computer running Debian 11 Bullseye (oldstable, aarch64), through its SPI port (exposed on its UEXT connector). Olimex ships Debian with kernel 5.10.180, so I recompiled that kernel with their patches to include the ad5360 module.

The board works perfectly when I write to the file interface (writing to the files in /sys/bus/iio/devices) or when I control it from Python.

The problem arises when I pass the channel object from one function to another, or when I create it in the __init__ of a class and then try to access it from one of the class's methods.

Here is a minimal example that reproduces the problem:

a64-olinuxino-vitsolc1 :: ~/ » cat iiodac.py
import iio

def get_chan():
    ctx = iio.Context()
    ctrl = ctx.find_device("ad5370")
    chan = ctrl.find_channel("voltage19", is_output=True)
    print("Here it works")
    print(chan.attrs['scale'].value)
    return chan

def get_scale(chan):
    print("Here it does not work")
    print(chan.attrs['scale'].value)

And here is its output, along with the coredump information:

a64-olinuxino-vitsolc1 :: ~/ » python3
Python 3.9.2 (default, Feb 28 2021, 17:03:44)
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from iiodac import *
>>> chan = get_chan()
Here it works
0.183105468
>>> get_scale(chan)
Here it does not work
[1]    2376 segmentation fault (core dumped)  python3
a64-olinuxino-vitsolc1 :: ~/ 139 » coredumpctl info
           PID: 2376 (python3)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 11 (SEGV)
     Timestamp: Thu 2024-05-16 15:31:34 CEST (7s ago)
  Command Line: python3
    Executable: /usr/bin/python3.9
 Control Group: /user.slice/user-0.slice/session-1.scope
          Unit: session-1.scope
         Slice: user-0.slice
       Session: 1
     Owner UID: 0 (root)
       Boot ID: 0afe60322846419c9466f5f3a513c603
    Machine ID: db2a6f490e9b4af08a38659ba38b29e0
      Hostname: a64-olinuxino-vitsolc1
       Storage: /var/lib/systemd/coredump/core.python3.0.0afe60322846419c9466f5f3a513c603.2376.1715866294000000.zst
       Message: Process 2376 (python3) of user 0 dumped core.

                Stack trace of thread 2376:
                #0  0x0000ffff9321b944 iio_channel_attr_read (libiio.so.0 + 0x7944)
                #1  0x0000ffff933c4048 ffi_call_SYSV (libffi.so.7 + 0x6048)
                #2  0x0000ffff933c3770 ffi_call_int (libffi.so.7 + 0x5770)
                #3  0x0000ffff933e8b68 n/a (_ctypes.cpython-39-aarch64-linux-gnu.so + 0x11b68)
                #4  0x0000ffff933e7c6c n/a (_ctypes.cpython-39-aarch64-linux-gnu.so + 0x10c6c)
                #5  0x00000000004a5300 _PyObject_MakeTpCall (python3.9 + 0xa5300)
                #6  0x000000000049c568 _PyEval_EvalFrameDefault (python3.9 + 0x9c568)
                #7  0x00000000004b1a48 _PyFunction_Vectorcall (python3.9 + 0xb1a48)
                #8  0x0000000000498218 _PyEval_EvalFrameDefault (python3.9 + 0x98218)
                #9  0x00000000004b1a48 _PyFunction_Vectorcall (python3.9 + 0xb1a48)
                #10 0x00000000004c3564 n/a (python3.9 + 0xc3564)
                #11 0x00000000004af754 _PyObject_GenericGetAttrWithDict (python3.9 + 0xaf754)
                #12 0x0000000000497fa8 _PyEval_EvalFrameDefault (python3.9 + 0x97fa8)
                #13 0x00000000004b1a48 _PyFunction_Vectorcall (python3.9 + 0xb1a48)
                #14 0x0000000000498064 _PyEval_EvalFrameDefault (python3.9 + 0x98064)
                #15 0x00000000004964f8 n/a (python3.9 + 0x964f8)
                #16 0x0000000000496290 _PyEval_EvalCodeWithName (python3.9 + 0x96290)
                #17 0x00000000005976fc PyEval_EvalCode (python3.9 + 0x1976fc)
                #18 0x00000000005c850c n/a (python3.9 + 0x1c850c)
                #19 0x00000000005c2520 n/a (python3.9 + 0x1c2520)
                #20 0x0000000000421108 n/a (python3.9 + 0x21108)
                #21 0x0000000000420da8 PyRun_InteractiveLoopFlags (python3.9 + 0x20da8)
                #22 0x00000000005c7820 PyRun_AnyFileExFlags (python3.9 + 0x1c7820)
                #23 0x00000000005b7f1c Py_RunMain (python3.9 + 0x1b7f1c)
                #24 0x0000000000587638 Py_BytesMain (python3.9 + 0x187638)
                #25 0x0000ffff93ab3e18 __libc_start_main (libc.so.6 + 0x20e18)
                #26 0x0000000000587534 _start (python3.9 + 0x187534)
                #27 0x0000000000587534 _start (python3.9 + 0x187534)

While testing in different environments, I also observed a bus error instead of a segfault.

The workaround I found is to pass only the context object from function to function, and then to find the device, find the channel, and set the calibration values again every time I want to operate the DAC.

This is the example with the ugly workaround, followed by its output:

a64-olinuxino-vitsolc1 :: ~/pvbox2 » cat iiodac2.py
import iio

def get_ctx():
    ctx = iio.Context()
    ctrl = ctx.find_device("ad5370")
    chan = ctrl.find_channel("voltage19", is_output=True)
    print("Here it works")
    print(chan.attrs['scale'].value)
    return ctx

def get_scale(ctx):
    ctrl = ctx.find_device("ad5370")
    chan = ctrl.find_channel("voltage19", is_output=True)
    print("Also here it works")
    print(chan.attrs['scale'].value)

a64-olinuxino-vitsolc1 :: ~/pvbox2 » python3
Python 3.9.2 (default, Feb 28 2021, 17:03:44)
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from iiodac2 import *
>>> ctx = get_ctx()
Here it works
0.183105468
>>> get_scale(ctx)
Also here it works
0.183105468
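
A less repetitive variant of this workaround might be to keep the context, the device, and the channel together in one object, so that the context stays referenced for as long as the channel is in use. The sketch below assumes the crash is caused by the context being garbage-collected while the channel is still alive; this has not been verified on v0.21. The class name `DacChannel` and its parameters are hypothetical; only `find_device`, `find_channel`, and `attrs['scale'].value` come from the examples above.

```python
class DacChannel:
    """Hypothetical helper: bundles a channel with the context that owns it."""

    def __init__(self, ctx, device_name="ad5370", channel_name="voltage19"):
        # Keep a strong reference to the context so it is not destroyed
        # while this object (and therefore the channel) is still in use.
        self._ctx = ctx

        self._dev = ctx.find_device(device_name)
        if self._dev is None:
            raise LookupError(f"device {device_name!r} not found")

        self._chan = self._dev.find_channel(channel_name, is_output=True)
        if self._chan is None:
            raise LookupError(f"output channel {channel_name!r} not found")

    @property
    def scale(self):
        # Same attribute access as in the examples above.
        return self._chan.attrs["scale"].value


# Intended usage on the target board (requires the iio module and hardware):
#   import iio
#   dac = DacChannel(iio.Context())
#   print(dac.scale)
```

The context is passed in rather than created inside the class, which keeps the helper usable with a remote context and makes it possible to exercise the lookup logic with a stand-in object.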

tfcollins commented 6 months ago

Can you provide details on the commits/versions of the Python bindings and C library you are using?

ilario commented 6 months ago

> Can you provide details on the commits/versions of the Python bindings and C library you are using?

I am not sure I got the right versions... Here is what I found:

$ apt show libpython3.9-stdlib
Package: libpython3.9-stdlib
Version: 3.9.2-1

Could you tell me the name of the relevant packages? Thanks!

tfcollins commented 6 months ago

I need to know the libiio version/commit, including its Python bindings. If you don't know, how did you install both?

ilario commented 6 months ago

Ok, thanks for the patience. Here it is, I hope:

$ apt show libiio0
Package: libiio0
Version: 0.21-2+b1

$ apt show python3-libiio
Package: python3-libiio
Version: 0.21-2

tfcollins commented 6 months ago

My guess is that you are running into some weak-reference issues we had in the Python bindings years back. For channel attributes, it was related to this fix: https://github.com/analogdevicesinc/libiio/commit/f2ebf4b3fe9c96dbc3721552765b319dac53ba99

I would upgrade to v0.25 for both the library and the bindings.
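
For readers unfamiliar with this class of bug, the following is a minimal, hardware-free sketch of how an object can outlive its owner when it holds only a weak reference to it. It is not the libiio code and does not reproduce the linked fix; `FakeContext`, `WeakChannel`, `StrongChannel`, `make_channel`, and `try_read` are all hypothetical names made up for illustration. In a real C binding, the "context was already destroyed" case would be a dangling pointer, which can show up as a segfault or bus error rather than a Python exception.

```python
import gc
import weakref


class FakeContext:
    """Hypothetical stand-in for an object that owns native resources."""

    def __init__(self):
        self.scale = "0.183105468"


class WeakChannel:
    """Holds only a weak reference, so it cannot keep its context alive."""

    def __init__(self, ctx):
        self._ctx_ref = weakref.ref(ctx)

    def read_scale(self):
        ctx = self._ctx_ref()
        if ctx is None:
            # A C library would dereference freed memory at this point.
            raise RuntimeError("context was already destroyed")
        return ctx.scale


class StrongChannel:
    """Holds a strong reference, so the context lives as long as the channel."""

    def __init__(self, ctx):
        self._ctx = ctx

    def read_scale(self):
        return self._ctx.scale


def make_channel(channel_cls):
    # The context is a local variable, like ctx in get_chan() in the report.
    ctx = FakeContext()
    return channel_cls(ctx)


def try_read(channel):
    # Force a collection so the behaviour does not depend on refcounting.
    gc.collect()
    try:
        return channel.read_scale()
    except RuntimeError as exc:
        return f"error: {exc}"
```

Calling `try_read(make_channel(StrongChannel))` returns the scale, while `try_read(make_channel(WeakChannel))` reports that the context is gone, because nothing kept it alive after `make_channel` returned.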

dNechita commented 1 week ago

@ilario are you still having this issue?

ilario commented 1 week ago

Sorry for the silence. It is quite complicated for me to test v0.25, because Olimex, the vendor of the single-board computer I am using (A64-OLinuXino), has so far provided only Debian oldstable Bullseye, which includes only v0.21. I am going to close the issue and will report back when I am able to test.