NSLS-II / ADKinetix

EPICS areaDetector driver for communicating with Teledyne Kinetix cameras

crashing when acquiring image #4

Closed LeeYangLBLBCS closed 3 weeks ago

LeeYangLBLBCS commented 1 month ago

I am having this issue while running on Ubuntu 22.04.
When I attempt to acquire an image from the command line, the IOC crashes:
caput OML1:cam1:Acquire 1
Old : OML1:cam1:Acquire              Done
New : OML1:cam1:Acquire              Acquire
Unexpected problem with CA circuit to server "lyangUbuntu:5064" was "Connection reset by peer" - disconnecting
This happens every time.

The output of the IOC shell is shown below:
====================================
2024/07/17 06:21:38.473 ADKinetix::writeFloat64: function=64, value=1.000000
2024/07/17 06:21:38.473 ADKinetix::writeFloat64: function=90, value=0.000000
2024/07/17 06:21:38.473 ADKinetix::writeFloat64: function=91, value=0.000000
2024/07/17 06:21:38.473 ADKinetix::writeFloat64: function=92, value=25.000000
2024/07/17 06:21:38.473 ADKinetix::writeInt32: function=102, value=0.000000
epicsThreadRealtimeLock Warning: unable to lock memory.  RLIMIT_MEMLOCK is too small or missing CAP_IPC_LOCK
iocRun: All initialization complete
# save things every thirty seconds
create_monitor_set("auto_settings.req", 30,"P=OML1:,R=image1:")
save_restore:readReqFile: unable to open file commonPlugin_settings.req. Exiting.
epics> 2024/07/17 06:21:38.961 ADKinetix::writeInt32: function=62, value=0.000000
auto_settings.sav: 55 of 55 PV's connected

epics> 
epics> 
epics> 2024/07/17 06:21:39.961 ADKinetix::writeInt32: function=62, value=0.000000
2024/07/17 06:21:40.961 ADKinetix::writeInt32: function=62, value=0.000000
2024/07/17 06:21:41.961 ADKinetix::writeInt32: function=62, value=0.000000
2024/07/17 06:21:42.961 ADKinetix::writeInt32: function=62, value=0.000000
2024/07/17 06:21:43.961 ADKinetix::writeInt32: function=62, value=0.000000
2024/07/17 06:21:44.961 ADKinetix::writeInt32: function=62, value=0.000000
2024/07/17 06:21:45.961 ADKinetix::writeInt32: function=62, value=0.000000
2024/07/17 06:21:46.961 ADKinetix::writeInt32: function=62, value=0.000000
2024/07/17 06:21:47.961 ADKinetix::writeInt32: function=62, value=0.000000
2024/07/17 06:21:48.961 ADKinetix::writeInt32: function=62, value=0.000000
2024/07/17 06:21:49.961 ADKinetix::writeInt32: function=62, value=0.000000
2024/07/17 06:21:50.455 ADKinetix::acquireStart: Spawning main acquisition thread...
2024/07/17 06:21:50.455 ADKinetix::writeInt32: function=8, value=0.000000
Floating point exception (core dumped)
![image](https://github.com/user-attachments/assets/c1a2413f-565c-4025-a88a-4eb64206a4ac)
LeeYangLBLBCS commented 1 month ago

Here is the MEDM screenshot again: image

LeeYangLBLBCS commented 1 month ago

The problem I reported previously was with an older version:
commit 259526b9eeef906ae885fd11a9f48888edef2717
Merge: 144a8e4 6407bb9
Author: Jakub Wlodek <jwlodek@bnl.gov>
Date:   Thu Mar 28 10:06:22 2024 -0400

After I updated ADKinetix to the latest version, the IOC no longer starts at all; it core dumps during startup.
Author: Jakub Wlodek <jwlodek@bnl.gov>
Date:   Wed Jun 26 17:20:55 2024 -0400

    Update README and version number, remove extraneous files
Output log below:
../../bin/linux-x86_64/kinetixApp st.cmd
#!../../bin/linux-x86_64/kinetixApp
#< /epics/common/xf31id1-lab3-ioc1-netsetup.cmd
errlogInit(20000)
< envPaths
epicsEnvSet("IOC","iocKinetix")
epicsEnvSet("TOP","/opt/epics/modules/synApps_6_1_epics7/support/areaDetector-R3-7/ADKinetix/iocs/kinetixIOC")
epicsEnvSet("ADKINETIX","/opt/epics/modules/synApps_6_1_epics7/support/areaDetector-R3-7/ADKinetix/iocs/kinetixIOC/../..")
epicsEnvSet("SUPPORT","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support")
epicsEnvSet("ASYN","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/asyn-R4-44-2")
epicsEnvSet("AREA_DETECTOR","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/areaDetector-R3-7")
epicsEnvSet("ADSUPPORT","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/areaDetector-R3-7/ADSupport")
epicsEnvSet("ADCORE","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/areaDetector-R3-7/ADCore")
epicsEnvSet("AUTOSAVE","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/autosave-R5-10")
epicsEnvSet("BUSY","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/busy-R1-7-2")
epicsEnvSet("CALC","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/calc-R3-7-4")
epicsEnvSet("SNCSEQ","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/seq-2-2-9")
epicsEnvSet("SSCAN","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/sscan-R2-11-3")
epicsEnvSet("DEVIOCSTATS","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/iocStats-3-1-16")
epicsEnvSet("STD","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/std-R3-6-1")
epicsEnvSet("EPICS_BASE","/opt/epics/base-7.0.4")
dbLoadDatabase("/opt/epics/modules/synApps_6_1_epics7/support/areaDetector-R3-7/ADKinetix/iocs/kinetixIOC/dbd/kinetixApp.dbd")
kinetixApp_registerRecordDeviceDriver(pdbbase)
# Prefix for all records
# epicsEnvSet("PREFIX", "XF:31ID1-ES{Kinetix-Det:1}")
epicsEnvSet("PREFIX", "OML1:")
# The port name for the detector
epicsEnvSet("PORT",   "KTX")
# The queue size for all plugins
epicsEnvSet("QSIZE",  "20")
# The maximim image width; used for row profiles in the NDPluginStats plugin
epicsEnvSet("XSIZE",  "3200")
# The maximim image height; used for column profiles in the NDPluginStats plugin
epicsEnvSet("YSIZE",  "3200")
# The maximum number of time series points in the NDPluginStats plugin
epicsEnvSet("NCHANS", "2048")
# The maximum number of frames buffered in the NDPluginCircularBuff plugin
epicsEnvSet("CBUFFS", "500")
# The search path for database files
epicsEnvSet("EPICS_DB_INCLUDE_PATH", "/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/areaDetector-R3-7/ADCore/db")
ADKinetixConfig(0, "KTX", 3200, 3200, 3, 0, 0)
Segmentation fault (core dumped)
jwlodek commented 1 month ago

I think I found the bug that causes the crashing on acquisition. In software trigger mode the vendor API doesn't allow for specifying an acquire period longer than the exposure. I tried to do this in software, but it doesn't seem to work, and in fact if the acquire period setting is longer than exposure time you see the crash. I will try and resolve it.
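A minimal sketch of the kind of guard that could sit in front of a software-timed delay, assuming the driver pads the gap between software triggers itself when the acquire period exceeds the exposure time. The helper name `interFrameDelay` is hypothetical and is not taken from the ADKinetix source; the actual fix may look quite different.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper, not taken from ADKinetix: returns how many seconds to
// wait after an exposure so that consecutive software-triggered frames start
// roughly `acquirePeriod` seconds apart.
// The result is never negative. Non-finite inputs yield no extra delay, and
// a negative exposure time is treated as zero.
double interFrameDelay(double exposureTime, double acquirePeriod) {
    if (!std::isfinite(exposureTime) || !std::isfinite(acquirePeriod)) {
        return 0.0;
    }
    if (exposureTime < 0.0) {
        exposureTime = 0.0;
    }
    // A period shorter than the exposure simply means back-to-back frames.
    return std::max(0.0, acquirePeriod - exposureTime);
}
```

Clamping at this one point means the rest of the acquisition loop never sees a negative or undefined wait, whatever values were written to the exposure and period PVs.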

jwlodek commented 1 month ago

If it is seg-faulting on startup, my guess is it is not detecting the camera - when this happens, could you check the output of the ListCameras example from PVCam?

LeeYangLBLBCS commented 1 month ago

It only seg-faults with the latest ADKinetix version, not the older version, as I stated previously.
I tried a few of PVCAM's example programs, e.g. the following, which worked fine.
=====================================================
/opt/pvcam/sdk/examples/code_samples/bin/linux-x86_64/release$ ./FanSpeedAndTemperature 
************************************************************
Application  : ./FanSpeedAndTemperature
App. version : 1.7.84
PVCAM version: 3.10.0
************************************************************

PVCAM initialized
Number of cameras found: 1
Camera 0 name: 'pvcamUSB_0'

Camera 0 'pvcamUSB_0' opened
  Device driver version: 7.0.1
  Sensor chip name: TMP-Kinetix
  Camera firmware version: 30.45
  Sensor size: 3200x3200 px

  Speed table:
  - port 'Sensitivity', value 0
    - speed index 0, name: 'Standard'
      - gain index 1, 'Standard', bit-depth 12 bpp
  - port 'Speed', value 1
    - speed index 0, name: 'Standard'
      - gain index 1, 'Standard', bit-depth 8 bpp
  - port 'Dynamic Range', value 2
    - speed index 0, name: 'Standard'
      - gain index 1, 'Standard', bit-depth 16 bpp
  - port 'Sub-Electron', value 3
    - speed index 0, name: 'Standard'
      - gain index 1, 'Standard', bit-depth 16 bpp

  Readout port set to 'Sensitivity'
  Readout speed index set to 0
  Gain index set to 1

  Camera without Frame Transfer capability sensor
  Smart Streaming is available

Press <Enter> to show main menu

- Current sensor temperature on camera 0 is   +0.00 C^C
>>>
>>> CLI TERMINATION HANDLER
>>> Aborting user input
>>>
jwlodek commented 1 month ago

I believe I have resolved the segfault on connect in the main branch. Please give it a try.

LeeYangLBLBCS commented 1 month ago

I pulled the latest commit on the main branch, but I still get the same segfault.
The OEM's PVCamTest program works fine.
==========================================
Merge: c968e15 d83a53e
Author: Jakub Wlodek <jwlodek@bnl.gov>
Date:   Fri Jul 26 13:13:55 2024 -0400

    Merge pull request #6 from jwlodek/add-clang-format

    Add clang format file
===============IOC start up output==================
/opt/epics/modules/synApps_6_1_epics7/support/areaDetector-R3-7/ADKinetix/iocs/kinetixIOC/iocBoot/iocKinetix$ ../../bin/linux-x86_64/kinetixApp st.cmd
#!../../bin/linux-x86_64/kinetixApp
< /epics/common/xf31id1-lab3-ioc1-netsetup.cmd
Can't open /epics/common/xf31id1-lab3-ioc1-netsetup.cmd: No such file or directory
errlogInit(20000)
< envPaths
epicsEnvSet("IOC","iocKinetix")
epicsEnvSet("TOP","/opt/epics/modules/synApps_6_1_epics7/support/areaDetector-R3-7/ADKinetix/iocs/kinetixIOC")
epicsEnvSet("ADKINETIX","/opt/epics/modules/synApps_6_1_epics7/support/areaDetector-R3-7/ADKinetix/iocs/kinetixIOC/../..")
epicsEnvSet("SUPPORT","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support")
epicsEnvSet("ASYN","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/asyn-R4-44-2")
epicsEnvSet("AREA_DETECTOR","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/areaDetector-R3-7")
epicsEnvSet("ADSUPPORT","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/areaDetector-R3-7/ADSupport")
epicsEnvSet("ADCORE","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/areaDetector-R3-7/ADCore")
epicsEnvSet("AUTOSAVE","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/autosave-R5-10")
epicsEnvSet("BUSY","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/busy-R1-7-2")
epicsEnvSet("CALC","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/calc-R3-7-4")
epicsEnvSet("SNCSEQ","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/seq-2-2-9")
epicsEnvSet("SSCAN","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/sscan-R2-11-3")
epicsEnvSet("DEVIOCSTATS","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/iocStats-3-1-16")
epicsEnvSet("STD","/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/std-R3-6-1")
epicsEnvSet("EPICS_BASE","/opt/epics/base-7.0.4")
dbLoadDatabase("/opt/epics/modules/synApps_6_1_epics7/support/areaDetector-R3-7/ADKinetix/iocs/kinetixIOC/dbd/kinetixApp.dbd")
kinetixApp_registerRecordDeviceDriver(pdbbase)
# Prefix for all records
epicsEnvSet("PREFIX", "XF:31ID1-ES{Kinetix-Det:1}")
# The port name for the detector
epicsEnvSet("PORT",   "KTX")
# The queue size for all plugins
epicsEnvSet("QSIZE",  "20")
# The maximim image width; used for row profiles in the NDPluginStats plugin
epicsEnvSet("XSIZE",  "3200")
# The maximim image height; used for column profiles in the NDPluginStats plugin
epicsEnvSet("YSIZE",  "3200")
# The maximum number of time series points in the NDPluginStats plugin
epicsEnvSet("NCHANS", "2048")
# The maximum number of frames buffered in the NDPluginCircularBuff plugin
epicsEnvSet("CBUFFS", "500")
# The search path for database files
epicsEnvSet("EPICS_DB_INCLUDE_PATH", "/opt/epics/base-7.0.4/../modules/synApps_6_1_epics7/support/areaDetector-R3-7/ADCore/db")
ADKinetixConfig(0, "KTX", 3200, 3200, 3, 0, 0)
Segmentation fault (core dumped)
jwlodek commented 1 month ago

Do you think we could get on a zoom call at some point for me to have a look? I have three Kinetix cameras running now but cannot reproduce the problem.

LeeYangLBLBCS commented 1 month ago

Sure, any time. I'm available all day today and most of this week.

LeeYangLBLBCS commented 1 month ago

Here is some information I captured using gdb. The segfault appears to happen inside the SDK function call on this line in ADKinetix.cpp: if (!pl_cam_open(this->cameraContext->camName, &this->cameraContext->hcam, ....

I suspect this issue is caused by the computer being too old.

jwlodek commented 3 weeks ago

We resolved this while working in person at ALS, although we are not sure how. It seems there was a bug in the vendor PVCAM library. We added some debug print statements, and eventually things started working, even after a reboot. The main issue seemed to be that the "cameraName" string was being returned as corrupted binary data in the IOC but not in the vendor software. It is unclear why, since the code is exactly the same. After some restarts, the issue resolved itself, and the fix persisted even after all changes were reverted.
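If this comes back, one cheap diagnostic is to sanity-check the name buffer before it is handed to pl_cam_open, so a corrupted name produces an error message instead of a crash inside the SDK. The sketch below is an assumption about how such a check could look; `looksLikeCameraName` is a hypothetical helper that exists in neither ADKinetix nor PVCAM, and the buffer size is passed in rather than assumed.

```cpp
#include <cctype>
#include <cstddef>

// Hypothetical check, not part of ADKinetix or PVCAM: returns true only if
// the first `capacity` bytes of `name` contain a non-empty, NUL-terminated
// string made up entirely of printable characters.
bool looksLikeCameraName(const char* name, std::size_t capacity) {
    if (name == nullptr || capacity == 0) {
        return false;
    }
    for (std::size_t i = 0; i < capacity; ++i) {
        const unsigned char c = static_cast<unsigned char>(name[i]);
        if (c == '\0') {
            return i > 0;  // terminated; an empty name is rejected
        }
        if (!std::isprint(c)) {
            return false;  // control byte or other non-printable data
        }
    }
    return false;  // no terminator found inside the buffer
}
```

A name such as 'pvcamUSB_0' from the example output above passes, while a buffer of arbitrary bytes or one with no terminator is rejected, which would have flagged the corrupted string before the open call.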

I suggest we close this since it does not appear to be an issue with the IOC code, and revisit if we can reproduce it again.