alliedvision / VimbaPython

Old Allied Vision Vimba Python API. The successor to this API is VmbPy
BSD 2-Clause "Simplified" License

Way to detach camera driver after Crash in Linux (Ubuntu)? #11

Open beniroquai opened 4 years ago

beniroquai commented 4 years ago

Is there a way to reset the camera driver after a script crashes the camera pipeline (i.e. multithreading.py)? Currently I have to restart the camera, but I would prefer to do something like rmmod or modprobe.

Any help much appreciated!

NiklasKroeger-AlliedVision commented 4 years ago

I am not entirely sure I understand what problem you are facing. Are you unable to connect to the camera after your program crashes because it is still in use by your crashed program? Or are you able to access the camera but it does not behave as you would expect?

To get a better understanding of the problem some more information would be helpful. Could you tell me:

beniroquai commented 4 years ago

The current setup is:

Nvidia Jetson Nano on Jetpack (ubuntu LTS)
Camera  ALVIUM 1800 U-500 with USB3
Camera is accessible through VimbaViewer

Sometimes the camera crashes inside an asynchronous routine due to some unexpected exception somewhere in the code. After restarting the program, no cameras are detected anymore. My guess was that this is due to a device driver that was not properly detached from the active camera. This behaviour is reproducible in the VimbaViewer. Maybe detaching the driver inside Linux helps somehow? Do you have any experience with this?

Thanks

NiklasKroeger-AlliedVision commented 4 years ago

Thanks for the information.

If the camera is accessible through the VimbaViewer, this means that our Transport Layer is detecting it, so communication should generally be possible. This does not seem like an issue with the actual driver to me. Maybe the camera is still "locked" because the connection was not closed properly in the exception case... Generally, some information relevant for this is stored in a shared memory region that Vimba uses to synchronize access from multiple threads on Linux. It is stored in /dev/shm/[some-hash-value-generally-starting-with-a-3]. You can safely delete this file, as the Transport Layer will create it again if it does not exist. By doing so you are essentially "resetting" the connection states that are stored there. Our official application note on this says that in some cases rebooting the computer might be necessary (section "Releasing USB cameras when Vimba crashed" on page 3), but I assume removing this file might be enough in this case. Hopefully this will make the camera accessible again. We are currently experimenting with getting rid of this shared memory system to prevent problems like these, but that is not yet released.
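
As a rough illustration of the shared memory cleanup described above, here is a minimal sketch. The helper names are made up for this example, and the prefix "3" is only taken from the "generally starting with a 3" hint, so it is an assumption and not a documented naming rule. Other programs also use /dev/shm, so the sketch defaults to a dry run that only lists candidates.

```python
from pathlib import Path


def find_shm_candidates(shm_dir="/dev/shm", prefix="3"):
    """Return regular files in shm_dir whose names start with prefix.

    The prefix is an assumption based on the hint in this thread.
    """
    root = Path(shm_dir)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.iterdir() if p.is_file() and p.name.startswith(prefix)
    )


def remove_shm_candidates(shm_dir="/dev/shm", prefix="3", dry_run=True):
    """Return candidate file names; delete them only when dry_run is False."""
    names = []
    for path in find_shm_candidates(shm_dir, prefix):
        if not dry_run:
            path.unlink()
        names.append(path.name)
    return names
```

Calling `remove_shm_candidates()` first shows what would be removed; `remove_shm_candidates(dry_run=False)` then deletes those files. It is probably wise to do this only while no Vimba application is running and after checking that the listed files really belong to Vimba.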

You mention a crash in an asynchronous routine. Another idea that comes to mind is that perhaps the camera is in some unexpected state where it is still acquiring images and therefore not behaving as you would expect on a fresh start. I have seen some cases where the AcquisitionStop command is not executed on the camera before closing the connection. Depending on your configuration I believe the indicator LED on the back of the Alvium USB Cameras should be blinking while the camera is acquiring images. This might be an indicator to see if the acquisition is still running.
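
One way to reduce the chance of leaving the camera in an acquiring state is to stop streaming in a `finally` block. The following is only a sketch: `stream_with_cleanup` and `run` are hypothetical names for this example, and it relies on the `start_streaming()` and `stop_streaming()` methods of the VimbaPython `Camera` object.

```python
def stream_with_cleanup(cam, handler, run):
    """Stream frames to handler while run() executes, then always stop.

    cam is expected to offer start_streaming(handler) and stop_streaming(),
    as the VimbaPython Camera object does.
    """
    cam.start_streaming(handler)
    try:
        run()
    finally:
        # Executed on normal return and on Python exceptions alike, so an
        # unexpected exception does not skip the stop call.
        cam.stop_streaming()


# Hypothetical usage with VimbaPython (needs the vimba package and a camera):
#   with Vimba.get_instance() as vimba:
#       cam = vimba.get_all_cameras()[0]
#       with cam:
#           stream_with_cleanup(cam, frame_handler, lambda: time.sleep(5))
```

This only covers exceptions raised in Python code. If the process is killed or crashes in native code, the `finally` block does not run and the camera may still need to be reset as described above.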

If your problems persist please feel free to also contact our support team via the contact form on our website. They might have further ideas on how to troubleshoot your problem and be able to help you better.

I hope this helps you. Best Regards, -Niklas.

beniroquai commented 3 years ago

Hello, I'm facing a new issue which goes in the same direction. Situation: Alvium USB3 U-158m (no case, 90 degree tilted USB mount) on a Jetson Nano.

The camera is correctly attached to one of the USB3 ports, and dmesg gives:

bene@bene-desktop:~/OFM/openflexure-microscope-server/openflexure_microscope$ dmesg | tail
[   11.054663] wlan0: associate with 3c:37:86:81:33:e5 (try 1/3)
[   11.060349] wlan0: RX AssocResp from 3c:37:86:81:33:e5 (capab=0x1411 status=0 aid=3)
[   11.061538] wlan0: associated
[   11.245513] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[ 3692.680191] usb 2-1.2: new SuperSpeed USB device number 3 using tegra-xusb
[ 3692.702260] usb 2-1.2: New USB device found, idVendor=1ab2, idProduct=0001
[ 3692.702333] usb 2-1.2: New USB device strings: Mfr=2, Product=3, SerialNumber=4
[ 3692.702382] usb 2-1.2: Product: ALVIUM 1800 U-158m
[ 3692.702429] usb 2-1.2: Manufacturer: Allied Vision
[ 3692.702470] usb 2-1.2: SerialNumber: 00XFW

After some time using the Asynchronous Frame Grab Example, the camera disconnects and nothing is available anymore. I don't have this problem with another camera (same model but with a housing + ordinary USB3 mount) on a different Jetson. Is there a way to find out what's going on here? Or: Is there a way to do a software-based disconnect/reconnect in order to get the camera working again?

When I call

            with vimba:
                # Construct FrameProducer threads for the detected camera
                cams = vimba.get_all_cameras()

The return value is an empty tuple (). Still, when I run dmesg | tail, it shows the same output as above. How can the camera not be visible to Python?

Thanks a lot!

beniroquai commented 3 years ago

Update:

Running the file camera/vicamera_listcameras.py gives this:

//////////////////////////////////////
/// Vimba API List Cameras Example ///
//////////////////////////////////////

Cameras found: 0

Physically detaching and reconnecting the device helps and gives this:

//////////////////////////////////////
/// Vimba API List Cameras Example ///
//////////////////////////////////////

Cameras found: 1
/// Camera Name   : Allied Vision 1800 U-158m
/// Model Name    : 1800 U-158m
/// Camera ID     : DEV_1AB22C00A94C
/// Serial Number : 00XFW
/// Interface ID  : VimbaUSBInterface_0x0

Interestingly, the lsusb command shows a difference when the camera disconnects from Vimba:


bene@bene-desktop:~$ lsusb
Bus 002 Device 004: ID 1ab2:0001
Bus 002 Device 002: ID 0bda:0411 Realtek Semiconductor Corp.
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 003: ID 8087:0a2b Intel Corp.
Bus 001 Device 004: ID 2341:0043 Arduino SA Uno R3 (CDC ACM)
Bus 001 Device 002: ID 0bda:5411 Realtek Semiconductor Corp.
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

and later

bene@bene-desktop:~$ lsusb
Bus 002 Device 002: ID 0bda:0411 Realtek Semiconductor Corp.
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 003: ID 8087:0a2b Intel Corp.
Bus 001 Device 004: ID 2341:0043 Arduino SA Uno R3 (CDC ACM)
Bus 001 Device 002: ID 0bda:5411 Realtek Semiconductor Corp.
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
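
The same before/after comparison can be made from a script by reading the USB entries in sysfs. This is a small sketch; `usb_devices_by_vendor` is a name made up for this example, and 1ab2 is the vendor ID shown for the camera in the lsusb output above.

```python
from pathlib import Path


def usb_devices_by_vendor(vendor_id, sysfs_root="/sys/bus/usb/devices"):
    """Return {sysfs_device_name: product_id} for devices with vendor_id."""
    found = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return found
    for dev in sorted(root.iterdir()):
        vendor_file = dev / "idVendor"
        if not vendor_file.is_file():
            continue  # interface entries such as 2-1.2:1.0 have no idVendor
        if vendor_file.read_text().strip().lower() != vendor_id.lower():
            continue
        product_file = dev / "idProduct"
        product = product_file.read_text().strip() if product_file.is_file() else ""
        found[dev.name] = product
    return found
```

`usb_devices_by_vendor("1ab2")` returns an empty dict once the camera has dropped off the bus, which makes it easy to log the moment it disappears while a grab script is running.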

beniroquai commented 3 years ago

Perhaps it's a bandwidth limit? See post

beniroquai commented 3 years ago

It seems switching off autosuspend solved the problem. See wiki
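
The wiki steps are not reproduced here. As a sketch of how the current setting could be inspected, the per-device `power/control` attribute in sysfs reads "auto" when the kernel may suspend the device at runtime and "on" when it is kept powered. The function name is made up for this example.

```python
from pathlib import Path


def usb_power_control(sysfs_root="/sys/bus/usb/devices"):
    """Return {sysfs_device_name: power/control value} for each USB device."""
    result = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return result
    for dev in sorted(root.iterdir()):
        control = dev / "power" / "control"
        # Only real devices (not interface entries) carry an idVendor file.
        if (dev / "idVendor").is_file() and control.is_file():
            result[dev.name] = control.read_text().strip()
    return result
```

Printing `usb_power_control()` before and after changing the autosuspend setting shows whether the change actually reached the camera's port.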

beniroquai commented 3 years ago

Unfortunately, that was not the issue..

beniroquai commented 3 years ago

Also adding an active USB3 hub didn't solve the problem:

It works here:

bene@bene-desktop:~/OFM/openflexure-microscope-server/openflexure_microscope/camera$ python vicamera_listcameras.py
//////////////////////////////////////
/// Vimba API List Cameras Example ///
//////////////////////////////////////

Cameras found: 1
/// Camera Name   : Allied Vision 1800 U-158m
/// Model Name    : 1800 U-158m
/// Camera ID     : DEV_1AB22C00A94C
/// Serial Number : 00XFW
/// Interface ID  : VimbaUSBInterface_0x0

Then directly afterwards:

bene@bene-desktop:~/OFM/openflexure-microscope-server/openflexure_microscope/camera$ python vicamera_asynchronous.py
^C^C^CTraceback (most recent call last):
  File "vicamera_asynchronous.py", line 29, in <module>
    import cv2
  File "/usr/local/lib/python3.7/dist-packages/cv2/__init__.py", line 96, in <module>
^C  File "/usr/local/lib/python3.7/dist-packages/cv2/__init__.py", line 86, in bootstrap
    import cv2
KeyboardInterrupt

bene@bene-desktop:~/OFM/openflexure-microscope-server/openflexure_microscope/camera$ export DISPLAY=:0
bene@bene-desktop:~/OFM/openflexure-microscope-server/openflexure_microscope/camera$ python vicamera_asynchronous.py
///////////////////////////////////////////////////////
/// Vimba API Asynchronous Grab with OpenCV Example ///
///////////////////////////////////////////////////////

No Cameras accessible. Abort.

bene@bene-desktop:~/OFM/openflexure-microscope-server/openflexure_microscope/camera$ python vicamera_listcameras.py
//////////////////////////////////////
/// Vimba API List Cameras Example ///
//////////////////////////////////////

Cameras found: 0

The output of dmesg remains the same before/after

bene@bene-desktop:~/OFM/openflexure-microscope-server/openflexure_microscope/camera$ dmesg | tail
[ 9498.824415] usb 2-1.4.2: New USB device strings: Mfr=2, Product=3, SerialNumber=4
[ 9498.824439] usb 2-1.4.2: Product: ALVIUM 1800 U-158m
[ 9498.824461] usb 2-1.4.2: Manufacturer: Allied Vision
[ 9498.824482] usb 2-1.4.2: SerialNumber: 00XFW
[ 9498.978004] usb 1-2.4.1: new full-speed USB device number 10 using tegra-xusb
[ 9499.007580] usb 1-2.4.1: New USB device found, idVendor=2341, idProduct=0043
[ 9499.007601] usb 1-2.4.1: New USB device strings: Mfr=1, Product=2, SerialNumber=220
[ 9499.007614] usb 1-2.4.1: Manufacturer: Arduino (www.arduino.cc)
[ 9499.007625] usb 1-2.4.1: SerialNumber: 550373132373512141E2
[ 9499.017607] cdc_acm 1-2.4.1:1.0: ttyACM1: USB ACM device

but in the lsusb the camera is gone.

NiklasKroeger-AlliedVision commented 3 years ago

This feels like it is a more general USB/camera problem. I have forwarded a link to this GitHub issue to our support team so they can take a look at it. I am afraid right now I do not have many ideas about what might be happening...

Perhaps it's a bandwidth limit?

I do not believe that a bandwidth limitation should cause the camera to literally disappear from the list of devices without leaving a trace in dmesg... bandwidth limitations would usually rather trigger issues like large numbers of incomplete frames. But I might be mistaken. Do you have other USB devices connected to the system?
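
To check for the incomplete frames mentioned above, a frame handler can tally frames by their status. This is a minimal sketch: `FrameStatusCounter` is a name made up for this example, and it assumes the VimbaPython handler signature `handler(cam, frame)` together with `frame.get_status()` and `cam.queue_frame(frame)`.

```python
from collections import Counter


class FrameStatusCounter:
    """Frame handler that counts frames per status and re-queues them."""

    def __init__(self):
        self.counts = Counter()

    def __call__(self, cam, frame):
        status = frame.get_status()
        # Use the enum member name if there is one, else the string form.
        self.counts[getattr(status, "name", str(status))] += 1
        # Hand the buffer back to the camera so streaming can continue.
        cam.queue_frame(frame)
```

An instance can be passed as the handler to `cam.start_streaming(...)`. A large share of non-complete frames in `counts` after a run would point towards a bandwidth problem, while a camera that vanishes with mostly complete frames would not.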

Physically detaching and reconnecting the device helps and gives this:

This, together with the fact that lsusb does not recognize the device anymore, makes it seem like there is some problem with the USB system on the device. I am not sure if the problem lies on the camera side or on the Nano side. Is it possible some udev rules got messed up? Maybe try uninstalling the VimbaUSBTL, rebooting the board, installing the VimbaUSBTL again and rebooting one more time. That way the udev rules should be reset in case they were modified for some reason. Then again, if the camera disappearing actually triggered any udev rules, I would expect that to leave some kind of trace in the dmesg output, which does not seem to be the case, so this is more of a shot in the dark...

In what cases is this event triggered? You mentioned that

After some time using the Asynchronous Frame Grab Example, the camera disconnects and nothing is available anymore.

Is there any kind of error message? Are you no longer receiving any frames or are the frames you receive incomplete? What does the activity LED on the back of the camera do?

I hope that our support team can give some more targeted comments on this issue and help resolve this. Like I said, this does not feel like a VimbaPython issue to me but might instead be something lower-level. Perhaps even related to the used hardware? Maybe the USB connection comes loose as the camera warms up from the streaming process?

beniroquai commented 3 years ago

Thank you very much for the detailed explanation. I tried swapping multiple components, including cameras, cables and Jetson Nanos. For the situation mentioned above I used a "standard" USB3 cable (i.e. not the Allied Vision one). This does not work well with the non-housed cameras. I went to a Windows laptop, which showed the same issue. The housed camera works nicely with the same cable (on both the Jetson and the Windows laptop). The shielded (?) USB3 cable from Allied Vision works with the bare board Alvium camera. So in the end it was a hardware problem. I hope that was not mentioned somewhere... it took quite some time to figure that out ;-)