areaDetector / ADProsilica

An EPICS areaDetector driver for Gigabit Ethernet and USB cameras from Allied Vision Technologies, who purchased Prosilica. The driver is supported under Windows, Linux and Mac OS X using the old pvAPI vendor library provided for those operating systems.
https://areadetector.github.io/areaDetector/ADProsilica/ADProsilica.html

Driver gets stuck when it receives bad frames #18

Open brunoseivam opened 8 years ago

brunoseivam commented 8 years ago

When in Single or Multiple mode, the driver sets framesRemaining to 1 or numImages, respectively. However, when it receives a bad frame, it neither decrements framesRemaining nor reissues a trigger, which leaves the driver stuck in the Acquire state.

Although ideally the driver shouldn't be receiving bad frames, I think it shouldn't get stuck when it does. However, I don't know how to properly address that. Should it fail and return an error? Should it reissue the acquisition for the frames that came in bad? What about hardware triggers?

MarkRivers commented 8 years ago

Is the frameCallback function actually being called for the bad frames? If so we could change the behavior to at least stop acquisition when the correct number of frames, good or bad, have been received.
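To illustrate the suggested behavior, here is a minimal Python sketch of the bookkeeping only. It is a toy model, not the ADProsilica code: `AcquisitionModel`, `frame_callback`, and `run` are hypothetical names, and the real driver is written in C++ against PvAPI. The idea is that every delivered frame, good or bad, counts toward the requested total, so acquisition ends once that many callbacks have arrived.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class AcquisitionModel:
    """Toy model of the frame bookkeeping (hypothetical, not driver code)."""
    frames_remaining: int
    acquiring: bool = True
    good_frames: int = 0
    bad_frames: int = 0

    def frame_callback(self, frame_ok: bool) -> None:
        """Count every delivered frame, good or bad, toward the total."""
        if not self.acquiring:
            return
        if frame_ok:
            self.good_frames += 1
        else:
            self.bad_frames += 1
        # Decrement regardless of frame status so acquisition can finish.
        self.frames_remaining -= 1
        if self.frames_remaining <= 0:
            self.acquiring = False


def run(num_images: int, statuses: Iterable[bool]) -> AcquisitionModel:
    """Feed a sequence of frame statuses (True = good) into the model."""
    model = AcquisitionModel(frames_remaining=num_images)
    for ok in statuses:
        model.frame_callback(ok)
    return model
```

With this scheme a client would see fewer good images than it requested, so the bad-frame count probably needs to be reported somewhere. A frame for which the callback is never invoked would still leave the model waiting.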

brunoseivam commented 8 years ago

Yes, most of the time, although I found some instances where it is not called at all. The rate of bad frames correlates with the CPU usage of another IOC, so I guess the prosilica thread might be getting starved of CPU time and can't keep up with the data rate?

The machine has 12 cores, one IOC is consuming ~350% and the prosilica IOC is consuming ~100%, so I wouldn't expect it to be an issue.

I will try pinning the IOC to a set of CPUs and see if that helps.

I tried setting GvspResendPercent to 100%, but it didn't seem to help much.
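For reference, pinning an already running IOC to a set of CPUs on Linux could be sketched roughly as follows. This is only an illustration under assumptions: `pin_all_threads` is a hypothetical helper, and it relies on the Linux-only `os.sched_setaffinity` call and the `/proc/<pid>/task` directory. Because the affinity mask is per thread on Linux, the sketch walks every thread of the process instead of setting the mask on the main PID alone.

```python
import os
from typing import Iterable


def pin_all_threads(pid: int, cpus: Iterable[int]) -> int:
    """Pin every current thread of `pid` to `cpus`; return how many were pinned.

    Linux only. Threads created later inherit the mask of the thread
    that creates them.
    """
    cpu_set = set(cpus)
    pinned = 0
    for tid in os.listdir(f"/proc/{pid}/task"):
        try:
            os.sched_setaffinity(int(tid), cpu_set)
            pinned += 1
        except ProcessLookupError:
            # The thread exited between listing and pinning; skip it.
            continue
    return pinned


if __name__ == "__main__":
    # Example: re-apply the CPU set this process is already allowed to use.
    allowed = os.sched_getaffinity(0)
    print(pin_all_threads(os.getpid(), allowed), "threads pinned to", sorted(allowed))
```

Changing the affinity of another user's process requires appropriate privileges.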

mp49 commented 8 years ago

If one thread in the prosilica IOC is using all or most of that 100%, that might be the problem. Idle cores won't help if a single thread is maxing out one core.

brunoseivam commented 8 years ago

cam07 is the one driving the CPU usage high. cam03 is the one giving me grief, even though none of its threads is getting to 100%. CPU pinning didn't help. Does the PvAPI library use only one thread to handle all requests from different IOCs?

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND                                                           
 4096 cam07     20   0 2553m 410m 5932 R  96.6  1.3   2926:56 ../prosilica/bin/linux-x86_64/prosilica /epics/iocs/cam07/st.cmd  
 4058 cam07     20   0 2553m 410m 5932 R  94.0  1.3   2970:24 ../prosilica/bin/linux-x86_64/prosilica /epics/iocs/cam07/st.cmd  
 4030 cam07     20   0 2553m 410m 5932 R  70.6  1.3   7769:35 ../prosilica/bin/linux-x86_64/prosilica /epics/iocs/cam07/st.cmd  
26614 cam03     20   0  529m  83m 5172 R  55.1  0.3  10:51.27 ../prosilica/bin/linux-x86_64/prosilica /epics/iocs/cam03/st.cmd  
 4463 cam07     20   0 2553m 410m 5932 R  28.9  1.3   2390:17 ../prosilica/bin/linux-x86_64/prosilica /epics/iocs/cam07/st.cmd  
26573 cam03     20   0  529m  83m 5172 S  20.5  0.3   3:37.64 ../prosilica/bin/linux-x86_64/prosilica /epics/iocs/cam03/st.cmd  
26606 cam03     20   0  529m  83m 5172 R  18.8  0.3   3:40.06 ../prosilica/bin/linux-x86_64/prosilica /epics/iocs/cam03/st.cmd  
22796 cam07     20   0 2553m 410m 5932 S  15.3  1.3  10:19.45 ../prosilica/bin/linux-x86_64/prosilica /epics/iocs/cam07/st.cmd  
26594 cam03     20   0  529m  83m 5172 S  12.2  0.3   2:35.21 ../prosilica/bin/linux-x86_64/prosilica /epics/iocs/cam03/st.cmd  
 4136 cam07     20   0 2553m 410m 5932 S  11.0  1.3   1595:23 ../prosilica/bin/linux-x86_64/prosilica /epics/iocs/cam07/st.cmd  
 4032 cam07     20   0 2553m 410m 5932 R  10.0  1.3   1279:10 ../prosilica/bin/linux-x86_64/prosilica /epics/iocs/cam07/st.cmd  
 4028 cam07     20   0 2553m 410m 5932 S   7.2  1.3 881:15.71 ../prosilica/bin/linux-x86_64/prosilica /epics/iocs/cam07/st.cmd  
26628 cam03     20   0  529m  83m 5172 S   7.2  0.3   1:32.90 ../prosilica/bin/linux-x86_64/prosilica /epics/iocs/cam03/st.cmd  
19505 cam07     20   0 2553m 410m 5932 S   4.8  1.3  18:21.02 ../prosilica/bin/linux-x86_64/prosilica /epics/iocs/cam07/st.cmd  
 4151 cam07     20   0 2553m 410m 5932 S   4.3  1.3 588:21.18 ../prosilica/bin/linux-x86_64/prosilica /epics/iocs/cam07/st.cmd  
 4155 cam07     20   0 2553m 410m 5932 S   4.3  1.3 649:44.59 ../prosilica/bin/linux-x86_64/prosilica /epics/iocs/cam07/st.cmd  
26659 cam03     20   0  529m  83m 5172 S   3.8  0.3   3:23.81 ../prosilica/bin/linux-x86_64/prosilica /epics/iocs/cam03/st.cmd  
 4034 cam07     20   0 2553m 410m 5932 S   1.7  1.3 212:36.21 ../prosilica/bin/linux-x86_64/prosilica /epics/iocs/cam07/st.cmd  
 4159 cam07     20   0 2553m 410m 5932 S   0.7  1.3  40:17.57 ../prosilica/bin/linux-x86_64/prosilica /epics/iocs/cam07/st.cmd

MarkRivers commented 8 years ago

Have you applied the system changes discussed in this tech-talk thread?

http://www.aps.anl.gov/epics/tech-talk/2013/msg00787.php

It involves increasing net.core.rmem_default and net.core.rmem_max.
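As a quick way to see the current values before changing anything, a small Python sketch like the one below can read them from `/proc/sys/net/core` on Linux. The helper names `read_rmem` and `below_target` are hypothetical, and the target size passed to `below_target` is a placeholder you must choose yourself; the values recommended in the linked thread are not reproduced here.

```python
from pathlib import Path

# The two kernel receive-buffer settings named above.
KEYS = ("rmem_default", "rmem_max")


def read_rmem(root: Path = Path("/proc/sys/net/core")) -> dict:
    """Return the current receive-buffer sizes in bytes, keyed by setting name."""
    return {key: int((root / key).read_text().strip()) for key in KEYS}


def below_target(values: dict, target_bytes: int) -> list:
    """List the settings whose current value is smaller than target_bytes."""
    return [key for key in KEYS if values[key] < target_bytes]


if __name__ == "__main__":
    current = read_rmem()
    for key, value in current.items():
        print(f"net.core.{key} = {value}")
```

The settings themselves are normally changed as root with `sysctl -w`, and made persistent through the system's sysctl configuration.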

brunoseivam commented 8 years ago

That did the trick! No bad frames anymore. Thanks!

Although if the driver does perchance receive one when in Single or Multiple mode it will still get stuck :)

MarkRivers commented 8 years ago

Note that the link to the Point Grey Knowledge Base article in my old tech-talk thread no longer works. However, this link does work:

http://www.ptgrey.com/KB/10016

MarkRivers commented 6 years ago

> That did the trick! No bad frames anymore. Thanks! Although if the driver does perchance receive one when in Single or Multiple mode it will still get stuck :)

We just re-discovered this issue with cameras at NSLS-II. It is not clear how to fix it, as Bruno said above. Questions: