areaDetector / ADSpinnaker

An EPICS areaDetector driver for cameras from FLIR (formerly Point Grey) using their Spinnaker SDK.
https://areadetector.github.io/areaDetector/ADSpinnaker/ADSpinnaker.html
5 stars 6 forks source link

memory leak using ADSpinnaker and SoftwareTrigger #7

Closed rettigl closed 3 years ago

rettigl commented 3 years ago

Hi,

I'm using a FLIR Blackfly BFLY-PGE-50A2C-CS GigE camera with areadetector and ADSpinnaker on linux Ubuntu 18.04 and epics 7 and quite recent github versions of support modules, which mostly works fine. However, I am experiencing the following strange behavior:

Because this camera cannot set the frame rate independently of the exposure time, I have built a work around using its software trigger, with the idea to set it to software trigger and use a bo soft channel with periodic scan to write to the "TriggerSoftware" PV of the camera (I only need low frame rates for monitoring, but small exposure times). This functionality all works well, however, with the problem that whenever I have the bo record repeatedly trigger the "TriggerSoftware" PV, the memory consumption of the ioc process starts to continously increase with a rate of around 7 mByte/hour at 1s Scan interval (the rate scales with the trigger frequency, the example below has a trigger frequency of 0.5 seconds). This does not happen if I stop the scanning of the softchannel, but it is independent of whether the camera is actually triggered or not "TriggerMode=Off or On". No plugins apart from an Image plugin are loaded. When I stop the acquisition of the camera, the memory consumption drops to a baseline and stays there constant, but if I resume acquisition later, it resumes at a higher memory consumtion, as if it was running all the time:

w_memory_consumption

The definition of the softchannel PV is the following:

record(bo,"$(P)trigger") { field(DESC,"Camera trigger") field(VAL,1) field(OUT,"$(P)cam1:TriggerSoftware PP MS") field(SCAN,"1 second") }

Is this a bug in ADSpinnaker or the Spinnaker SDK, or am I doing something silly here?

Best, Laurenz

MarkRivers commented 3 years ago

Because this camera cannot set the frame rate independently of the exposure time, I have built a work around ...

Are you sure about that? Can you post screen shots of the main ADSpinnaker screen and the camera-specific screens?

the memory consumption of the ioc process starts to continously increase with a rate of around 7 mByte/hour at 1s Scan interval

That works out to about 2 KB per image, which is quite small.

What happens if you use a normal internal trigger and acquire at 10 frames/s. Does the memory usage also increase at 2 KB per image?

MarkRivers commented 3 years ago

I just ran ADSpinnaker on Ubuntu 18 with an Oryx 51S5M camera.

What command did you run to get the memory usage you plotted above? With "top" I see this:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
30904 epics     20   0   55316   5024   3544 R   1.0  0.0   0:03.90 top
29077 epics     20   0 7776280 726956  42916 S   0.7  1.5   0:48.55 spinnakerApp

So I am seeing the spinnakerApp using about 7.8 GB of virtual memory, much more than the 95 MB you are seeing. I clicked repeatedly on the TriggerSoftware button and watched the Virtual Memory in "top". It moved up and down a bit, but I did not see any upward trend.

rettigl commented 3 years ago

Hi Mark, thanks for your questions.

Because this camera cannot set the frame rate independently of the exposure time, I have built a work around ...

Are you sure about that? Can you post screen shots of the main ADSpinnaker screen and the camera-specific screens?

The options to do that are there, but the camera does not want to do that. ADSpinnaker_Screen This is the main screen of the ADSpinnaker for my camera. If I set ExposureAuto to Off, and Frame Rate Enable to On, and then set a FrameRate, I get an error in the Epics shell: epics> 2021/05/19 08:55:05.510 Param[GC_FRAMERATE] GenICamFeature::write: node AcquisitionFrameRate is not writable I also cannot change the frame rate in the independently in the FLIR software, so I think it is a limitation of the camera.

That works out to about 2 KB per image, which is quite small.

I think it is not related to the actual size of the image, because it does not change if I change the binning or region size, as far as I can tell.

What happens if you use a normal internal trigger and acquire at 10 frames/s. Does the memory usage also increase at 2 KB per image?

Do you just mean free running? That's what I did in the example above. Actually, in all the example above the Trigger was actually off, and the camera was free running at 130 Hz (with a small region size as you can see in the screen shot). If I do not process the SoftwareTrigger PV, the memory stays ~constant.

What command did you run to get the memory usage you plotted above? With "top" I see this:

I used a profiling bash snipped based on ps, which records the "resident set size", i.e. the non-swapped memory. This corresponds to the RES column in top : logpid() { while sleep 60; do ps -p $1 -o pcpu= -o rss= ; done; }

The virtual memory actually does not increase, it appears.

So I am seeing the spinnakerApp using about 7.8 GB of virtual memory, much more than the 95 MB you are seeing. I clicked repeatedly on the TriggerSoftware button and watched the Virtual Memory in "top". It moved up and down a bit, but I did not see any upward trend.

I am using a fairly small image size (because of a small region size), maybe that makes a difference. The relevant parts of the ioc startup file are these:

# Really large queue so we can stream to disk at full camera speed
epicsEnvSet("QSIZE",  "2000")
# The maximim image width; used for row profiles in the NDPluginStats plugin
epicsEnvSet("XSIZE",  "2592")
# The maximim image height; used for column profiles in the NDPluginStats plugin
epicsEnvSet("YSIZE",  "1944")
# The maximum number of time series points in the NDPluginStats plugin
epicsEnvSet("NCHANS", "2048")
# The maximum number of frames buffered in the NDPluginCircularBuff plugin
epicsEnvSet("CBUFFS", "500")
# Define NELEMENTS to be enough for a 2592x1944x3 (color) image
epicsEnvSet("NELEMENTS", "15116544")
# ADSpinnakerConfig(const char *portName, const char *cameraId, int numSPBuffers,
#                   size_t maxMemory, int priority, int stackSize)
ADSpinnakerConfig("$(PORT)", $(CAMERA_ID),40)
# Main database.  This just loads and modifies ADBase.template
dbLoadRecords("$(ADSPINNAKER)/db/spinnaker.template", "P=$(PREFIX),R=cam1:,PORT=$(PORT)")
# Load the autogenerated file of GenICam features
dbLoadRecords("$(GENICAM_DB_FILE)", "P=$(PREFIX),R=cam1:,PORT=$(PORT)")
# Create a standard arrays plugin
NDStdArraysConfigure("Image1", 5, 0, "$(PORT)", 0, 0)
# Use this line for 8-bit or 16-bit data
dbLoadRecords("$(ADCORE)/db/NDStdArrays.template", "P=$(PREFIX),R=image1:,PORT=Image1,ADDR=0,TIMEOUT=1,NDARRAY_PORT=$(PORT),TYPE=Int16,FTVL=SHORT,NELEMENTS=$(NELEMENTS)")

With the trigger scanning off, I see also with top and initial small increase in memory consumption, and then a flattening at ~95 mbytes (5 sec interval):

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
10795 epics     20   0 2754276  90824  38248 S  73.3  2.4   0:16.13 st.cmd
10795 epics     20   0 2754276  91124  38248 S  66.2  2.4   0:19.44 st.cmd
10795 epics     20   0 2754276  91812  38248 S  62.9  2.4   0:22.59 st.cmd
10795 epics     20   0 2754276  92524  38248 S  67.0  2.4   0:25.94 st.cmd
10795 epics     20   0 2746080  92756  38248 S  67.1  2.4   0:29.30 st.cmd
10795 epics     20   0 2754276  92736  38248 S  67.0  2.4   0:32.65 st.cmd
10795 epics     20   0 2746080  92760  38248 S  66.6  2.4   0:35.98 st.cmd
10795 epics     20   0 2754276  92776  38248 S  63.9  2.4   0:39.18 st.cmd
10795 epics     20   0 2754276  92756  38248 S  64.6  2.4   0:42.41 st.cmd
10795 epics     20   0 2746080  92764  38248 S  66.7  2.4   0:45.75 st.cmd
10795 epics     20   0 2754276  93064  38248 S  65.6  2.4   0:49.03 st.cmd
10795 epics     20   0 2746080  93040  38248 S  63.8  2.4   0:52.22 st.cmd
10795 epics     20   0 2754276  93016  38248 S  63.5  2.4   0:55.40 st.cmd
10795 epics     20   0 2754564  93636  38248 S  66.6  2.4   0:58.73 st.cmd
10795 epics     20   0 2746368  93604  38248 S  65.1  2.4   1:01.99 st.cmd
10795 epics     20   0 2754564  93892  38248 S  65.8  2.4   1:05.28 st.cmd
10795 epics     20   0 2746368  93860  38248 S  65.3  2.4   1:08.55 st.cmd
10795 epics     20   0 2754564  94096  38248 S  66.2  2.5   1:11.86 st.cmd
10795 epics     20   0 2754564  94248  38248 S  66.1  2.5   1:15.17 st.cmd
10795 epics     20   0 2746368  94216  38248 S  64.4  2.5   1:18.39 st.cmd
10795 epics     20   0 2754564  94524  38248 S  63.9  2.5   1:21.59 st.cmd
10795 epics     20   0 2746368  94492  38248 S  65.6  2.5   1:24.87 st.cmd
10795 epics     20   0 2754564  94468  38248 S  66.1  2.5   1:28.18 st.cmd
10795 epics     20   0 2754564  94444  38248 S  65.8  2.5   1:31.47 st.cmd
10795 epics     20   0 2746368  94412  38248 S  65.8  2.5   1:34.76 st.cmd
10795 epics     20   0 2754564  94388  38248 S  67.3  2.5   1:38.13 st.cmd
10795 epics     20   0 2746368  94732  38248 S  66.4  2.5   1:41.45 st.cmd
10795 epics     20   0 2754564  94972  38248 S  64.3  2.5   1:44.67 st.cmd
10795 epics     20   0 2754564  94948  38248 S  66.4  2.5   1:47.99 st.cmd
10795 epics     20   0 2746368  94916  38248 S  64.4  2.5   1:51.21 st.cmd
10795 epics     20   0 2754564  94984  38248 S  63.8  2.5   1:54.40 st.cmd
10795 epics     20   0 2746368  94952  38248 S  66.7  2.5   1:57.74 st.cmd
10795 epics     20   0 2754564  95248  38248 S  67.2  2.5   2:01.10 st.cmd
10795 epics     20   0 2754564  95224  38248 S  66.9  2.5   2:04.45 st.cmd
10795 epics     20   0 2746368  95192  38248 S  66.4  2.5   2:07.77 st.cmd
10795 epics     20   0 2754564  95488  38248 S  66.6  2.5   2:11.10 st.cmd
10795 epics     20   0 2746368  95456  38248 S  67.3  2.5   2:14.47 st.cmd
10795 epics     20   0 2754564  95432  38248 S  65.1  2.5   2:17.73 st.cmd
10795 epics     20   0 2746368  95604  38248 S  65.4  2.5   2:21.00 st.cmd
10795 epics     20   0 2754564  95580  38248 S  66.7  2.5   2:24.34 st.cmd
10795 epics     20   0 2754564  95556  38248 S  66.6  2.5   2:27.67 st.cmd
10795 epics     20   0 2746368  95528  38248 S  66.0  2.5   2:30.97 st.cmd
10795 epics     20   0 2754564  95504  38248 S  67.7  2.5   2:34.36 st.cmd
10795 epics     20   0 2746368  95472  38248 S  65.2  2.5   2:37.62 st.cmd
10795 epics     20   0 2754564  95448  38248 S  65.4  2.5   2:40.89 st.cmd
10795 epics     20   0 2754564  95424  38248 S  66.2  2.5   2:44.20 st.cmd
10795 epics     20   0 2746368  95392  38248 S  65.1  2.5   2:47.46 st.cmd
10795 epics     20   0 2754564  95628  38248 S  63.8  2.5   2:50.65 st.cmd
10795 epics     20   0 2746368  95596  38248 S  64.9  2.5   2:53.90 st.cmd
10795 epics     20   0 2754564  95572  38248 S  65.8  2.5   2:57.19 st.cmd
10795 epics     20   0 2754564  95548  38248 S  67.4  2.5   3:00.56 st.cmd
10795 epics     20   0 2746368  95516  38248 S  64.3  2.5   3:03.78 st.cmd
10795 epics     20   0 2754564  95492  38248 S  65.6  2.5   3:07.06 st.cmd
10795 epics     20   0 2746368  95460  38248 S  64.9  2.5   3:10.31 st.cmd
10795 epics     20   0 2754564  95436  38248 S  65.8  2.5   3:13.60 st.cmd
10795 epics     20   0 2754564  95412  38248 S  64.3  2.5   3:16.82 st.cmd
10795 epics     20   0 2754564  95380  38248 S  64.0  2.5   3:20.02 st.cmd
10795 epics     20   0 2754564  95620  38248 S  63.4  2.5   3:23.19 st.cmd
10795 epics     20   0 2746368  95588  38248 S  62.9  2.5   3:26.34 st.cmd
10795 epics     20   0 2754564  95564  38248 S  64.2  2.5   3:29.55 st.cmd
10795 epics     20   0 2746368  95540  38248 S  64.9  2.5   3:32.80 st.cmd
10795 epics     20   0 2754564  95516  38248 S  67.0  2.5   3:36.15 st.cmd
10795 epics     20   0 2754564  95492  38248 S  66.1  2.5   3:39.46 st.cmd
10795 epics     20   0 2746368  95460  38248 S  65.8  2.5   3:42.75 st.cmd
10795 epics     20   0 2754564  95436  38248 S  65.4  2.5   3:46.02 st.cmd
10795 epics     20   0 2754564  95616  38248 S  66.6  2.5   3:49.35 st.cmd
10795 epics     20   0 2746368  95584  38248 S  66.1  2.5   3:52.66 st.cmd
10795 epics     20   0 2754564  95560  38248 S  66.4  2.5   3:55.98 st.cmd
10795 epics     20   0 2746368  95528  38248 S  65.7  2.5   3:59.27 st.cmd
10795 epics     20   0 2754564  95504  38248 S  64.2  2.5   4:02.48 st.cmd
10795 epics     20   0 2754564  95480  38248 S  65.2  2.5   4:05.74 st.cmd
10795 epics     20   0 2746368  95624  38248 S  67.5  2.5   4:09.12 st.cmd
10795 epics     20   0 2754564  95600  38248 S  66.8  2.5   4:12.46 st.cmd
10795 epics     20   0 2746368  95568  38248 S  64.7  2.5   4:15.70 st.cmd

Now I start manually clicking the button 500 times:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
10795 epics     20   0 2746368  95424  38248 S  66.7  2.5   5:56.66 st.cmd
10795 epics     20   0 2754564  95400  38248 S  64.1  2.5   5:59.87 st.cmd
10795 epics     20   0 2754564  95636  38248 S  66.6  2.5   6:03.20 st.cmd
10795 epics     20   0 2746368  95620  38248 S  67.8  2.5   6:06.59 st.cmd
10795 epics     20   0 2754564  95604  38248 S  67.9  2.5   6:09.99 st.cmd
10795 epics     20   0 2746368  95600  38248 S  68.0  2.5   6:13.39 st.cmd
10795 epics     20   0 2754564  95704  38248 S  68.1  2.5   6:16.80 st.cmd
10795 epics     20   0 2754564  95680  38248 S  68.2  2.5   6:20.21 st.cmd
10795 epics     20   0 2754564  95648  38248 S  69.2  2.5   6:23.67 st.cmd
10795 epics     20   0 2754564  95624  38248 S  68.5  2.5   6:27.10 st.cmd
10795 epics     20   0 2746368  95592  38248 S  67.8  2.5   6:30.49 st.cmd
10795 epics     20   0 2754564  95568  38248 S  67.9  2.5   6:33.89 st.cmd
10795 epics     20   0 2746368  95800  38248 S  68.2  2.5   6:37.30 st.cmd
10795 epics     20   0 2754564  95776  38248 S  67.5  2.5   6:40.68 st.cmd
10795 epics     20   0 2754564  95756  38248 S  68.2  2.5   6:44.09 st.cmd
10795 epics     20   0 2746368  95728  38248 S  68.2  2.5   6:47.50 st.cmd
10795 epics     20   0 2754564  95704  38248 S  68.3  2.5   6:50.92 st.cmd
10795 epics     20   0 2746368  95672  38248 S  67.6  2.5   6:54.30 st.cmd
10795 epics     20   0 2754564  95660  38248 S  67.9  2.5   6:57.70 st.cmd
10795 epics     20   0 2754564  95664  38248 S  68.2  2.5   7:01.11 st.cmd
10795 epics     20   0 2746368  95884  38248 S  67.4  2.5   7:04.48 st.cmd
10795 epics     20   0 2754564  95876  38248 S  68.3  2.5   7:07.90 st.cmd
10795 epics     20   0 2746368  95856  38248 S  67.8  2.5   7:11.29 st.cmd
10795 epics     20   0 2754564  95844  38248 S  68.1  2.5   7:14.70 st.cmd
10795 epics     20   0 2754564  96096  38248 S  69.0  2.5   7:18.15 st.cmd
10795 epics     20   0 2746368  96080  38248 S  68.2  2.5   7:21.56 st.cmd
10795 epics     20   0 2754564  96072  38248 S  67.9  2.5   7:24.96 st.cmd
10795 epics     20   0 2746368  96220  38248 S  68.2  2.5   7:28.37 st.cmd
10795 epics     20   0 2754564  96208  38248 S  67.9  2.5   7:31.77 st.cmd
10795 epics     20   0 2754564  96208  38248 S  67.6  2.5   7:35.15 st.cmd
10795 epics     20   0 2754564  96260  38248 S  68.3  2.5   7:38.57 st.cmd

That's an increase of about 800 kbyte in RES, no change in VIRT.

I repeat the same with scan at 0.1 s:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
10795 epics     20   0 2754564  96488  38248 S  68.4  2.5   9:01.11 st.cmd
10795 epics     20   0 2746368  96456  38248 S  66.5  2.5   9:04.44 st.cmd
10795 epics     20   0 2754564  96440  38248 S  70.2  2.5   9:07.95 st.cmd
10795 epics     20   0 2746368  96520  38248 S  70.4  2.5   9:11.47 st.cmd
10795 epics     20   0 2754564  96768  38248 S  69.7  2.5   9:14.96 st.cmd
10795 epics     20   0 2754564  96824  38248 S  70.0  2.5   9:18.46 st.cmd
10795 epics     20   0 2754564  96792  38248 S  69.7  2.5   9:21.95 st.cmd
10795 epics     20   0 2754564  96752  38248 S  66.2  2.5   9:25.26 st.cmd
10795 epics     20   0 2746368  96724  38248 S  68.0  2.5   9:28.66 st.cmd
10795 epics     20   0 2754564  96700  38248 S  69.7  2.5   9:32.15 st.cmd
10795 epics     20   0 2746368  96668  38248 S  70.0  2.5   9:35.65 st.cmd
10795 epics     20   0 2754564  96644  38248 S  69.7  2.5   9:39.14 st.cmd
10795 epics     20   0 2754564  96848  38248 S  70.2  2.5   9:42.65 st.cmd
10795 epics     20   0 2746368  96812  38248 S  69.2  2.5   9:46.11 st.cmd
10795 epics     20   0 2754564  96828  38248 S  70.1  2.5   9:49.62 st.cmd
10795 epics     20   0 2746368  96796  38248 S  69.6  2.5   9:53.10 st.cmd
10795 epics     20   0 2754564  96820  38248 S  70.5  2.5   9:56.63 st.cmd
10795 epics     20   0 2754564  96832  38248 S  69.6  2.5  10:00.11 st.cmd
10795 epics     20   0 2746368  96808  38248 S  70.5  2.5  10:03.64 st.cmd
10795 epics     20   0 2754564  97148  38248 S  69.6  2.5  10:07.12 st.cmd
10795 epics     20   0 2746368  97612  38248 S  69.7  2.5  10:10.61 st.cmd
10795 epics     20   0 2754564  97668  38248 S  69.6  2.5  10:14.09 st.cmd
10795 epics     20   0 2754564  97644  38248 S  69.6  2.5  10:17.57 st.cmd
10795 epics     20   0 2746368  97700  38248 S  69.9  2.5  10:21.07 st.cmd
10795 epics     20   0 2754564  97676  38248 S  70.0  2.5  10:24.57 st.cmd
10795 epics     20   0 2746368  97644  38248 S  69.9  2.5  10:28.07 st.cmd
10795 epics     20   0 2754564  97620  38248 S  70.2  2.5  10:31.58 st.cmd
10795 epics     20   0 2754564  97596  38248 S  69.2  2.5  10:35.04 st.cmd
10795 epics     20   0 2746368  97564  38248 S  69.7  2.5  10:38.53 st.cmd
10795 epics     20   0 2754564  97800  38248 S  70.0  2.6  10:42.03 st.cmd
10795 epics     20   0 2746368  97768  38248 S  69.3  2.6  10:45.50 st.cmd
10795 epics     20   0 2754564  97780  38248 S  69.0  2.6  10:48.95 st.cmd
10795 epics     20   0 2754564  97772  38248 S  70.6  2.6  10:52.48 st.cmd
10795 epics     20   0 2754564  97776  38248 S  70.3  2.6  10:56.00 st.cmd
10795 epics     20   0 2754564  97768  38248 S  69.6  2.6  10:59.48 st.cmd
10795 epics     20   0 2746368  97748  38248 S  70.3  2.6  11:03.00 st.cmd
10795 epics     20   0 2754564  97740  38248 S  69.8  2.6  11:06.49 st.cmd

RES continuously grows, VIRT is not affected, as far as I can tell.

MarkRivers commented 3 years ago

epics> 2021/05/19 08:55:05.510 Param[GC_FRAMERATE] GenICamFeature::write: node AcquisitionFrameRate is not writable I also cannot change the frame rate in the independently in the FLIR software, so I think it is a limitation of the camera.

I think that your problem is that some FLIR cameras have a GenICam feature called AcquisitionFrameRateAuto. I suspect that feature is set to On on your camera. In that case you cannot control the frame rate. Look at the second medm screen shot on this page for an example with the BFLY-PGE-20E4C camera: https://areadetector.github.io/master/ADGenICam/ADGenICam.html. Note that AcquisitionFrameRateAuto is set to Off on that screen. That is what you need to do.

RES continuously grows, VIRT is not affected, as far as I can tell.

Then that is not a memory leak. Memory leaks cause VIRT to grow, because that is the total memory that has been allocated but not freed. RES is resident memory. The OS controls that to optimize performance, and so it can grow as new memory locations are written to.

rettigl commented 3 years ago

I think that your problem is that some FLIR cameras have a GenICam feature called AcquisitionFrameRateAuto. I suspect that feature is set to On on your camera. In that case you cannot control the frame rate. Look at the second medm screen shot on this page for an example with the BFLY-PGE-20E4C camera: https://areadetector.github.io/master/ADGenICam/ADGenICam.html. Note that AcquisitionFrameRateAuto is set to Off on that screen. That is what you need to do. Yes, you are right that there is this AquisitionFramRateAuto setting. ADSpinnaker_Feature1 This indeed allows some control, but still not what I need. Here is what I observe:

Then that is not a memory leak. Memory leaks cause VIRT to grow, because that is the total memory that has been allocated but not freed. RES is resident memory. The OS controls that to optimize performance, and so it can grow as new memory locations are written to.

RES is the actual physical memory used by the program, no? So whatever it is, it's causing the IOC to use more and more physical memory, and the computer with 3-4 of these iocs running for several days went out of memory and started to show huge load and became ineccessible, that's how I became aware of the problem. So definitely something I need to solve if I want to keep using this approach. And so far, I don't see a different one with my camera.

MarkRivers commented 3 years ago

If I set AquisitionFramRateAuto to off and AquisitionFramRateEnable to on, and also Exposure auto to off, I can control expsoure time and Frame rate independently, but, with the following limitations: Maximal exposure time is capped at 66 ms, and minimal frame rate at 15.148 Hz:

Are you sure that you are setting the parameters in the correct order? For example if FrameRate=20 then you cannot set the exposure time above 0.05. But if you first decrease the frame rate to 2 then you should be able to increase the exposure time to 0.5. I just tested this on a Blackfly camera with ADSpinnaker and it worked fine.

You had not previously reported that RES kept increasing forever and was causing a problem. That does seem incorrect. I will test and see if I can reproduce the problem.

MarkRivers commented 3 years ago

I ran 2 tests with my BlackFly camera, 1600x1200 pixels, Mono8 mode.

So I can reproduce the problem, RES does not increase when using the internal frame rate, but it does increase when using the software trigger.

There is really no code in ADSpinnaker that does anything when the TriggerSoftware record is processed. It just writes to the TriggerSoftware GenICam feature, and then everything is done in the Spinnaker library. I don't see any place there can be a memory leak in the areaDetector code.

I suggest running the FLIR viewer and seeing whether RES also increases when you use SoftwareTrigger with that.

rettigl commented 3 years ago

Hi Mark, thanks again for taking this on. Without finding time yet to do more tests myself, I had a glance at the release notes of the Spinnaker SDK. In version 2.3.0.77, they mention a fixed issue:

The ADSpinnaker currently used 2.0.0.147, no? so this could be related to the problem. I will have a closer look once I find time.

MarkRivers commented 3 years ago

I will have a closer look once I find time.

I would suggest also trying to see if you can make FrameRate and AcquireTime work with normal triggered acquisition. I am able to control those over the full range on a BlackFly as long as I change them in the correct order.

MarkRivers commented 3 years ago

I agree that updating the Spinnaker SDK is the first thing to try.

rettigl commented 3 years ago

Are you sure that you are setting the parameters in the correct order? For example if FrameRate=20 then you cannot set the exposure time above 0.05. But if you first decrease the frame rate to 2 then you should be able to increase the exposure time to 0.5. I just tested this on a Blackfly camera with ADSpinnaker and it worked fine.

I tried again what you suggested. AquisitionFramRateAuto=0, AquisitionFramRateEnable=1 Now, when I try to set the frame rate to e.g. 10 Hz, it will set it to 15.148 Hz. See my screenshot above. No matter what I do. And I cannot increase the Exposure time above 0.066 s. What I can reproduce is that if I set frame rate e.g. to 40 Hz, then the highest exposure time is 0.02 s. But I think the limitation in Frame rate comes from the camera model.

rettigl commented 3 years ago

Just changing the libraries to the new version in the ioc did not do the trick, apparently they changed something in the SDK. It builds, but produces the following error when running:

ADSpinnakerConfig("SP1", 16119088,40)
c++ error: Spinnaker: System instance cannot be acquired. Could not load producer. Make sure that the environment variable FLIR_GENTL64_CTI exists, and points to the location of the file FLIR_GenTL.cti [-1012]

Just setting this variable to the file copied from the respective package also does not help yet...

MarkRivers commented 3 years ago

I have created a new spinnaker2.4 branch in this repository. It contains the files from the latest version of the Spinnaker SDK, 2.4.0.143. It builds and runs fine for me on Ubuntu 18.

I tested first acquiring 1000 frames in TriggerMode=Off, 0.1 second acquisition. When VIRT RES
Before 7708904 317868
After 7708904 317868

There was no increase in VIRT or RES during this acquisition.

I then repeated the testing using TriggerMode=On, TriggerSource=Software, and an IDL script that wrote 1 to the TriggerSoftware PV 1000 times.

When VIRT RES
Before 7708904 318048
After 7708904 320508

RES has increased by 2460 KB in 1000 images, or about 2.5 KB per image. This is the same as with the previous version of Spinnaker, 2.0.0.147. So the new version of Spinnaker does not seem to have fixed the problem.

MarkRivers commented 3 years ago

Run on Windows, SDK 2.4.0.144, Trigger=Off.

Before

Name    PID Status  User name   CPU Memory (private working set)    Commit size Description
spinnakerApp.exe    11632   Running rivers  00  264,676 K   303,500 K   SpinnakerApp

After

Name    PID Status  User name   CPU Memory (private working set)    Commit size Description
spinnakerApp.exe    11632   Running rivers  00  264,636 K   303,444 K   SpinnakerApp

So there is not increase in private working set of Commit size when TriggerMode=Off.

I then repeated the testing using TriggerMode=On, TriggerSource=Software, and an IDL script that wrote 1 to the TriggerSoftware PV 1000 times.

Before

Name    PID Status  User name   CPU Memory (private working set)    Commit size Description
spinnakerApp.exe    11632   Running rivers  01  264,636 K   303,436 K   SpinnakerApp

After

Name    PID Status  User name   CPU Memory (private working set)    Commit size Description
spinnakerApp.exe    11632   Running rivers  00  266,348 K   305,184 K   SpinnakerApp

So both the private working set and the Commit size increased by about 1700 KB for 1000 images, or about 1.7 KB per image.

It seems that this problem occurs on both Linux and Windows.

rettigl commented 3 years ago

I've tested with a small program derived from the BufferHandling.cpp delivered with the SDK/also in ADSpinnaker. I configured it to just trigger and read one camera 1000x, and here I saw no memory increase during the read loop, just some small increase before.

I also could not reproduce the problem with the spinView QT application. So maybe the problem is not inside the SDK after all? Here is the code I used: TriggerDebug.zip

rettigl commented 3 years ago

I tested the behavior with ADAravis, and observed the same thing. So it appears likely to me this is something rooted in the ADGenICam code, rather than in the code of the SDK.

MarkRivers commented 3 years ago

I tested the behavior with ADAravis, and observed the same thing.

That's what I was going to test next, so thanks for doing it,

I have done another test which I believe proves that the issue has nothing to do with the TriggerSoftware PV, or even with image acquisition at all. I wrote a script which continuously writes 0.05 to the AcquireTime PV at about 100Hz, when TriggerMode=Off and acquisition is stopped. I see a growth in RES that continues as long as the script is running. The ReadStatus.SCAN PV was Passive when doing this.

I can also reproduce the problem by simply setting ReadStatus.SCAN to 0.1 second. I see RES increase by about 256 KB every few 10s of seconds.

This looks like a memory leak in the code that reads GenICam feature values.

MarkRivers commented 3 years ago

I believe I have fixed the problem. The problem occurred with GenICam enum features. The main problem was that epicsStrDup was being called when it did not need to be, and the duplicated string was never freed. I also fixed an issue where unneeded enum callbacks were being done, and I made the code internally consistent regarding when an enum feature is set to have a single "N.A." state, i.e. when it is not implemented, or is not available, or is not readable.

I have pushed the fix to the master branch.

MarkRivers commented 3 years ago

I have merged the spinnaker2.4 branch into master, so it is now using SDK version 2.4.0.143 (Linux) and 2.4.0.144 (Windows).

rettigl commented 3 years ago

I believe I have fixed the problem. The problem occurred with GenICam enum features. The main problem was that epicsStrDup was being called when it did not need to be, and the duplicated string was never freed. I also fixed an issue where unneeded enum callbacks were being done, and I made the code internally consistent regarding when an enum feature is set to have a single "N.A." state, i.e. when it is not implemented, or is not available, or is not readable.

I have pushed the fix to the master branch.

Great, thanks for looking into this. I will test this and let you know.

I have merged the spinnaker2.4 branch into master, so it is now using SDK version 2.4.0.143 (Linux) and 2.4.0.144 (Windows).

I compiled the new 2.4 version of the ADSpinnaker on linux, but still could not get it to run with my camera. I get a segfault:

# ADSpinnakerConfig(const char *portName, const char *cameraId, int numSPBuffers,
#                   size_t maxMemory, int priority, int stackSize)
ADSpinnakerConfig("SP1", 16119088,1)
Segmentation fault (core dumped)

(independent on the stackSize)

MarkRivers commented 3 years ago

ADSpinnakerConfig("SP1", 16119088,1)

In that command you are setting numSPBuffers=1. Try removing the last argument completely, which will then use the default of 100 buffers.

If it still crashes then run the IOC under gdb

gdb ../../bin/linux-x86_64/spinnakerApp
run st.cmd
(wait to get the segfault)
backtrace
MarkRivers commented 3 years ago

@rettigl be sure to do the following when you rebuild the new release of ADSpinnaker:

make clean uninstall
make -sj
MarkRivers commented 3 years ago

I forgot to mention that the fix for the memory leak is not in ADSpinnaker, it is in ADGenICam. You also need to get the master branch of ADGenICam and build that before you build ADSpinnaker.

rettigl commented 3 years ago

@rettigl be sure to do the following when you rebuild the new release of ADSpinnaker:

make clean uninstall
make -sj

Thanks, that apparently did the trick. I did a "make clean" before, but not uninstall. Did not know this was necessary... Now it runs.

In that command you are setting numSPBuffers=1. Try removing the last argument completely, which will then use the default of 100 buffers.

With my camera, I experienced that the default of 100 Framebuffers does not work. Depending on settings, only values of 20-40 or so work:

ADSpinnakerConfig("SP1", 16119088)
2021/05/26 15:58:12.649 ADSpinnaker::connectCamera exception Spinnaker: GenICam::OutOfRangeException= Value = 100 must be equal or smaller than Max = 26. : OutOfRangeException thrown in node 'StreamBufferCountManual' while calling 'StreamBufferCountManual.SetValue()' (file 'IntegerT.h', line 79) [-2002]
2021/05/26 15:58:12.649 ADSpinnaker:ADSpinnaker:  camera connection failed (3)

I forgot to mention that the fix for the memory leak is not in ADSpinnaker, it is in ADGenICam. You also need to get the master branch of ADGenICam and build that before you build ADSpinnaker.

Thanks, this I surely did before, however also without uninstall, but seemed to have worked.

The memory consumtion looks good now, this is in the triggered mode I used before: w_fixed_ADGenICam

I suppose we can close this then.

MarkRivers commented 3 years ago

I did a "make clean" before, but not uninstall. Did not know this was necessary...

"uninstall" is not normally necessary. But in this case the vendor spinnaker library in spinnakerSupport/os/linux-x86_64/ uses the normal Linux soft links. When gnumake installs the files the soft links are lost. gnumake was probably confused about the file dates, and so it did not install the new spinnaker library when it should have. "uninstall" deletes the ADSpinnaker/lib directory and that fixes it.