areaDetector / ADGenICam

areaDetector base class for GenICam cameras.
https://areadetector.github.io/areaDetector/ADGenICam/ADGenICam.html
Other
8 stars 17 forks source link

Performance issue reading features with aravis compared to Spinnaker #1

Closed MarkRivers closed 4 years ago

MarkRivers commented 5 years ago

ADGenICam reads back all features from the camera, including the enum strings and values for enum features, under 2 circumstances:

This is working fine, except for an issue with performance with aravis. The following table compares the times to read all of the features on a number of cameras, using both ADAravis (aravis SDK) and ADSpinnaker (FLIR Spinnaker SDK).

Camera Inteface Spinnaker aravis
BlackflyS 13Y3M GigE 0.037 s 1.55 s
Blackfly 20E4C GigE 0.147 s 1.10 s
Oryx 51S5M 10 GigE 0.056 s 11.8 s
Grasshopper3 23S6M USB3 0.017 s 0.085 s

Note that for USB3 aravis is very fast, only 85 ms to poll all of the features. However, for the GigE cameras it is 7-40 times slower than Spinnaker, and for the 10 GigE camera it is 210 times slower. On the Oryx changing a single feature like Gain introduces a delay of 11.8 seconds before the camera can be streaming again. This is not acceptable.

Questions:

The attached text file shows the debugging trace when reading the features. It is the data used in compiling the table above.

FeaturePollPerformance.txt

thomascobb commented 5 years ago

@EmmanuelP can you shed any light on the figures above? Does Genicam specify which features need to be polled when one is updated, or do you think Spinnaker has some inside knowledge in this area?

EmmanuelP commented 5 years ago

Aravis has a mechanism to cache register data value, implementing <cachable> element.

Register cache can also be invalidated by a change in another node (<pInvalidator> element), or after a timeout (<PollingTime> element).

It would be interresting to have the console output of the poll performance tests with ARV_DEBUG set to genicam:all.

MarkRivers commented 5 years ago

I have set the environment variable ARV_DEBUG to genicam:all, but I don't see any output when it is polling. Is there something else I need to do to enable the debugging output?

EmmanuelP commented 5 years ago

Oh, sorry. The right value is genicam:3.

MarkRivers commented 5 years ago

Thanks. I didn't find the ARV_DEBUG output too helpful, since there are no timestamps. I can post the output if you think it would be useful.

However, I can now see which aravis calls are taking a lot of time.

This is the code where I read the enum values and strings, with some debugging prints added:

asynPrint(mSet->getUser(), ASYN_TRACE_ERROR, 
"calling arv_gc_enumeration_get_available_int_values for %s\n", 
mFeatureName.c_str());
    gint64 *values = arv_gc_enumeration_get_available_int_values(enumeration, &numEnums, NULL);
asynPrint(mSet->getUser(), ASYN_TRACE_ERROR, 
"calling arv_gc_enumeration_get_available_string_values for %s\n", 
mFeatureName.c_str());
    const char **strings = arv_gc_enumeration_get_available_string_values(enumeration, &numEnums, NULL);
asynPrint(mSet->getUser(), ASYN_TRACE_ERROR, 
"done calling arv_gc_enumeration_get_available_string_values for %s, numEnums=%u\n", 
mFeatureName.c_str(), numEnums);

This is the output when reading the TriggerSource and ChunkPixelFormat feature enums for a FLIR Oryx 10 Gbit camera.

2019/06/14 18:08:40.088 calling arv_gc_enumeration_get_available_int_values for TriggerSource
2019/06/14 18:08:40.431 calling arv_gc_enumeration_get_available_string_values for TriggerSource
2019/06/14 18:08:40.794 done calling arv_gc_enumeration_get_available_string_values for TriggerSource, numEnums=15

2019/06/14 18:12:59.551 calling arv_gc_enumeration_get_available_int_values for ChunkPixelFormat
2019/06/14 18:12:59.551 calling arv_gc_enumeration_get_available_string_values for ChunkPixelFormat
2019/06/14 18:12:59.551 done calling arv_gc_enumeration_get_available_string_values for ChunkPixelFormat, numEnums=10

Note that it takes 706 ms to read the 15 enum values and strings for TriggerSource but less than 1 ms to read the 10 enum values and strings for CounterEventSource. Why is TriggerSource so slow, but ChunkPixelFormat so fast?

This is the time to read TriggerSource and ChunkPixelFomat on a FLIR Grasshopper 3 with USB3 interface

2019/06/14 18:06:19.615 calling arv_gc_enumeration_get_available_int_values for TriggerSource
2019/06/14 18:06:19.616 calling arv_gc_enumeration_get_available_string_values for TriggerSource
2019/06/14 18:06:19.616 done calling arv_gc_enumeration_get_available_string_values for TriggerSource, numEnums=4

2019/06/14 18:06:19.683 calling arv_gc_enumeration_get_available_int_values for ChunkPixelFormat
2019/06/14 18:06:19.683 calling arv_gc_enumeration_get_available_string_values for ChunkPixelFormat
2019/06/14 18:06:19.683 done calling arv_gc_enumeration_get_available_string_values for ChunkPixelFormat, numEnums=18

Note that is only takes 1 ms to read the 4 enum values and strings for TriggerSource, and less than 1 ms to read the 18 enum values and string for ChunkPixelFormat. It is never slow to read any of the enum choices.

Why is the USB3 interface so much faster than 10 Gbit?

EmmanuelP commented 5 years ago

Thanks. I didn't find the ARV_DEBUG output too helpful, since there are no timestamps. I can post the output if you think it would be useful.

Yes, please. You may want to extend the debug output using genicam:3,device:3.

Note that it takes 706 ms to read the 15 enum values and strings for TriggerSource but less than 1 ms to read the 10 enum values and strings for CounterEventSource. Why is TriggerSource so slow, but ChunkPixelFormat so fast?

Is this repeatable ? If so, it would be interesting to compare the related genicam data.

Why is the USB3 interface so much faster than 10 Gbit

The acknowledge read timeout is 500ms in the GigEVision case. That could explain what you are seeing. How are the cameras connected ? Directly ? Using a shared network ?

MarkRivers commented 5 years ago

The acknowledge read timeout is 500ms in the GigEVision case. That could explain what you are seeing. How are the cameras connected ? Directly ? Using a shared network ?

It is directly connected to a dedicated 10 Gbit network card, no switch.

Is this repeatable ? If so, it would be interesting to compare the related genicam data.

It is very repeatable, the time for a given feature is always nearly the same.

I have now run my application with both ARV_DEBUG="genicam:3 device:3" and my debugging statements when calling arv_gc_enumeration_get_available_int_values and arv_gc_enumeration_get_available_string_values are called, and when arv_gc_enumeration_get_available_string_values is complete. The output is attached. There is also other output from my application that goes to stdout.

Note that my software is only printing messages when reading enums, so there are lots of messages from GcRegisterNode::_get_integer_value when it is reading other types of features which are not preceded by messages from my software.

If you search for the string "calling" that will find my debuging statements. The first one is at line 5478:

calling arv_gc_enumeration_get_available_int_values for TriggerSource
[GcRegisterNode::_get_integer_value] reglsb = 1, regmsb, 1, lsb = 30, msb = 30
[GcRegisterNode::_get_integer_value] value = 0xe8000005
...

For the TriggerSource attribute those GcRegisterNode::_get_integer_value messages continue until line 7192, so almost 1800 lines until this:

[GcRegisterNode::_get_integer_value] mask  = 0x80000000
[GcRegisterNode::_get_integer_value] address = 0xc8c00, value = 0x1
[GcRegisterNode::_get_integer_value] address = 0xc8c30, value = 0x1
calling arv_gc_enumeration_get_available_string_values for TriggerSource
[GcRegisterNode::_get_integer_value] reglsb = 1, regmsb, 1, lsb = 30, msb = 30
[GcRegisterNode::_get_integer_value] value = 0xe8000005
[GcRegisterNode::_get_integer_value] mask  = 0x40000000

arv_gc_enumeration_get_available_string_values then results in more GcRegisterNode::_get_integer_value messages until line 8912, so about 1700 messages until this:

[GcRegisterNode::_get_integer_value] mask  = 0x80000000
[GcRegisterNode::_get_integer_value] address = 0xc8c00, value = 0x1
[GcRegisterNode::_get_integer_value] address = 0xc8c30, value = 0x1
done calling arv_gc_enumeration_get_available_string_values for TriggerSource, numEnums=15
[GcRegisterNode::_get_integer_value] reglsb = 1, regmsb, 1, lsb = 30, msb = 30
[GcRegisterNode::_get_integer_value] value = 0xe8000005

Interestingly when reading the enum ints and strings for ChunkPixelFormat there are no messages from GcRegisterNode::_get_integer_value

GcRegisterNode::_get_integer_value] mask  = 0x00000400
[GcRegisterNode::_get_integer_value] address = 0x150050, value = 0x1
calling arv_gc_enumeration_get_available_int_values for ChunkPixelFormat
calling arv_gc_enumeration_get_available_string_values for ChunkPixelFormat
done calling arv_gc_enumeration_get_available_string_values for ChunkPixelFormat, numEnums=10
[GcRegisterNode::_get_integer_value] reglsb = 21, regmsb, 21, lsb = 10, msb = 10
[GcRegisterNode::_get_integer_value] value = 0x000077ff

So it appears that the features that are slow have >1500 messages from GcRegisterNode::_get_integer_value while those that are fast result in few or no messages from GcRegisterNode::_get_integer_value.

arv_debug.txt

EmmanuelP commented 5 years ago

I've just had a look at cache handling in arvgcregisternode.c. The implementation is broken in several places. The default value is not correctly set (it is always set to NoCache, which is wrong), and the Invalidator nodes are not used.

I guess that explains the performance issue here.

I have opened an new aravis issue: https://github.com/AravisProject/aravis/issues/253

EmmanuelP commented 5 years ago

Mark, could you send me privately the genicam data (emmanuel gnome org), please ?

Thanks.

MarkRivers commented 5 years ago

Emmanuel has fixed the register caching problem in aravis on the master branch as of late yesterday. The details are in https://github.com/AravisProject/aravis/issues/253

The time to read all of the features on the BlackFlyS in the above table has decreased from 1.55 seconds to 0.036 seconds. The time for the Oryx camera has decreased from 11.8 seconds to 0.096 seconds. These times are now quite comparable to Spinnaker, and are fine.

Thanks Emmanuel!

MarkRivers commented 4 years ago

This is fixed in aravis 0.7.2 and later.