epics-modules / asyn

EPICS module for driver and device support
http://epics-modules.github.io/asyn/

ASYN is not ready for parallel callback processing (and overflows callback queue) #170

Open ralphlange opened 1 year ago

ralphlange commented 1 year ago

ASYN implements an internal FIFO queue between the driver and the device support to help the IOC database cope with bursts of incoming values. The default size is 10; it is configurable per record instance (info tag "asyn:FIFO").

The current implementation requests record processing every time a new value is pushed into the FIFO queue. To cover the worst case (an IOC that is really busy), this needs room in the EPICS callback queue for FIFO_queue_size × no_of_ASYN_records requests (10 times the number of ASYN records at the default FIFO size).
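For illustration (the record count here is assumed): an IOC with 20,000 ASYN records at the default FIFO size of 10 must budget for 200,000 callback queue entries to be safe.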

At ITER, we have been forced to set IOCs to callback queue sizes >200k because of this somewhat unpredictable behavior.

An implementation that requests record processing only if the FIFO queue is non-empty after a value has been pulled from it would need room in the EPICS callback queue for just no_of_ASYN_records requests (the number of ASYN records, independent of the FIFO sizes).

This would lead to much more predictable behavior on IOCs with many ASYN records.
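A minimal sketch of such a pull-driven scheme, assuming nothing about asyn internals: the names (`RecordFifo`, `requestProcessing`) are illustrative, and `requestProcessing` stands in for queueing a record-processing callback (e.g. via callbackRequestProcessCallback in a real device support). The driver side requests processing only on the empty-to-non-empty transition; the device support side re-requests after a pull if values remain.

```cpp
// Illustrative sketch only, not asyn code. The producer (driver) asks for
// record processing only when the FIFO goes from empty to non-empty; the
// consumer (device support) re-requests processing after a pull if values
// remain. Result: at most one request per record in the callback queue.
#include <cstdio>
#include <mutex>
#include <queue>

class RecordFifo {
public:
    // Driver side: push a value; schedule processing only on the
    // empty -> non-empty transition.
    void push(double value) {
        bool wasEmpty;
        {
            std::lock_guard<std::mutex> guard(mutex_);
            wasEmpty = fifo_.empty();
            fifo_.push(value);
        }
        if (wasEmpty)
            requestProcessing();
    }

    // Device support side: called from record processing; pull one value
    // and re-request processing if more values are waiting.
    bool pull(double &value) {
        bool more;
        {
            std::lock_guard<std::mutex> guard(mutex_);
            if (fifo_.empty())
                return false;
            value = fifo_.front();
            fifo_.pop();
            more = !fifo_.empty();
        }
        // Simplified: a production version must also handle the race
        // between a concurrent push and this post-pull re-request.
        if (more)
            requestProcessing();
        return true;
    }

private:
    // Stand-in for queueing a record-processing callback.
    void requestProcessing() { std::puts("processing requested"); }

    std::mutex mutex_;
    std::queue<double> fifo_;
};

int main() {
    RecordFifo fifo;
    fifo.push(1.0); // empty -> non-empty: one request
    fifo.push(2.0); // request already pending: no new request
    double v;
    while (fifo.pull(v))
        std::printf("processed %g\n", v);
}
```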

ralphlange commented 4 months ago

Things are actually worse: This means that ASYN is not ready for callback parallelization.

With the same record multiple times in the callback queue, multiple threads can (and will) start processing the same record, locking each other out through the EPICS lockset mechanism, thus degrading the overall performance.

There might also be some locking issues lurking in ASYN’s Parameter library... At ITER we have seen ASYN port driver performance going noticeably down when parallel callback threads are enabled.
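To illustrate why more callback threads can mean less throughput, here is a toy, self-contained demonstration (not asyn code; the shared mutex stands in for a lockset or per-port lock): the total work is constant, but because every "record processing" step runs under the same lock, adding threads cannot speed it up, and contention typically makes it slower.

```cpp
// Toy demonstration: N threads doing a fixed total amount of work that is
// entirely serialized by one shared lock (as a lockset or per-port mutex
// would). With perfect scaling the time would drop as threads are added;
// here it stays flat or rises due to contention.
#include <chrono>
#include <cstdio>
#include <mutex>
#include <thread>
#include <vector>

std::mutex sharedLock;   // stands in for a lockset / per-port mutex
volatile long sink = 0;  // keeps the "work" from being optimized away

void worker(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        std::lock_guard<std::mutex> guard(sharedLock);
        for (int j = 0; j < 1000; ++j) // "record processing" under the lock
            sink += j;
    }
}

int main() {
    const int threadCounts[] = {1, 2, 4, 8};
    for (int nThreads : threadCounts) {
        auto start = std::chrono::steady_clock::now();
        std::vector<std::thread> threads;
        for (int t = 0; t < nThreads; ++t)
            threads.emplace_back(worker, 100000 / nThreads);
        for (auto &t : threads)
            t.join();
        std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start;
        std::printf("%d threads: %.3f s\n", nThreads, elapsed.count());
    }
}
```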

exzombie commented 3 months ago

Discussion @ Codeathon 2024: to make things work with callback parallelization, the asyn internal queue should submit only one callback at a time to the EPICS queue, waiting for its completion before submitting the next one. This avoids flooding the threads with callbacks that lock the same record.
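A compact sketch of such a gate, as a hypothetical illustration rather than an actual asyn change (names and API are assumptions): submit() tells the caller whether to enqueue a callback, and onCallbackDone() resubmits if more work arrived while the callback was in flight.

```cpp
// Hypothetical "one callback in flight" gate; names and API are assumptions.
#include <atomic>

class CallbackGate {
public:
    // Returns true if the caller should actually enqueue a callback,
    // i.e. none is currently pending or running.
    bool submit() { return !inFlight_.exchange(true); }

    // Call when the callback has finished. Returns true if the caller
    // should enqueue another callback because work remains.
    // Simplified: a production version must recheck the work queue
    // under a lock to avoid a lost wakeup between store() and submit().
    bool onCallbackDone(bool workRemains) {
        inFlight_.store(false);
        return workRemains && submit();
    }

private:
    std::atomic<bool> inFlight_{false};
};
```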

ralphlange commented 3 months ago

Pointers to the mentioned C++ solution used in the OPC UA Device Support. (Just FYI, not overly useful here.)

Update and UpdateQueue are template classes for an update (timestamp plus data plus metadata) and a queue of those. https://github.com/epics-modules/opcua/tree/master/devOpcuaSup

The lower level pushes new data or events into the queue and requests processing if it was the first item (the queue was empty). https://github.com/epics-modules/opcua/blob/126fe487f4243c820aa24deb7143e03512dd93bb/devOpcuaSup/UaSdk/DataElementUaSdk.cpp#L223

After processing, the upper level (device support) requests another processing if the queue is not empty. https://github.com/epics-modules/opcua/blob/126fe487f4243c820aa24deb7143e03512dd93bb/devOpcuaSup/devOpcua.cpp#L621
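In both places the invariant is the same one proposed above: each record has at most one processing request in flight, so the per-record queue only bounds how much data can be buffered, while the load on the EPICS callback queue is bounded by the number of records.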

exzombie commented 3 months ago

> At ITER we have seen ASYN port driver performance going noticeably down when parallel callback threads are enabled.

Thinking about this some more: the discussion up to this point was about a high interrupt rate that floods the queue with requests pertaining to a single record (or a few of them). However, I'm pretty sure that's not the whole story. The quoted statement refers to a different situation.[^fn] There, the interrupt rate is low (once per second in our tests), but the number of records is large. Moreover, the records are not linked anywhere, so they don't share locks. And yet, increasing the number of parallel callback threads makes things slower.

[^fn]: Unless I'm mistaken and ITER has a bunch of different drivers affected by this, but from the timing of the comment, it seems likely we have the same thing in mind.

ralphlange commented 3 months ago

IPSi.

Yes, you're right. It sounds more like a per-port lock inside ASYN that holds things up.