FairRootGroup / FairMQ

C++ Message Queuing Library and Framework
GNU Lesser General Public License v3.0

Resume/Interrupt transports consistently #460

Closed (rbx closed 1 year ago)

rbx commented 1 year ago

Solves #428.

rbx commented 1 year ago

The failures happen because there is a race:

  1. Device::ResetWrapper() is clearing the transports container.
  2. Control::RunShutdownSequence() of a plugin is calling ChangeDeviceState(DeviceStateTransition::Auto), which leads to Device::InterruptTransports() via SubscribeToNewTransition(). So, one is accessing transports while the other is deleting it.

For now just some notes:

dennisklein commented 1 year ago

> So, one is accessing transports while the other is deleting it.

I guess we should synchronize those, shouldn't we?
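
As a rough illustration of what synchronizing the two paths could look like, here is a minimal sketch. It is not FairMQ code: `Transport`, `TransportRegistry`, `Reset()`, `InterruptAll()`, and `RunConcurrentResetAndInterrupt()` are hypothetical names made up for this example, standing in for the reset path that clears the container and the interrupt path that iterates over it. The assumption is that a single mutex guarding both operations is acceptable; the actual fix in the PR may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical stand-in for a transport; NOT the FairMQ transport API.
struct Transport {
    bool interrupted = false;
    void Interrupt() { interrupted = true; }
};

// Hypothetical owner of the transports container. One mutex guards both
// clearing the container and iterating over it.
class TransportRegistry {
  public:
    void Add(const std::string& name)
    {
        std::lock_guard<std::mutex> lock(fMutex);
        fTransports[name] = std::make_shared<Transport>();
    }

    // Plays the role of the reset path that clears the container.
    void Reset()
    {
        std::lock_guard<std::mutex> lock(fMutex);
        fTransports.clear();
    }

    // Plays the role of the interrupt path. Returns how many transports
    // were interrupted (0 if the reset ran first).
    std::size_t InterruptAll()
    {
        std::lock_guard<std::mutex> lock(fMutex);
        for (auto& entry : fTransports) {
            entry.second->Interrupt();
        }
        return fTransports.size();
    }

    std::size_t Size() const
    {
        std::lock_guard<std::mutex> lock(fMutex);
        return fTransports.size();
    }

  private:
    mutable std::mutex fMutex;
    std::map<std::string, std::shared_ptr<Transport>> fTransports;
};

// Runs the reset and interrupt paths concurrently and returns the number
// of transports left afterwards.
std::size_t RunConcurrentResetAndInterrupt()
{
    TransportRegistry registry;
    registry.Add("zeromq");   // names are only labels in this sketch
    registry.Add("shmem");

    std::size_t interrupted = 0;
    std::thread resetter([&] { registry.Reset(); });
    std::thread interrupter([&] { interrupted = registry.InterruptAll(); });
    resetter.join();
    interrupter.join();

    // Either order is valid: the interrupter saw both transports or none,
    // but never a half-cleared container.
    assert(interrupted == 0 || interrupted == 2);
    return registry.Size();
}
```

With the lock in place, the interrupt path sees either the full container or an empty one, whichever thread gets the mutex first. The sketch does not address ordering, that is, whether an interrupt arriving after the reset should be ignored or treated as an error.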

rbx commented 1 year ago

> > So, one is accessing transports while the other is deleting it.
>
> I guess we should synchronize those, shouldn't we?

Yeah. Also there is another race condition: (two screenshots attached)