Closed thomasmfish closed 2 years ago
Proposed fix on branch fix-activeCamera-iteration-error
of @thomasmfish fork. The fix makes a copy of the activeLights
and activeCameras
sets before the calls to getExposureTime
and setExposureTime
which avoids the race condition where we are iterating over the set and that set is being changed. See commit thomasmfish/cockpit@f14f509b508caf1339254459b0684ef0afd1418f
I don't think the fix is correct. The copy prevents the exception caused by iterating over a set that is being changed. However, it means that we end up not actually iterating over the currently active cameras/lights. I'd guess some sort of lock would be required to ensure that the set is not modified during this method (or whatever block is the most appropriate for the operation). But I'm not sure under the exact conditions are these methods being called and the implications of missing some cameras/lights.
I agree, I think a queue is probably the way to go, but I don't have time to sort that today and I only have today to deal with all the updates.
I suspect that this is an issue as B24 is using the channel functions that can quickly activate or deactivate a number of lights/cameras etc... I think some kind of lock during channel loading would probably be appropriate and prevent unnecessary GUI updates while the system is changing the state of multiple devices.
I agree, I think a queue is probably the way to go [...]
I thought the same initially. However, grepping through cockpit code, it appears that updateExposureTime
is only called through events, the same that calls toggle
which is the only thing adding and removing cameras/lighs to those sets. So maybe some change how the events are called could prevent that race condition in a more natural way.
[...] but I don't have time to sort that today and I only have today to deal with all the updates.
That is fair enough. But the end result of such approach is that issues get fixed with hacks piled on top of hacks which in the long run lead to something unmanageable.
Looking at the code, the imager interface subscribes twice to the CAMERA_ENABLE event, with different functions. I think we probably just need to sequence the functions so that we only subscribe to the event once and this subscription fires both the toggle camera state and the updateExposureTimes (if required). I'll look at the logic and see if I can find a sensible refactoring.
Both LIGHT_SOURCE_ENABLE and CAMERA_ENABLE events cause calls to both the toggle function and the updateExposureTime function in the imager interface. It seems to me that we can just tag the call to updateExposureTime onto the end of the toggle function, so all the enable/disable is done and then the exposure times are updated which should solve the race condition.
Ian, your comments on the events made me remember that we already have an issue open for this. Closing this as duplicate of #603.
This error has been showing up a lot here. It doesn't seem to have caused problems but it should be fixed. I've got a small update that I'll link to fix it.