HarrisonSteel / ChiBio

ChiBio primary operating system.

Is there a way to avoid crashing a whole experiment when a "Failed to recover multiplexer on device" error happens? #16

Open mgalardini opened 1 month ago

mgalardini commented 1 month ago

We have had a few instances in which an experiment running multiple reactors was interrupted because one reactor (a different one each time) caused a "Failed to recover multiplexer on device" error. After 20 tries in the space of a couple of seconds, this leads to the whole app crashing and restarting. When the experiment is finally restarted, the offending reactor works fine.

I am not familiar enough with how the code handles communication with this "multiplexer", but it seems to me that there could be ways to "revive" the connection, or at the very least to drop one reactor while the others continue to collect data.

Would you happen to have some suggestions here about what to do to solve or mitigate the issue?

Thanks a lot!

HarrisonSteel commented 1 month ago

Hello. The multiplexer sits upstream of the reactors themselves, so if the connection to it is failing there may be little way around it. That is, if you can't disconnect the multiplexer from offending reactor X, you will struggle to reset it to connect to working reactor Y, since disconnecting it requires that it work! In later devices we added a hardware reset chip on the board to enable precisely this. If you have that chip (likely if you purchased from Labmaker in the last couple of years), you might be able to adjust the code so that it cycles the multiplexer and then, for the duration of the experiment, does not try to re-connect to the offending device. But this is a bit dodgy...

A better (hardware) fix would be to disconnect the top liquid level sensing ring. At least 80% of the time this is the offender, and it may be quite trigger-happy if you have high humidity or lids that aren't perfectly sealed (are you using rubber o-rings to improve that seal?). This top moisture sensor can be disconnected by removing the left and right sides of the device, then disconnecting the two wires that run from the middle level to the underside of the top one.
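For anyone wanting to try the software workaround described above (cycle the multiplexer, then skip the offending device for the rest of the experiment), here is a rough sketch of the control flow. It is not taken from the ChiBio code: `ReactorPool`, `MuxError`, `select_channel`, `reset_mux`, and the reactor labels are all hypothetical names, and the hardware calls are passed in as plain functions so the logic can be tried without a device. It also assumes the board has the hardware reset chip.

```python
import time


class MuxError(Exception):
    """Hypothetical error: the multiplexer could not switch to a channel."""


class ReactorPool:
    """Sketch only: retry a channel, then exclude it instead of crashing."""

    def __init__(self, reactors, select_channel, reset_mux,
                 max_tries=20, delay_s=0.1):
        self.reactors = list(reactors)
        self.excluded = set()          # reactors skipped for this experiment
        self._select = select_channel  # assumed: raises MuxError on failure
        self._reset = reset_mux        # assumed: toggles the hardware reset chip
        self.max_tries = max_tries
        self.delay_s = delay_s

    def connect(self, reactor):
        """Return True if the channel was selected, False if it is excluded."""
        if reactor in self.excluded:
            return False
        for _ in range(self.max_tries):
            try:
                self._select(reactor)
                return True
            except MuxError:
                # Cycle the multiplexer and give the bus a moment to settle.
                self._reset()
                time.sleep(self.delay_s)
        # Give up on this reactor only, and leave the multiplexer in a
        # freshly reset state so the remaining reactors stay reachable.
        self.excluded.add(reactor)
        self._reset()
        return False

    def measure_all(self, measure):
        """Measure every reactor that can still be reached."""
        results = {}
        for reactor in self.reactors:
            if self.connect(reactor):
                results[reactor] = measure(reactor)
        return results


def demo():
    """Simulate three reactors where 'reactor_2' always fails to connect."""
    resets = []

    def select_channel(reactor):
        if reactor == "reactor_2":
            raise MuxError(f"cannot select {reactor}")

    def reset_mux():
        resets.append(1)

    pool = ReactorPool(["reactor_0", "reactor_1", "reactor_2"],
                       select_channel, reset_mux, max_tries=3, delay_s=0)
    first = pool.measure_all(lambda r: f"ok {r}")
    second = pool.measure_all(lambda r: f"ok {r}")  # reactor_2 is skipped
    return first, second, pool.excluded, len(resets)
```

In the simulated run, the failing reactor is retried, excluded, and never touched again, while the other two keep returning data. Whether this helps on real hardware depends on the reset actually freeing the multiplexer from the offending channel, which is the "dodgy" part.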

mgalardini commented 1 month ago

Thanks for the suggestions! We will try an o-ring and see if this mitigates the issue. I don't think we would be able to implement a way to restart crashed experiments, but if you happen to do so we would be happy to give it a try :)