Closed: jlashner closed this 3 years ago
Interesting, did this also occur before the recent change to the crossbar config?
That's a good question, maybe it's an autoping timeout. I'm not sure, I'd have to switch back and reboot crossbar to test.
Seems kind of weird that it's exiting exactly when it's able to reconnect to the G3NetworkSender. Btw, the "G3Reader connection established" messages occur when the smurf-streamer is shut down, and the "Writing to file..." message at 2:03:55 is when it starts back up.
It might be a valuable test. My thought was that something in the reconnection might be blocking the main event loop when it shouldn't, so the ping doesn't go out before the timeout expires. Is the timeout set to 5 seconds?
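To illustrate the kind of failure I have in mind, here's a minimal sketch. It is not the recorder's code: it uses plain `asyncio` from the standard library as a stand-in for the agent's event loop, and `heartbeat`, `run`, and `max_ping_gap` are hypothetical names made up for this example. A periodic "ping" task records how long it had to wait between pings, while a simulated reconnect either blocks the loop or is pushed to a worker thread.

```python
import asyncio
import time


async def heartbeat(interval, gaps, stop):
    """Stand-in for a periodic ping: record the time between consecutive pings."""
    last = time.monotonic()
    while not stop.is_set():
        await asyncio.sleep(interval)
        now = time.monotonic()
        gaps.append(now - last)
        last = now


async def run(blocking, interval=0.05, work=0.3):
    """Run the heartbeat alongside a simulated reconnect lasting `work` seconds."""
    gaps = []
    stop = asyncio.Event()
    task = asyncio.create_task(heartbeat(interval, gaps, stop))
    await asyncio.sleep(interval * 2)  # let a couple of pings go out first

    if blocking:
        # Hypothetical bug: a synchronous call made directly on the event loop.
        # Nothing else, including the heartbeat, can run until it returns.
        time.sleep(work)
    else:
        # Same work handed to a worker thread, so the loop stays responsive.
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, time.sleep, work)

    await asyncio.sleep(interval * 2)
    stop.set()
    await task
    return max(gaps)


def max_ping_gap(blocking):
    """Return the longest observed gap between pings, in seconds."""
    return asyncio.run(run(blocking))


if __name__ == "__main__":
    print("blocking reconnect, max ping gap:", round(max_ping_gap(True), 3))
    print("threaded reconnect, max ping gap:", round(max_ping_gap(False), 3))
```

In the blocking case the longest gap grows to at least the length of the blocking call, so if that ever exceeds the ping timeout the connection would be dropped even though nothing is actually wrong with the network. Whether that's what is happening in the recorder is still just a guess.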
It's also a bit strange there are 12 "G3Reader connection established" messages.
Just to follow up on this: after talking on the phone, we realized this system is running OCS version 0.6.0+1.g569b428. The system should be updated to the latest version, which includes crossbar reconnection, and monitored for this event.
This continues to happen when the streamer is restarted even with the newest version of socs, so I'm reopening.
I was also able to test this further at UCSD. It looks like now, if the recorder is connected to an existing smurf-streamer and the smurf-streamer then restarts, the smurf-recorder briefly loses its connection to crossbar after the streamer comes back online. This doesn't affect the main processing thread, so I think it's harmless. I have also verified that the agent can now reconnect to crossbar after a few seconds without interrupting the data-processing operation, thanks to https://github.com/simonsobs/ocs/pull/180, so I'm going to close this issue.
Looks like the smurf-recorder at UCSD is shutting down whenever I run jackhammer soft_reset. Strangely it seems like it is handling the disconnection/reconnection well, so I'm not sure exactly why it's crashing... Here's my full output