msr-consulting / exscalabar_server

Repository for the EXSCALABAR server.
http://www.msrconsults.com/ukmet-gh/exscalabar

Saving problem on restart #180

Closed: datid closed this issue 7 years ago

datid commented 7 years ago

Though the save mode is set to True, it sometimes fails to save data.
This could be a registration problem, possibly caused by a shutdown issue.

Not all the Meerstetters seem to stop properly:

16:38:02.93 [mTEC] New output selection for controller ptdBlue.  Value is now Static OFF.
16:38:02.99 [ct] Removing enqueuer with ID .
16:38:03.05 [instr] Instrument "eCRDS Library" shutting down.
16:38:03.11 [fpga] FPGA session successfully closed.
16:38:03.17 [ptdBlue] Device stopping.
16:38:03.22 [mTEC] New output selection for controller HighTEC.  Value is now Static OFF.
16:38:03.28 [ct] Removing enqueuer with ID pRedDry.
16:38:03.35 [HighTEC] Device stopping.
16:38:03.43 [mTEC] New output selection for controller ptdRed.  Value is now Static OFF.
16:38:03.49 [ct] Removing enqueuer with ID mfcMedRH.
16:38:03.54 [ptdRed] Device stopping.
16:38:03.60 [mTEC] New output selection for controller pBlue.  Value is now Static OFF.
16:38:03.66 [ct] Removing enqueuer with ID pInlet.
16:38:03.72 [pBlue] Device stopping.
16:38:03.78 [mTEC] New output selection for controller pGreen.  Value is now Static OFF.
16:38:03.83 [ct] Removing enqueuer with ID mfcRedTD.
16:38:03.89 [ERROR] 1556: ??? in Session - Root.lvclass:ROOT - Release Session.vi:840005->Meerstetter Lib.lvlib:Meerstetter TEC.lvclass:Stop Device.vi:5690006->DAQ Device.lvlib:Device.lvclass:Stop Core.vi:6180013->Actor Framework.lvlib:Actor.lvclass:Actor Core.vi:5880021->DAQ Device.lvlib:Device.lvclass:Actor Core.vi:5880006->Actor Framework.lvlib:Actor.lvclass:Actor.vi:6640013->Actor Framework.lvlib:Actor.lvclass:Actor.vi.ACBRProxyCaller.1020000E

lo-co commented 7 years ago

So, the more interesting thing would be to see the restart after this. The 1556 error indicates that we have closed something before we intended to. I am not certain that this is related to the Meerstetters specifically, but it gives us a starting point. The best way to approach this is to stop and start the program until we hit this issue. Could we leave the Meerstetters on so I can look at this?
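To check a shutdown log for controllers that never stopped, something like the following Python sketch could help. It assumes only the two line shapes quoted in the opening comment (`[mTEC] New output selection for controller X.` and `[X] Device stopping.`). The function name and the shortened sample log are made up for illustration, and the `[ERROR]` line is abbreviated.

```python
import re

# Patterns for the two log line shapes quoted in the issue.
OFF_RE = re.compile(r"\[mTEC\] New output selection for controller (\w+)\.")
STOP_RE = re.compile(r"\[(\w+)\] Device stopping\.")


def controllers_not_stopped(log_text):
    """Return controllers switched off that never logged 'Device stopping'."""
    switched_off = []
    stopped = set()
    for line in log_text.splitlines():
        match = OFF_RE.search(line)
        if match:
            switched_off.append(match.group(1))
            continue
        match = STOP_RE.search(line)
        if match:
            stopped.add(match.group(1))
    # Preserve log order so the result points at where shutdown went wrong.
    return [name for name in switched_off if name not in stopped]


# Shortened sample; the [ERROR] line is abbreviated from the original log.
sample = """\
16:38:03.60 [mTEC] New output selection for controller pBlue.  Value is now Static OFF.
16:38:03.72 [pBlue] Device stopping.
16:38:03.78 [mTEC] New output selection for controller pGreen.  Value is now Static OFF.
16:38:03.89 [ERROR] 1556: ??? in Session - Root.lvclass:ROOT - Release Session.vi
"""

missing = controllers_not_stopped(sample)
print(missing)
```

Run against the full log above, this kind of check would flag any controller whose "Static OFF" line is not followed by its own "Device stopping" line.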

lo-co commented 7 years ago

Closed lo-co/exscalabar#129, which indicated that the UI was the issue.

lo-co commented 7 years ago

This appears to be caused by the serial session closing. Not sure what is going on, but if the serial connection does not close down properly and you just restart, then you can expect things to hang. The question is why the TECs are not kicked out at startup, which would stop them from keeping the system from coming up properly.

lo-co commented 7 years ago

Adding message to Serial Session::clear Session Data

[serial] Closing port associated with session {session_name}.

The error is not fed into this message, so we will know if an error is thrown when an attempt is made to close a session.

lo-co commented 7 years ago

Added a message in Serial Session::Configure Session indicating which port a session is being started for. The message is

[serial] Opening serial port for session {session_name}.

We don't want to handle the serial error here if there is one, as this will cause the system to kick off the actor.
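The two `[serial]` messages could be sketched in Python roughly as follows. This only illustrates the logging pattern and is not the LabVIEW code: `SerialSession`, `FakePort`, and the `log` list are hypothetical stand-ins. The sketch assumes the closing message is written before the close is attempted and is not gated on the error state, so a failed close still leaves a trace in the log.

```python
class FakePort:
    """Hypothetical stand-in for a serial port; close() can be made to fail."""

    def __init__(self, fail_on_close=False):
        self.fail_on_close = fail_on_close
        self.is_open = False

    def open(self):
        self.is_open = True

    def close(self):
        if self.fail_on_close:
            raise OSError("port did not close cleanly")
        self.is_open = False


class SerialSession:
    """Hypothetical session wrapper that logs on open and on close."""

    def __init__(self, session_name, port, log):
        self.session_name = session_name
        self.port = port
        self.log = log

    def configure_session(self):
        self.log.append(
            f"[serial] Opening serial port for session {self.session_name}."
        )
        # Errors from open() are not handled here; they propagate to the caller.
        self.port.open()

    def clear_session_data(self):
        # Logged unconditionally, before the close is attempted.
        self.log.append(
            f"[serial] Closing port associated with session {self.session_name}."
        )
        self.port.close()


log = []
session = SerialSession("HighTEC", FakePort(fail_on_close=True), log)
session.configure_session()
try:
    session.clear_session_data()
except OSError as exc:
    log.append(f"[ERROR] {exc}")
```

With this arrangement the closing message and the error appear as separate log lines, so a bad close can be traced to a specific session.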

lo-co commented 7 years ago

Something looks backwards in the logs here: I get a "Device stopping" message, which would be generated by the actor, and only then do I get the messages associated with setting up the TECs for shutdown. I will investigate further tomorrow.

Regardless, it looks like the session queue is killed prior to the final TEC shutdown code being called.
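A minimal Python sketch of why this ordering matters, under the assumption that the final TEC command has to travel through the same session queue that is being released. `SessionQueue` and both stop functions are hypothetical names used only for illustration.

```python
class SessionQueue:
    """Hypothetical stand-in for the session's command queue."""

    def __init__(self):
        self.released = False
        self.sent = []

    def send(self, command):
        if self.released:
            # Loosely analogous to using a reference after it was released.
            raise RuntimeError(f"queue already released, cannot send {command!r}")
        self.sent.append(command)

    def release(self):
        self.released = True


def stop_wrong_order(queue):
    queue.release()           # the queue is killed first...
    queue.send("Static OFF")  # ...so the final TEC command fails


def stop_right_order(queue):
    queue.send("Static OFF")  # the final TEC command goes out first
    queue.release()           # only then is the queue released


good = SessionQueue()
stop_right_order(good)

bad = SessionQueue()
try:
    stop_wrong_order(bad)
    error = None
except RuntimeError as exc:
    error = str(exc)
```

In the wrong order, the shutdown command is never delivered and the caller sees an error instead, which is the same shape as the failure in the logs.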

lo-co commented 7 years ago

Able to reproduce the behavior by starting the program and then cycling the instrument on and off via the UI several times (3). Each time, I attempted to save a file. It failed on the third or fourth attempt.
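The manual reproduction could be automated along these lines. This is a sketch only: `start`, `stop`, and `try_save` are hypothetical callables that would have to be wired to the real UI or web service, and `FakeInstrument` exists just to make the example runnable.

```python
def find_first_failed_cycle(start, stop, try_save, max_cycles=10):
    """Cycle start/stop and return the 1-based cycle whose save failed, or None."""
    for cycle in range(1, max_cycles + 1):
        start()
        saved = try_save()
        stop()
        if not saved:
            return cycle
    return None


class FakeInstrument:
    """Hypothetical instrument whose save fails on one chosen cycle."""

    def __init__(self, fail_on_cycle):
        self.fail_on_cycle = fail_on_cycle
        self.cycles = 0
        self.running = False

    def start(self):
        self.cycles += 1
        self.running = True

    def stop(self):
        self.running = False

    def try_save(self):
        return self.running and self.cycles != self.fail_on_cycle


flaky = FakeInstrument(fail_on_cycle=3)
first_failure = find_first_failed_cycle(flaky.start, flaky.stop, flaky.try_save)

healthy = FakeInstrument(fail_on_cycle=None)
no_failure = find_first_failed_cycle(healthy.start, healthy.stop, healthy.try_save)
```

Returning the failing cycle number, rather than a plain pass/fail, makes it easier to compare runs before and after a fix.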

lo-co commented 7 years ago

So, it is not clear whether this is a device registration issue or an actual file saving issue. When the save button is hit, the user should get a message that looks like

[file] New file opened at u:\exscalabar\Data\test_20170630_200908.txt.

When it doesn't start saving, you don't get this message.

The message

[ctl] Save state set to {boolean}.

is generated by the Save Data MSG::Do in the Controller library and is sent by the file save web service VI. What this will do is (1) toggle the save state in the CVT and (2) send a message to both the main File Actor and the mirrored File Actor if there is one.
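The two steps can be sketched in Python as follows. This illustrates the message flow only: `FileActor`, `do_save_data`, the dictionary used as the CVT, and the file path are all hypothetical. The sketch sets the state to the requested boolean, matching the `{boolean}` in the log message; whether the real code toggles or sets the value is not shown here.

```python
class FileActor:
    """Hypothetical stand-in for a file actor that logs when saving starts."""

    def __init__(self, name, log):
        self.name = name
        self.log = log
        self.saving = False

    def handle_save(self, save, path):
        self.saving = save
        if save:
            self.log.append(f"[file] New file opened at {path}.")


def do_save_data(cvt, save, main_actor, mirror_actor, log, path):
    """Sketch of the two steps described for Save Data MSG::Do."""
    cvt["save"] = save                    # (1) update the save state in the CVT
    log.append(f"[ctl] Save state set to {save}.")
    main_actor.handle_save(save, path)    # (2) notify the main file actor...
    if mirror_actor is not None:          # ...and the mirror, if there is one
        mirror_actor.handle_save(save, path)


log = []
cvt = {"save": False}
main = FileActor("main", log)
do_save_data(cvt, True, main, None, log, "data/test.txt")  # path is made up
print(log)
```

Seen this way, a `[ctl]` line without a following `[file]` line means the message was sent but the file actor never acted on it, which is the symptom being chased here.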

lo-co commented 7 years ago

Sooo...totally broke this with a restart. Now it doesn't even give the message.

Cycling stop-start got the save state back and a file writing (although no mirror).

lo-co commented 7 years ago

Added a Sys Log hook in the serial session. It is opened in the configuration routine and closed in the clear Session Data routine. Now there are some new messages that have the tag [serial].

Still getting errors at the closing of each of the serial sessions. Not sure what is going on here, but if they are not closing properly then this could be a big issue.

lo-co commented 7 years ago

Appears to be an issue with the shutdown sequence of both the MFCs and the Meerstetters. The Serial Session property is closed with the first call to close the session.

Another issue that I found was that the Vaisala code never called a close on the serial port. This may have been the real issue, but I cannot confirm this right now. I have changed the method Vaisala::Close to the override method Vaisala::Stop Device.
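Why the rename matters can be shown with a small Python analogy. This is not the LabVIEW code: the class names are hypothetical, and the sketch assumes the framework's stop path only ever calls the overridable stop method, so a differently named method is never reached.

```python
class Device:
    """Hypothetical base class; the stop path only ever calls stop_device()."""

    def stop_core(self):
        self.stop_device()

    def stop_device(self):
        pass  # default: nothing to release


class VaisalaBefore(Device):
    def __init__(self):
        self.port_open = True

    def close(self):  # not an override, so stop_core() never reaches it
        self.port_open = False


class VaisalaAfter(Device):
    def __init__(self):
        self.port_open = True

    def stop_device(self):  # override, so stop_core() calls it on shutdown
        self.port_open = False


before = VaisalaBefore()
before.stop_core()

after = VaisalaAfter()
after.stop_core()

print(before.port_open, after.port_open)
```

In the "before" case the port is left open after shutdown, which would leave a stale serial session for the next start.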

lo-co commented 7 years ago

I think the above fix worked. Stopped and started 10x with no fails and wrote to file every time. Will be closing this issue with the next check-in. Still having an unrelated issue with the MFCs and Meerstetters.