msr-consulting / exscalabar_server

Repository for the EXSCALABAR server.
http://www.msrconsults.com/ukmet-gh/exscalabar

Data file time writing issue #173

Closed JustinLangridge closed 7 years ago

JustinLangridge commented 7 years ago

This is an intermittent problem whereby the time stamp in the data file either skips seconds or repeats the same time stamp twice.

It is not clear whether this problem only affects the time stamp or whether it also affects the actual data.

For example see this file: test_20170116_150819.txt

We have:

missing 0.0 seconds of data at 2017-01-16 15:03:29.640000
missing 2.0 seconds of data at 2017-01-16 15:03:31.640000
missing 0.0 seconds of data at 2017-01-16 15:03:33.640000
missing 2.0 seconds of data at 2017-01-16 15:03:33.640000
missing 0.0 seconds of data at 2017-01-16 15:03:35.640000
missing 2.0 seconds of data at 2017-01-16 15:03:35.640000
missing 0.0 seconds of data at 2017-01-16 15:03:37.640000
missing 2.0 seconds of data at 2017-01-16 15:03:37.640000
missing 0.0 seconds of data at 2017-01-16 15:03:40.640000
missing 2.0 seconds of data at 2017-01-16 15:03:44.640000
missing 0.0 seconds of data at 2017-01-16 15:03:46.640000
missing 2.0 seconds of data at 2017-01-16 15:03:46.640000
missing 0.0 seconds of data at 2017-01-16 15:03:48.640000
missing 2.0 seconds of data at 2017-01-16 15:03:51.640000
missing 0.0 seconds of data at 2017-01-16 15:03:53.640000
missing 2.0 seconds of data at 2017-01-16 15:03:53.640000
missing 0.0 seconds of data at 2017-01-16 15:03:58.640000
missing 2.0 seconds of data at 2017-01-16 15:03:59.640000
missing 0.0 seconds of data at 2017-01-16 15:04:01.640000
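For anyone wanting to run a similar check on other files, here is a minimal Python sketch that scans a column of time stamps for repeats and gaps, assuming a nominal 1 s spacing. The function name `find_timing_faults` and the five sample stamps are made up for illustration (they only mimic the pattern above), and this is not the script that produced the report above, so its output format differs.

```python
from datetime import datetime, timedelta

def find_timing_faults(stamps, period=timedelta(seconds=1)):
    """Return (kind, timestamp, dt_seconds) for every row not spaced one period apart."""
    faults = []
    for prev, cur in zip(stamps, stamps[1:]):
        dt = cur - prev
        if dt == period:
            continue  # normal spacing, nothing to report
        if dt == timedelta(0):
            kind = "duplicate"   # same stamp written twice
        elif dt > period:
            kind = "gap"         # one or more stamps never written
        else:
            kind = "irregular"   # any other spacing (short or negative step)
        faults.append((kind, cur, dt.total_seconds()))
    return faults

# Hypothetical sample data, not taken from test_20170116_150819.txt.
fmt = "%Y-%m-%d %H:%M:%S.%f"
raw = [
    "2017-01-16 15:03:28.640000",
    "2017-01-16 15:03:29.640000",
    "2017-01-16 15:03:29.640000",  # repeated stamp
    "2017-01-16 15:03:31.640000",  # 15:03:30 was never written
    "2017-01-16 15:03:32.640000",
]
stamps = [datetime.strptime(s, fmt) for s in raw]
faults = find_timing_faults(stamps)
for kind, ts, dt in faults:
    print(f"{kind}: dt={dt} s at {ts}")
```

A repeat followed immediately by a 2 s step, as in this sample, is what you would expect if only the time stamp is lagging while one row is still written per second.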

Please could this issue be moved up the priority list? It would be good to sort it out before our next deployment (coming up in 3 weeks' time).

datid commented 7 years ago

Is this caused by having different timed loops producing and consuming the data? If it is, it seems like quite a fundamental issue. I am not sure whether just tying them together would be enough to fix it.

Well, I have added some dodgy code to link the timing of Common.vi and Send Write Main MSG.vi so that the file writing is triggered 0.5 seconds after the time stamp is created. It is a horrible fudge.

The data superficially looks better, but this isn't the way to do it.

test_20170201_162007.txt

Am I missing something? It seems that the Update Data messages put the data in the DVR, which basically gets written to file approximately once a second. There is no check that all the data for a given second are there, or that all the data in the DVR are from the same second.

lo-co commented 7 years ago

OK - I see a comment in the Controller::Send Write Main MSG that says

This is very dodgy don't like it at all, file writing should be triggered when the data is ready not arbitary timing...

So, first off, there is nothing "arbitrary" about this timing and it is by no means "dodgy". We trigger the system to write data at 1 Hz intervals. You have to remember, there is always some data ready to be written. If you want to trigger a write off of a single data source, then you have to ask - which one? Both the CRDS and PAS acquire at 1 Hz, but they are not synchronous AND never will be. The reason for this is simple - we have some indeterminate periods where we are changing things in the system to allow conditions to come to steady state (such as with the PAS speaker) or to handle errors (as when we lose the lock on the CRD laser). A DVR is memory and that memory is always occupied (never cleared) - we have absolutely no reason to check to see if the data is recent because it is likely at some point that IT WON'T BE. DO NOT TIE THE MESSAGE GENERATOR TO SOMETHING ARBITRARY!!!!! As I said before, the writing is NOT arbitrary and is very intentional.
If you look closely at the first file that Justin provided you will notice that what is likely happening is that the data produced by Controller::Common is duplicated but that is not the case for the rest of the data. Now, we have a couple of reasonable options:

I will look into this and see what is going on.
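To make the "always something to write" point concrete, here is a rough Python analogue of the DVR behaviour described above. It is only a sketch: `LatestValueStore`, the source keys and the numbers are hypothetical, and the real implementation is LabVIEW.

```python
import threading

class LatestValueStore:
    """Holds the most recent value per source; values are overwritten, never cleared."""

    def __init__(self):
        self._lock = threading.Lock()
        self._values = {}

    def update(self, source, value):
        # Called by each acquisition loop whenever it has a new reading.
        with self._lock:
            self._values[source] = value

    def snapshot(self):
        # Called by the 1 Hz writer; returns whatever is stored right now.
        with self._lock:
            return dict(self._values)

store = LatestValueStore()
store.update("crds", 1.0)
store.update("pas", 10.0)
row1 = store.snapshot()     # first 1 Hz write

store.update("crds", 1.1)   # CRDS refreshed; PAS still settling, so no update
row2 = store.snapshot()     # second 1 Hz write repeats the old PAS value
print(row1, row2)
```

The second row carries a fresh CRDS value and a stale PAS value, which is the intended behaviour: the writer never waits on any one source.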

lo-co commented 7 years ago

So, @datid - looked at your solution. I am not sure why you think this solution is dodgy. This is the third solution that I put up there. Seems just fine to me; what's the problem?

lo-co commented 7 years ago

OK...cleaned up the Controller::Common and can't figure out why that would have backed up. Likely an issue that we were having is jitter, i.e. two loops starting around the same time and small differences in execution rates causing the timestamp not to update (this is why you see dt=0 in some cases). I think that your solution is perfect (offset the two timed loops). This will guarantee correct spacing. I am going to close this for now. Raise it again if this becomes an issue.
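The jitter explanation can be checked with a small simulation. This is a sketch under assumed numbers, not a model of the actual VIs: both loops run at 1 Hz, each iteration starts up to ±50 ms off its nominal time, and `simulate`, `offset` and `jitter` are illustrative names.

```python
import random

def simulate(offset, jitter=0.05, n=200, seed=0):
    """Return the spacing (s) between consecutive time stamps seen by the writer.

    Producer loop: sets the shared stamp to k at time k + noise.
    Writer loop:   reads the shared stamp at time k + offset + noise.
    """
    rng = random.Random(seed)

    def noise():
        return rng.uniform(-jitter, jitter)

    produced = [(k + noise(), k) for k in range(n)]  # (wall time, stamp value)
    written = []
    for k in range(1, n):
        t_write = k + offset + noise()
        # The writer sees the latest stamp produced at or before the write instant.
        written.append(max(stamp for t, stamp in produced if t <= t_write))
    return [b - a for a, b in zip(written, written[1:])]

aligned = simulate(offset=0.0)   # both loops start at about the same instant
shifted = simulate(offset=0.5)   # writer delayed by half a period

print("aligned spacings:", sorted(set(aligned)))
print("shifted spacings:", sorted(set(shifted)))
```

With no offset, the writer sometimes runs just before the stamp is refreshed, giving dt=0 on that row and dt=2 once it catches up. With a 0.5 s offset the spacing stays at 1 s for as long as the combined jitter is below half a period.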