Mismatch between data and descriptor on timeout

martin-gustafsson commented 3 years ago

If the digitizer times out on one iteration in a sweep, the data on file are absent rather than e.g. zeros or NaNs. Then the data descriptor and the data no longer match, and there is no indication of which data are missing. The shape of the dataset suggests that it's complete, but the last row is all zeros.

Error message during experiment, on iteration 1 of 220 of a parameter sweep: auspex-ERROR: 2021-04-08 19:38:44,651 ----> Digitizer myX6 timed out.

Resulting data and descriptor:

data, desc, _ = open_data(16, '.', "q1-raw_int", date="210408")
print(desc.axes[1].points.shape)
print(data.shape)
print(data[-2,:])
print(data[-1,:])

(220,)
(220, 1523)
[29.7535543 +26.75779667j 21.34156161+34.60425376j 16.25427695+36.56289939j ... 29.83350915+26.67109178j 29.79431326+26.64494398j 29.68503646+26.96899154j]
[0.+0.j 0.+0.j 0.+0.j ... 0.+0.j 0.+0.j 0.+0.j]
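A rough way to check an already-saved array for this symptom is to count the all-zero rows at its end. The sketch below uses a synthetic array rather than a real Auspex file, and `count_trailing_zero_rows` is a hypothetical helper, not part of Auspex. It assumes a genuine measurement row is never exactly zero, and it only reports how many rows were never written, not which sweep steps they belong to.

```python
import numpy as np

def count_trailing_zero_rows(data):
    """Count consecutive all-zero rows at the end of a 2D array."""
    empty = np.all(data == 0, axis=1)  # True for rows that were never written
    count = 0
    for is_empty in empty[::-1]:       # walk backwards from the last row
        if not is_empty:
            break
        count += 1
    return count

# Synthetic stand-in for a loaded dataset: 5 rows, the last 2 left unwritten.
demo = np.ones((5, 3), dtype=complex)
demo[-2:, :] = 0
n_missing = count_trailing_zero_rows(demo)
print(n_missing)
```

A nonzero count here would suggest that the same number of iterations timed out somewhere in the sweep.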

matthewware commented 3 years ago

This is a symptom of the deeper issues we're chatting about. The right solution is to fix the data pipeline so the card doesn't time out. But maybe we could explore pre-filling for small datasets.

grahamrow commented 3 years ago

Both arrays are preallocated, and the descriptor information is actually written before the filters run. I guess the question is what should the default behavior be?

I guess I'm confused by your first statement 'the data on file are absent rather than e.g. zeros or NaNs' — it looks like the data that didn't get recorded are indeed zeros. I'd agree that NaNs might be better.

martin-gustafsson commented 3 years ago

Graham, the issue is that the failed measurement was in the first step of the sweep, but it's the last row of data that's missing. The first row of the data array contains the data for the second step of the sweep.

martin-gustafsson commented 3 years ago

I'll try to be clearer: Let's say I have an experiment with one parameter sweep. Then there is effectively one counter for which element in the parameter array to use next and another counter for which row of data to write next. If everything works, those two counters remain in sync and the data array gets completely filled before the experiment terminates.

However, if one iteration in the middle of the sweep results in a digitizer timeout, the parameter step increments but the data counter does not. From that point onward, the data and the descriptor do not match, and when the sweep is finished, there is still one unwritten row at the end of the data array. If there are two timeouts at different points in the sweep, there will be two empty rows at the end of the data array.
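The two-counter behavior described above can be reproduced with a toy model. This is only an illustration under stated assumptions, not Auspex code: `run_sweep`, `timed_out_steps`, and `n_samples` are hypothetical names, and each "record" is just the parameter value repeated.

```python
import numpy as np

def run_sweep(params, timed_out_steps, n_samples=4):
    """Toy sweep with separate parameter and data-row counters (not Auspex)."""
    data = np.zeros((len(params), n_samples), dtype=complex)  # preallocated
    write_row = 0                                # data counter
    for step, value in enumerate(params):        # parameter counter
        if step in timed_out_steps:
            continue  # timeout: parameter counter advances, nothing is written
        data[write_row, :] = value               # stand-in for a real record
        write_row += 1
    return data

params = np.array([10.0, 20.0, 30.0, 40.0])
data = run_sweep(params, timed_out_steps={0})    # first step times out
print(data[:, 0].real)                           # row 0 holds the value for step 1
```

With the timeout on the first step, every row is shifted up by one relative to `params`, and the final row is left at its preallocated zeros.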

You don't see this so easily in sweeps with small linear increments of the parameter, but I was using a random index as a parameter to a sweep, and then it becomes apparent when the data row does not match the parameter value for which it was acquired.

Of course, we want a situation where timeouts never happen, but that seems hard to guarantee. In that case, it would be nice if the writer dumped a row's worth of NaNs into the data array when the error happens. If that's not possible, I think the severity of a digitizer timeout has to be increased from a Warning to an Error, since it corrupts the measured data.
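A minimal sketch of the NaN-row idea, again as a toy model rather than the actual Auspex writer (`run_sweep_nan_on_timeout` and `timed_out_steps` are hypothetical names): on a timeout the writer still consumes one row, filling it with NaNs, so the data counter stays in step with the parameter counter.

```python
import numpy as np

def run_sweep_nan_on_timeout(params, timed_out_steps, n_samples=4):
    """Toy sweep that writes a NaN row whenever a step times out (not Auspex)."""
    data = np.zeros((len(params), n_samples), dtype=complex)  # preallocated
    write_row = 0
    for step, value in enumerate(params):
        if step in timed_out_steps:
            data[write_row, :] = complex(np.nan, np.nan)  # mark the row as missing
        else:
            data[write_row, :] = value                    # stand-in for a real record
        write_row += 1                                    # always advance the data counter
    return data

params = np.array([10.0, 20.0, 30.0, 40.0])
data = run_sweep_nan_on_timeout(params, timed_out_steps={0})
missing = np.isnan(data).any(axis=1)   # boolean mask of steps that timed out
print(missing)
```

Rows then line up with `params` again, and the `missing` mask can be used to drop or re-measure the affected steps.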

BBN-Q / Auspex

Mismatch between data and descriptor on timeout #492