labscript-suite-temp / labscript_devices

Module containing labscript suite hardware compatibility, separate from the main programs. Device compatibility is implemented with a plugin architecture, for modularity and extensibility. Each file in this module contains a labscript device class, a BLACS tab class, a BLACS worker class and a runviewer parser class for a particular device. These implement functionality for the device which the programs in question call on when they encounter each device in user labscript code, hdf5 files, or connection tables.
0 stars 0 forks source link

Bug in NI_PCIe_6363 extract measurements #1

Open philipstarkey opened 10 years ago

philipstarkey commented 10 years ago

Original report (archived issue) by Philip Starkey (Bitbucket: pstarkey, GitHub: pstarkey).


Got a traceback today in BLACS

#!python

Exception in worker - Fri Sep 26, 10:15:40 :
Traceback (most recent call last):
  File "C:\pythonlib\labscript_devices\NI_PCIe_6363.py", line 524, in transition_to_manual
    self.extract_measurements(self.device_name)
  File "C:\pythonlib\labscript_devices\NI_PCIe_6363.py", line 575, in extract_measurements
    data['t'] = times
ValueError: could not broadcast input array from shape (23175) into shape (23000)

The logic in extract_measurements is quite complex and there is no debugging information logged so I don't really know why this crash has occurred. First time we've seen it in several years of regularly using this function.

Anyway, the length of data['t'] is set by the length of the extracted analog trace, and the times array is generated slightly differently. I'm guessing there is a subtle bug somewhere in how the indices of the full array (specifying the region to extract) are generated.

About the only useful information I have is that the acquisition rate was set to 1e3 in the labscript file.

philipstarkey commented 5 years ago

Original comment by Shaun Johnstone (Bitbucket: shjohnst, GitHub: shjohnst).


This is still occurring in the new unified “NI_DAQmx” device code.

Traceback (most recent call last):
  File "C:\labscript_suite\labscript_devices\NI_DAQmx\blacs_workers.py", line 592, in transition_to_manual
    self.extract_measurements(raw_data, waits_in_use)
  File "C:\labscript_suite\labscript_devices\NI_DAQmx\blacs_workers.py", line 652, in extract_measurements
    data['t'] = times
ValueError: could not broadcast input array from shape (1409738) into shape (1409500)

It seems to just randomly occur, and after resetting the device it is usually fine to continue running the same shot again. The last few days it’s been happening a few times a day, the only thing I’ve changed in that time is that the total length of the experiments that I’m running has grown, and so more data is being acquired for the duration.

philipstarkey commented 5 years ago

Original comment by Philip Starkey (Bitbucket: philipstarkey, GitHub: philipstarkey).


If it's happening semi-regularly now, could you add log lines that log the value of every time and index variable in that loop with the line it crashes on, as well as the shape of every array used in the loop (put the log lines just before the line that crashes)?

And then post the log that includes values for a shot that works and one that doesn't?

That should help to track down the issue I think

philipstarkey commented 5 years ago

Original comment by Shaun Johnstone (Bitbucket: shjohnst, GitHub: shjohnst).


It seems as though the raw_data array just doesn’t have enough in it? Though then why is it not throwing an index error at the point where we’re getting “values” by indexing “raw_data”?

Successful experiment:

2019-08-12 16:44:45,341 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: extract_measurements
2019-08-12 16:44:45,355 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: extracting measurement with connection: "b'ai9'", label "b'Quad_voltage'", t_start = 0.01 and t_end = 28.204750000000008
2019-08-12 16:44:45,355 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: extracting measurement with NO waits
2019-08-12 16:44:45,362 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: extract_measurements i_start = 500, t_i = 0.01000007, i_end=1410237, t_f = 28.20474007
2019-08-12 16:44:45,362 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: shape of times: (1409738,), shape of values: (1409738,), shape of data: (1409738,)
2019-08-12 16:44:45,362 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: shape of raw_data: 1420000
2019-08-12 16:44:45,386 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: extracting measurement with connection: "b'ai10'", label "b'Z_Bias_voltage'", t_start = 0.01 and t_end = 28.204750000000008
2019-08-12 16:44:45,386 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: extracting measurement with NO waits
2019-08-12 16:44:45,393 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: extract_measurements i_start = 500, t_i = 0.01000007, i_end=1410237, t_f = 28.20474007
2019-08-12 16:44:45,395 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: shape of times: (1409738,), shape of values: (1409738,), shape of data: (1409738,)
2019-08-12 16:44:45,395 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: shape of raw_data: 1420000
2019-08-12 16:44:45,417 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: extracting measurement with connection: "b'ai11'", label "b'MOT_Fluorescence'", t_start = 0.01 and t_end = 28.204750000000008
2019-08-12 16:44:45,417 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: extracting measurement with NO waits
2019-08-12 16:44:45,424 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: extract_measurements i_start = 500, t_i = 0.01000007, i_end=1410237, t_f = 28.20474007
2019-08-12 16:44:45,426 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: shape of times: (1409738,), shape of values: (1409738,), shape of data: (1409738,)
2019-08-12 16:44:45,426 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: shape of raw_data: 1420000
2019-08-12 16:44:45,449 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: extracting measurement with connection: "b'ai16'", label "b'IGC_voltage'", t_start = 0.01 and t_end = 28.204750000000008
2019-08-12 16:44:45,449 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: extracting measurement with NO waits
2019-08-12 16:44:45,456 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: extract_measurements i_start = 500, t_i = 0.01000007, i_end=1410237, t_f = 28.20474007
2019-08-12 16:44:45,459 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: shape of times: (1409738,), shape of values: (1409738,), shape of data: (1409738,)
2019-08-12 16:44:45,459 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: shape of raw_data: 1420000
2019-08-12 16:44:46,115 INFO BLACS.ni_pcie_6363_0_acquisition_worker.worker: data written, time taken: 0.7861306667327881s
2019-08-12 16:44:46,115 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: Job complete

Failed experiment:

2019-08-12 16:49:20,853 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: extract_measurements
2019-08-12 16:49:20,868 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: extracting measurement with connection: "b'ai9'", label "b'Quad_voltage'", t_start = 0.01 and t_end = 28.204750000000008
2019-08-12 16:49:20,868 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: extracting measurement with NO waits
2019-08-12 16:49:20,874 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: extract_measurements i_start = 500, t_i = 0.01000007, i_end=1410237, t_f = 28.20474007
2019-08-12 16:49:20,874 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: shape of times: (1409738,), shape of values: (1409500,), shape of data: (1409500,)
2019-08-12 16:49:20,874 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: shape of raw_data: 1410000
2019-08-12 16:49:20,884 ERROR BLACS.ni_pcie_6363_0_acquisition_worker.worker: Exception in job:
Traceback (most recent call last):
  File "C:\labscript_suite\labscript_devices\NI_DAQmx\blacs_workers.py", line 592, in transition_to_manual
    self.extract_measurements(raw_data, waits_in_use)
  File "C:\labscript_suite\labscript_devices\NI_DAQmx\blacs_workers.py", line 659, in extract_measurements
    data['t'] = times
ValueError: could not broadcast input array from shape (1409738) into shape (1409500)

2019-08-12 16:49:20,888 DEBUG BLACS.ni_pcie_6363_0_acquisition_worker.worker: Waiting for next job request
2019-08-12 16:49:20,894 INFO BLACS.ni_pcie_6363_0.mainloop: Worker reported exception during job
2019-08-12 16:49:20,896 DEBUG BLACS.ni_pcie_6363_0.mainloop: returning worker results to function _transition_to_manual
2019-08-12 16:49:20,914 DEBUG BLACS.queue_manager.thread: ni_pcie_6363_0 failed to transition to manual
2019-08-12 16:49:20,915 CRITICAL BLACS.ni_pcie_6363_0.mainloop: A fatal exception happened:
 Traceback (most recent call last):
  File "C:\labscript_suite\blacs\tab_base_classes.py", line 848, in mainloop
    next_yield = inmain(generator.send,results)
  File "C:\users\beclab\anaconda3\envs\labscript3.7\lib\site-packages\qtutils\invoke_in_main.py", line 103, in inmain
    return get_inmain_result(_in_main_later(fn, False, *args, **kwargs))
  File "C:\users\beclab\anaconda3\envs\labscript3.7\lib\site-packages\qtutils\invoke_in_main.py", line 164, in get_inmain_result
    _reraise(exception)
  File "C:\users\beclab\anaconda3\envs\labscript3.7\lib\site-packages\qtutils\invoke_in_main.py", line 36, in _reraise
    raise value.with_traceback(traceback)
  File "C:\users\beclab\anaconda3\envs\labscript3.7\lib\site-packages\qtutils\invoke_in_main.py", line 61, in event
    result = event.fn(*event.args, **event.kwargs)
  File "C:\labscript_suite\blacs\device_base_class.py", line 686, in transition_to_manual
    raise Exception('Could not transition to manual. You must restart this device to continue')
Exception: Could not transition to manual. You must restart this device to continue

2019-08-12 16:49:20,934 INFO BLACS.ni_pcie_6363_0.mainloop: Exiting

My logging code (starting line 627 of blacs_workers.py):

for connection, label, t_start, t_end, _, _, _ in acquisitions:
  self.logger.debug('extracting measurement with connection: "%s", label "%s", t_start = %s and t_end = %s'%(connection,label,t_start,t_end))
  connection = _ensure_str(connection)
  label = _ensure_str(label)
  if waits_in_use:
      self.logger.debug('extracting measurement with waits')
      # add durations from all waits that start prior to t_start of
      # acquisition
      t_start += wait_durations[(wait_times < t_start)].sum()
      # compare wait times to t_end to allow for waits during an
      # acquisition
      t_end += wait_durations[(wait_times < t_end)].sum()
  else:
      self.logger.debug('extracting measurement with NO waits')
  i_start = int(np.ceil(self.buffered_rate * (t_start - t0)))
  i_end = int(np.floor(self.buffered_rate * (t_end - t0)))
  # np.ceil does what we want above, but float errors can miss the
  # equality:
  if t0 + (i_start - 1) / self.buffered_rate - t_start > -2e-16:
      i_start -= 1
  # We want np.floor(x) to yield the largest integer < x (not <=):
  if t_end - t0 - i_end / self.buffered_rate < 2e-16:
      i_end -= 1
  t_i = t0 + i_start / self.buffered_rate
  t_f = t0 + i_end / self.buffered_rate
  times = np.linspace(t_i, t_f, i_end - i_start + 1, endpoint=True)
  values = raw_data[connection][i_start : i_end + 1]
  dtypes = [('t', np.float64), ('values', np.float32)]
  self.logger.debug('extract_measurements i_start = %s, t_i = %s, i_end=%s, t_f = %s'%(i_start, t_i, i_end, t_f))
  data = np.empty(len(values), dtype=dtypes)
  self.logger.debug('shape of times: %s, shape of values: %s, shape of data: %s'%(np.shape(times),np.shape(values),np.shape(data)))
  self.logger.debug('shape of raw_data: %s'%np.shape(raw_data))
  data['t'] = times
  data['values'] = values
  measurements.create_dataset(label, data=data)

Any thoughts?

philipstarkey commented 5 years ago

Original comment by Philip Starkey (Bitbucket: philipstarkey, GitHub: philipstarkey).


So I'm guessing that it could be due to a few things:

We should probably put something in the code that catches (and handles) this nicely anyway, but I would be interested to see if it was possible to reference the AI clock to 10 MHz and if it fixed the problem (because in theory, the acquisition should run significantly longer than the shot, but it seems like it's only just about the same length)

FYI: I think you can do this by configuring the external reference clock to be fed in via a PFI. I'm unclear if you can configure this via MAX permanently or if you need to add in some daqmx calls to set the source and rate. It is not well documented in the manual!

philipstarkey commented 5 years ago

Original comment by Philip Starkey (Bitbucket: philipstarkey, GitHub: philipstarkey).


It can be externally referenced! See edit above.

philipstarkey commented 5 years ago

Original comment by Chris Billington (Bitbucket: cbillington, GitHub: chrisjbillington).


Slices don't raise an IndexError if the final index is out of bounds, they just return what they got. Questionable design decision by Python maybe, but that's how it is, so that's probably why we don't see an error until later.

We can suppress the error just by truncating the time array to the length of however much data there is:

data['t'] = times[:len(values)]

And it's just the case that the data can't be trusted to have more accurate timing than what the card is capable of unless external referencing is implemented as Phil mentions. So the above is obviously a lie, but only because the measurements are inaccurate to that degree anyway, so it's not so bad.