TheDeanLab / navigate

navigate - open source light-sheet microscope controls
https://thedeanlab.github.io/navigate/
Other
23 stars 6 forks source link

Write fails when we hit maximum disk space #622

Open zacsimile opened 12 months ago

zacsimile commented 12 months ago

I know we check for disk space before starting an acquisition (#507), but it seems this is insufficient. An example of where this approach fails is copying data to the saving directory while the acquisition is running. In this case, the available space shrinks mid-acquisition with no alert. I think we need to wrap the write call in a try/except block and gracefully close when we are out of disk space. Unfortunately, closing requires a slight amount of disk space, so this may fail too. Still not quite sure how to handle this, but I think it's work just giving the try/except a go. Perhaps another approach would be to "look ahead" one frame and cut off acquisition early. But, this sort of prediction may get us into trouble in the exact same example I provided above.

Traceback (most recent call last):
  File "h5py\_objects.pyx", line 201, in h5py._objects.ObjectID.__dealloc__
OSError: [Errno 28] Driver write request failed (file write failed: time = Tue Sep 19 18:46:15 2023
, filename = 'D:\kartik\091923braintest\NA\cfos-AF647_autofluorescence488\2023-09-19\Cell_001\CH00_000000.h5', file descriptor = 4, errno = 28, error message = 'No space left on device', buf = 00000195AF3D3798, total write size = 4096, bytes this sub-write = 4096, bytes actually written = 18446744073709551615, offset = 146627814192)
Exception ignored in: 'h5py._objects.ObjectID.__dealloc__'
Traceback (most recent call last):
  File "h5py\_objects.pyx", line 201, in h5py._objects.ObjectID.__dealloc__
OSError: [Errno 28] Driver write request failed (file write failed: time = Tue Sep 19 18:46:15 2023
, filename = 'D:\kartik\091923braintest\NA\cfos-AF647_autofluorescence488\2023-09-19\Cell_001\CH00_000000.h5', file descriptor = 4, errno = 28, error message = 'No space left on device', buf = 00000195AF3D3798, total write size = 4096, bytes this sub-write = 4096, bytes actually written = 18446744073709551615, offset = 146627814192)
Exception in thread z-stack Data:
Traceback (most recent call last):
  File "c:\users\dean-lab\desktop\aslm\src\aslm\model\features\image_writer.py", line 251, in save_image
    self.data_source.write(
  File "c:\users\dean-lab\desktop\aslm\src\aslm\model\data_sources\bdv_data_source.py", line 143, in write
    self.image[dataset_name][zs, ...] = data[::dy, ::dx].astype(np.int16)
  File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\site-packages\h5py\_hl\dataset.py", line 999, in __setitem__
    self.id.write(mspace, fspace, val, mtype, dxpl=self._dxpl)
  File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py\h5d.pyx", line 283, in h5py.h5d.DatasetID.write
  File "h5py\_proxy.pyx", line 114, in h5py._proxy.dset_rw
OSError: [Errno 28] Can't write data (file write failed: time = Tue Sep 19 18:46:15 2023
, filename = 'D:\kartik\091923braintest\NA\cfos-AF647_autofluorescence488\2023-09-19\Cell_001\CH00_000000.h5', file descriptor = 4, errno = 28, error message = 'No space left on device', buf = 00000195AF3D1778, total write size = 4096, bytes this sub-write = 4096, bytes actually written = 18446744073709551615, offset = 146628789040)

Traceback (most recent call last):
  File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\threading.py", line 973, in _bootstrap_inner
    self.run()
  File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\threading.py", line 910, in run
    self._target(*self._args, **self._kwargs)
  File "c:\users\dean-lab\desktop\aslm\src\aslm\model\model.py", line 775, in run_data_process
    data_func(frame_ids)
  File "c:\users\dean-lab\desktop\aslm\src\aslm\model\features\image_writer.py", line 283, in save_image
    self.close()
  File "c:\users\dean-lab\desktop\aslm\src\aslm\model\features\image_writer.py", line 346, in close
    self.data_source.close()
  File "c:\users\dean-lab\desktop\aslm\src\aslm\model\data_sources\bdv_data_source.py", line 259, in close
    self.image.close()
  File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\site-packages\h5py\_hl\files.py", line 585, in close
    self.id._close_open_objects(h5f.OBJ_LOCAL | ~h5f.OBJ_FILE)
  File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py\h5f.pyx", line 360, in h5py.h5f.FileID._close_open_objects
OSError: [Errno 28] Driver write request failed (file write failed: time = Tue Sep 19 18:46:15 2023
, filename = 'D:\kartik\091923braintest\NA\cfos-AF647_autofluorescence488\2023-09-19\Cell_001\CH00_000000.h5', file descriptor = 4, errno = 28, error message = 'No space left on device', buf = 00000195AF3CA708, total write size = 4096, bytes this sub-write = 4096, bytes actually written = 18446744073709551615, offset = 146628793136)
camera thread exception happened!  Traceback (most recent call last):
  File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\multiprocessing\connection.py", line 317, in _recv_bytes
    nread, err = ov.GetOverlappedResult(True)
BrokenPipeError: [WinError 109] The pipe has been ended

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\users\dean-lab\desktop\aslm\src\aslm\controller\thread_pool.py", line 175, in func
    target(*args, **kwargs)
  File "c:\users\dean-lab\desktop\aslm\src\aslm\controller\controller.py", line 821, in capture_image
    image_id = self.show_img_pipe.recv()
  File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\multiprocessing\connection.py", line 255, in recv
    buf = self._recv_bytes()
  File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\multiprocessing\connection.py", line 326, in _recv_bytes
    raise EOFError
EOFError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\users\dean-lab\desktop\aslm\src\aslm\controller\thread_pool.py", line 64, in run
    self._target(*self._args, **self._kwargs)
  File "c:\users\dean-lab\desktop\aslm\src\aslm\controller\thread_pool.py", line 183, in func
    logger.debug(
  File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 1434, in debug
    self._log(DEBUG, msg, args, **kwargs)
  File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 1589, in _log
    self.handle(record)
  File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 1599, in handle
    self.callHandlers(record)
  File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 1661, in callHandlers
    hdlr.handle(record)
  File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 948, in handle
    rv = self.filter(record)
  File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 806, in filter
    result = f.filter(record)
  File "c:\users\dean-lab\desktop\aslm\src\aslm\log_files\filters.py", line 81, in filter
    if record.getMessage().startswith("Performance"):
  File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 367, in getMessage
    msg = msg % self.args
TypeError: not all arguments converted during string formatting
zacsimile commented 12 months ago

OK, hmm. It is wrapped in try/except already. I have to look more into this.

zacsimile commented 12 months ago

It looks like the except is not receiving an OSError. I'm wondering if this error is caught at a lower level and not passed up to where the except call is. It's very hard to prove this as many of the h5py files are compiled and hard to test as we usually have plenty of disk space. I think we need to write a test using pyfakefs (https://pytest-pyfakefs.readthedocs.io/en/latest/usage.html) (see https://stackoverflow.com/questions/70887330/mocking-disk-out-of-space-in-python-unittests and https://stackoverflow.com/questions/19672138/how-do-i-mock-the-filesystem-in-python-unit-tests?rq=4) to simulate a small filesystem and then we can see which errors are getting passed up to the ImageWriter class.

zacsimile commented 10 months ago

It's back.

Traceback (most recent call last):
  File "c:\users\dean-lab\desktop\aslm\src\aslm\model\features\image_writer.py", line 261, in save_image
    self.data_source.write(
  File "c:\users\dean-lab\desktop\aslm\src\aslm\model\data_sources\bdv_data_source.py", line 386, in write
    self.image[dataset_name][zs, ...] = data[::dy, ::dx].astype(np.int16)
  File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\site-packages\h5py\_hl\dataset.py", line 999, in __setitem__
    self.id.write(mspace, fspace, val, mtype, dxpl=self._dxpl)
  File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py\h5d.pyx", line 283, in h5py.h5d.DatasetID.write
  File "h5py\_proxy.pyx", line 114, in h5py._proxy.dset_rw
OSError: [Errno 28] Can't write data (file write failed: time = Wed Nov  8 18:37:40 2023
, filename = 'D:\Nicole\Liver\NA\488myosin_561nuclear_647rfp\2023-11-08\Cell_001\CH00_000000.h5', file descriptor = 4, errno = 28, error message = 'No space left on device', buf = 00000199E5A37018, total write size = 16384, bytes this sub-write = 16384, bytes actually written = 18446744073709551615, offset = 162916425424)
AdvancedImagingUTSW commented 10 months ago

Here is my latest traceback...

``` exposure_time Configuration Failed 0.0051404 0.005235545112781955 Coupled axes: None exposure_time Configuration Failed 0.0051404 0.005235545112781955 error message: DCAMERR.TIMEOUT error message: DCAMERR.TIMEOUT error message: DCAMERR.TIMEOUT error message: DCAMERR.TIMEOUT error message: DCAMERR.TIMEOUT error message: DCAMERR.TIMEOUT error message: DCAMERR.TIMEOUT error message: DCAMERR.TIMEOUT Exception in thread z-stack Data: Traceback (most recent call last): File "c:\users\dean-lab\desktop\aslm\src\aslm\model\features\image_writer.py", line 261, in save_image self.data_source.write( File "c:\users\dean-lab\desktop\aslm\src\aslm\model\data_sources\bdv_data_source.py", line 388, in write self.image[dataset_name][zs, ...] = data[::dy, ::dx].astype(np.int16) File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\site-packages\h5py\_hl\dataset.py", line 999, in __setitem__ self.id.write(mspace, fspace, val, mtype, dxpl=self._dxpl) File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper File "h5py\h5d.pyx", line 283, in h5py.h5d.DatasetID.write File "h5py\_proxy.pyx", line 114, in h5py._proxy.dset_rw OSError: [Errno 28] Can't write data (file write failed: time = Tue Nov 14 13:20:09 2023 , filename = 'D:\Kevin\Lung\AC3332s3\SYTOXGreen_DsRed-CF555\2023-11-14\Cell_001\CH00_000000.h5', file descriptor = 4, errno = 28, error message = 'No space left on device', buf = 000001DEFBA9F198, total write size = 4096, bytes this sub-write = 4096, bytes actually written = 18446744073709551615, offset = 4179296) During handling of the above exception, another exception occurred: Traceback (most recent call last): File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\threading.py", line 973, in _bootstrap_inner self.run() File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\threading.py", line 910, in run self._target(*self._args, **self._kwargs) File "c:\users\dean-lab\desktop\aslm\src\aslm\model\model.py", line 799, in run_data_process data_func(frame_ids) File "c:\users\dean-lab\desktop\aslm\src\aslm\model\features\image_writer.py", line 294, in save_image self.close() File "c:\users\dean-lab\desktop\aslm\src\aslm\model\features\image_writer.py", line 355, in close self.data_source.close() File "c:\users\dean-lab\desktop\aslm\src\aslm\model\data_sources\bdv_data_source.py", line 551, in close self.image.close() File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\site-packages\h5py\_hl\files.py", line 585, in close self.id._close_open_objects(h5f.OBJ_LOCAL | ~h5f.OBJ_FILE) File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper File "h5py\h5f.pyx", line 360, in h5py.h5f.FileID._close_open_objects OSError: [Errno 28] Driver write request failed (file write failed: time = Tue Nov 14 13:20:09 2023 , filename = 'D:\Kevin\Lung\AC3332s3\SYTOXGreen_DsRed-CF555\2023-11-14\Cell_001\CH00_000000.h5', file descriptor = 4, errno = 28, error message = 'No space left on device', buf = 000001DEFBAA51F8, total write size = 4096, bytes this sub-write = 4096, bytes actually written = 18446744073709551615, offset = 4183392) camera thread exception happened! Traceback (most recent call last): File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\multiprocessing\connection.py", line 317, in _recv_bytes nread, err = ov.GetOverlappedResult(True) BrokenPipeError: [WinError 109] The pipe has been ended During handling of the above exception, another exception occurred: Traceback (most recent call last): File "c:\users\dean-lab\desktop\aslm\src\aslm\controller\thread_pool.py", line 251, in func target(*args, **kwargs) File "c:\users\dean-lab\desktop\aslm\src\aslm\controller\controller.py", line 865, in capture_image image_id = self.show_img_pipe.recv() File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\multiprocessing\connection.py", line 255, in recv buf = self._recv_bytes() File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\multiprocessing\connection.py", line 326, in _recv_bytes raise EOFError EOFError camera thread ended because of exception!: not all arguments converted during string formatting Traceback (most recent call last): File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\multiprocessing\connection.py", line 317, in _recv_bytes nread, err = ov.GetOverlappedResult(True) BrokenPipeError: [WinError 109] The pipe has been ended During handling of the above exception, another exception occurred: Traceback (most recent call last): File "c:\users\dean-lab\desktop\aslm\src\aslm\controller\thread_pool.py", line 251, in func target(*args, **kwargs) File "c:\users\dean-lab\desktop\aslm\src\aslm\controller\controller.py", line 865, in capture_image image_id = self.show_img_pipe.recv() File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\multiprocessing\connection.py", line 255, in recv buf = self._recv_bytes() File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\multiprocessing\connection.py", line 326, in _recv_bytes raise EOFError EOFError During handling of the above exception, another exception occurred: Traceback (most recent call last): File "c:\users\dean-lab\desktop\aslm\src\aslm\controller\thread_pool.py", line 97, in run self._target(*self._args, **self._kwargs) File "c:\users\dean-lab\desktop\aslm\src\aslm\controller\thread_pool.py", line 259, in func logger.debug( File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 1434, in debug self._log(DEBUG, msg, args, **kwargs) File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 1589, in _log self.handle(record) File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 1599, in handle self.callHandlers(record) File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 1661, in callHandlers hdlr.handle(record) File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 948, in handle rv = self.filter(record) File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 806, in filter result = f.filter(record) File "c:\users\dean-lab\desktop\aslm\src\aslm\log_files\filters.py", line 95, in filter if record.getMessage().startswith("Performance"): File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 367, in getMessage msg = msg % self.args TypeError: not all arguments converted during string formatting Exception in thread camera: Traceback (most recent call last): File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\multiprocessing\connection.py", line 317, in _recv_bytes nread, err = ov.GetOverlappedResult(True) BrokenPipeError: [WinError 109] The pipe has been ended During handling of the above exception, another exception occurred: Traceback (most recent call last): File "c:\users\dean-lab\desktop\aslm\src\aslm\controller\thread_pool.py", line 251, in func target(*args, **kwargs) File "c:\users\dean-lab\desktop\aslm\src\aslm\controller\controller.py", line 865, in capture_image image_id = self.show_img_pipe.recv() File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\multiprocessing\connection.py", line 255, in recv buf = self._recv_bytes() File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\multiprocessing\connection.py", line 326, in _recv_bytes raise EOFError EOFError During handling of the above exception, another exception occurred: Traceback (most recent call last): File "c:\users\dean-lab\desktop\aslm\src\aslm\controller\thread_pool.py", line 97, in run self._target(*self._args, **self._kwargs) File "c:\users\dean-lab\desktop\aslm\src\aslm\controller\thread_pool.py", line 259, in func logger.debug( File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 1434, in debug self._log(DEBUG, msg, args, **kwargs) File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 1589, in _log self.handle(record) File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 1599, in handle self.callHandlers(record) File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 1661, in callHandlers hdlr.handle(record) File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 948, in handle rv = self.filter(record) File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 806, in filter result = f.filter(record) File "c:\users\dean-lab\desktop\aslm\src\aslm\log_files\filters.py", line 95, in filter if record.getMessage().startswith("Performance"): File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 367, in getMessage msg = msg % self.args TypeError: not all arguments converted during string formatting During handling of the above exception, another exception occurred: Traceback (most recent call last): File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\threading.py", line 973, in _bootstrap_inner self.run() File "c:\users\dean-lab\desktop\aslm\src\aslm\controller\thread_pool.py", line 103, in run logger.debug( File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 1434, in debug self._log(DEBUG, msg, args, **kwargs) File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 1589, in _log self.handle(record) File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 1599, in handle self.callHandlers(record) File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 1661, in callHandlers hdlr.handle(record) File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 948, in handle rv = self.filter(record) File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 806, in filter result = f.filter(record) File "c:\users\dean-lab\desktop\aslm\src\aslm\log_files\filters.py", line 95, in filter if record.getMessage().startswith("Performance"): File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\logging\__init__.py", line 367, in getMessage msg = msg % self.args TypeError: not all arguments converted during string formatting Exception in Tkinter callback Traceback (most recent call last): File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\tkinter\__init__.py", line 1892, in __call__ return self.func(*args) File "c:\users\dean-lab\desktop\aslm\src\aslm\controller\sub_controllers\acquire_bar_controller.py", line 486, in exit_program self.parent_controller.execute("exit") File "c:\users\dean-lab\desktop\aslm\src\aslm\controller\controller.py", line 774, in execute self.sloppy_stop() File "c:\users\dean-lab\desktop\aslm\src\aslm\controller\controller.py", line 815, in sloppy_stop self.model.run_command("stop") File "c:\users\dean-lab\desktop\aslm\src\aslm\model\concurrency\concurrency_tools.py", line 406, in __getattr__ attr = _get_response(self) File "c:\users\dean-lab\desktop\aslm\src\aslm\model\concurrency\concurrency_tools.py", line 429, in _get_response resp, printed_output = object_in_subprocess._.parent_pipe.recv() File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\multiprocessing\connection.py", line 255, in recv buf = self._recv_bytes() File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\multiprocessing\connection.py", line 310, in _recv_bytes waitres = _winapi.WaitForMultipleObjects( KeyboardInterrupt Traceback (most recent call last): File "C:\Users\Dean-Lab\miniconda3\envs\aslm\Scripts\aslm-script.py", line 33, in sys.exit(load_entry_point('aslm', 'console_scripts', 'aslm')()) File "c:\users\dean-lab\desktop\aslm\src\aslm\main.py", line 115, in main root.mainloop() File "C:\Users\Dean-Lab\miniconda3\envs\ASLM\lib\tkinter\__init__.py", line 1429, in mainloop self.tk.mainloop(n) KeyboardInterrupt (aslm) C:\Users\Dean-Lab> ```