microscope-cockpit / cockpit

Cockpit is a microscope graphical user interface. It is a flexible and easy to extend platform aimed at life scientists using bespoke microscopes.
https://microscope-cockpit.org
GNU General Public License v3.0

Crash on disk full in Experiment-cleanup thread #769

Open thomasmfish opened 3 years ago

thomasmfish commented 3 years ago

The crash occurs in the reorder_img_file function in structuredIllumination.py when it tries to create a new tempfile.

carandraug commented 3 years ago

Not sure how to handle the case of not enough disk space to save the data file. An error seems to be the right behaviour.

Why the tempfile: the original data file may be in the wrong order. To reorder, we create a tempfile with the data in the correct order, and then replace the data file with the tempfile. I guess we could rewrite the original file one frame at a time to avoid the duplicated file, but then if something goes wrong we lose the original data. I think it's safer to only remove the original data file after the reorder has completed successfully.
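The "write a tempfile, then replace the original" approach described above can be sketched roughly as follows. This is a minimal illustration, not Cockpit's actual reorder_img_file: the helper name `reorder_frames_safely`, the headerless fixed-size frame layout, and the choice to put the tempfile next to the data file (so that `os.replace` stays on one filesystem) are all assumptions made for the example.

```python
import os
import tempfile


def reorder_frames_safely(path, frame_size, new_order):
    """Rewrite `path` so that its fixed-size frames follow `new_order`.

    Hypothetical helper: the file is treated as a headerless sequence
    of `frame_size`-byte frames, which is an assumption of this sketch.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Assumption: the tempfile lives next to the data file, so the final
    # os.replace never has to cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".reorder")
    try:
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            for index in new_order:
                src.seek(index * frame_size)
                frame = src.read(frame_size)
                if len(frame) != frame_size:
                    raise ValueError(f"frame {index} is missing or truncated")
                dst.write(frame)
            # Push the data to disk before touching the original.
            dst.flush()
            os.fsync(dst.fileno())
        # The original is only replaced once the reordered copy is complete.
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure (including a full disk) drop the partial tempfile
        # and leave the original data file untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the write fails partway through, the exception propagates to the caller and the original file still holds the data in its original order.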

thomasmfish commented 3 years ago

I see how the tempfile is useful, but maybe it would be better to show a warning and wait for disk space to become available before trying again? We also have an issue where the original data file was overwritten by the tempfile but the transfer missed the end of the file (presumably due to network issues). I'd suggest some exception handling around this function that would allow a user to check the status of the drives and make any changes before retrying.

thomasmfish commented 3 years ago

Ian also suggests that a louder warning may be useful, one that tells the user to stop and do something before collecting more data that could be corrupted.

juliomateoslangerak commented 3 years ago

The OMX will warn if there is not enough space on disk to start the experiment. This is a real lifesaver with time-lapse experiments.

thomasmfish commented 3 years ago

The issue we're having is that we're saving the data on a network drive but the tempfile, sensibly, is stored locally and there's no disk space check specifically for the tempfile (as far as I can tell).
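A check of this kind for the tempfile location could look something like the sketch below. The function name `has_room_for_tempfile` and the 10% safety margin are made up for illustration, and it assumes the tempfile goes in the directory returned by `tempfile.gettempdir()`, which may not be where Cockpit actually writes it.

```python
import shutil
import tempfile


def has_room_for_tempfile(data_size, temp_dir=None, safety_margin=0.1):
    """Return True if `temp_dir` can hold a copy of `data_size` bytes.

    Hypothetical helper: `safety_margin` is an arbitrary extra fraction
    of headroom, not a value taken from Cockpit.
    """
    if temp_dir is None:
        # Assumption: the tempfile is created in the default temp directory.
        temp_dir = tempfile.gettempdir()
    free_bytes = shutil.disk_usage(temp_dir).free
    required = int(data_size * (1 + safety_margin))
    return free_bytes >= required
```

Calling it with the size of the data file (for example from `os.path.getsize`) before starting the reorder would let the caller warn the user early, although free space can still change between the check and the write.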

iandobbie commented 3 years ago

I see two separate issues here.

1) Not enough space to store data as collected. As Julio mentioned, this is checked for and the experiment won't start without enough space.

2) Disk full (or other error) during the SIM file reorder process. This is a bit more complicated. Currently this crashes the experiment-cleanup thread and dumps info into the log file but doesn't provide good user feedback.

I suggest that the file writing in the reorder functions is wrapped in a try/except, so that a failure either warns and allows the user to do something (delete data on disk?) before a retry, or maybe just generates a user-facing dialogue that tells the user which file has not been reordered and not to continue until the issue is solved.
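One possible shape for that try/except, as a rough sketch only: `reorder_with_retry`, `reorder` and `ask_retry` are hypothetical names, and `ask_retry` stands in for whatever user-facing dialogue would be shown (it receives the message and returns True if the user wants to retry).

```python
def reorder_with_retry(reorder, path, ask_retry):
    """Run reorder(path), asking the user whether to retry on OSError.

    Returns True once the reorder succeeds, or False if the user
    declines to retry. All names here are hypothetical.
    """
    while True:
        try:
            reorder(path)
            return True
        except OSError as error:
            # OSError covers "No space left on device" as well as
            # other I/O failures such as network drive errors.
            message = (
                f"Could not reorder {path}: {error}. "
                "Free some disk space or check the drives before retrying, "
                "and do not collect more data until this is resolved."
            )
            if not ask_retry(message):
                return False
```

Because the failure is caught inside the function, the experiment-cleanup thread would keep running and the user would be told which file has not been reordered, rather than the error only ending up in the log file.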