dotnet / machinelearning-modelbuilder

Simple UI tool to build custom machine learning models.
Creative Commons Attribution 4.0 International

Error during training: Exception thrown in writing when running out of disk space #1538

Open torronen opened 3 years ago

torronen commented 3 years ago

System Information (please complete the following information):

Describe the bug When starting the training of a binary text classification, the error "Exception thrown in writing" is returned before any iterations complete.

To Reproduce Unknown. I will update here if the issue happens again. It may be related to low disk space on the system disk (< 1 GB), but it did not reappear on the subsequent run.

Expected behavior The training should start as expected, OR the error should specify what the problem is and what the user could do to fix it.

Screenshots image

torronen commented 3 years ago

Another suspect: it could also be related to running out of free RAM. A restart freed up RAM (by dismounting a RAMDisk, in this case), and training then succeeded.

torronen commented 3 years ago

Another machine encountered the same error. It also had low disk space (< 2 GB free after the error), and the dataset size exceeds the free disk space. So possibly the error message just needs some clarification about what could be wrong, and perhaps a suggestion to check free disk space and disk status.

LittleLittleCloud commented 3 years ago

If the size of your dataset is X and the free space left on your disk is smaller than 2X, then you might get a write error.

This is because ModelBuilder saves two copies of the dataset in the temporary folder: one copy for hold-out training and the other for cross-validation training. You can find those datasets under %temp%/AutoML-NNI, in a folder with a 6-digit name.
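As a rough illustration of the 2X rule above, here is a minimal Python sketch of a pre-flight check. It is not Model Builder's code: the function name `check_space_for_training` and its parameters are hypothetical, and the sketch assumes the copies land on the same drive as the system temp folder and that each copy is about as large as the original file.

```python
import os
import shutil
import tempfile


def check_space_for_training(dataset_path, temp_dir=None, copies=2):
    """Return the bytes left over after `copies` copies of the dataset.

    Raises OSError with a readable message when the drive holding
    `temp_dir` cannot fit that many copies. The default of 2 copies
    follows the hold-out plus cross-validation explanation above.
    """
    temp_dir = temp_dir or tempfile.gettempdir()
    needed = os.path.getsize(dataset_path) * copies
    free = shutil.disk_usage(temp_dir).free
    if free < needed:
        raise OSError(
            f"Not enough disk space in {temp_dir}: need about {needed} bytes "
            f"for {copies} copies of the dataset, but only {free} bytes are "
            "free. Free up disk space and try again."
        )
    return free - needed
```

Calling `check_space_for_training("data.csv")` before a long run would either return the remaining headroom in bytes or fail right away with a message that names the folder and the shortfall.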

@briacht we might need a more explanatory error message when a write error occurs. Can you come up with one?

torronen commented 3 years ago

Minor improvement suggestion: Could it reserve disk space at startup, and throw an exception right away if there is not enough free space? For example, 7zip does that, and it has been pretty handy when unpacking datasets with limited space. VMs especially often seem to have little extra space. For bigger datasets, it may take well over 10 minutes to process the file before this error appears, and the user may have already logged out by then. This might not apply to most users, though, I suppose.
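One way the "reserve space at startup" idea could look, as a hedged Python sketch: write a placeholder file of the required size before any processing starts, so a shortage surfaces immediately instead of minutes into the run. The helper name `reserve_space` is hypothetical, this is not how Model Builder or 7zip is known to implement it, and writing zeros may not truly reserve blocks on filesystems that compress or deduplicate data.

```python
import os

CHUNK = 1024 * 1024  # write the placeholder in 1 MiB pieces


def reserve_space(path, size_bytes):
    """Create a placeholder file of `size_bytes` zeros at `path`.

    If the write fails (for example because the disk is full), the
    partial file is removed and an OSError with a readable message
    is raised, chained to the original error.
    """
    chunk = b"\0" * CHUNK
    try:
        with open(path, "wb") as f:
            remaining = size_bytes
            while remaining > 0:
                n = min(remaining, CHUNK)
                f.write(chunk[:n])
                remaining -= n
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before reporting success
    except OSError as exc:
        # The file is already closed here, so it is safe to delete it.
        if os.path.exists(path):
            os.remove(path)
        raise OSError(
            f"Could not reserve {size_bytes} bytes at {path}. "
            "Check free disk space and disk status."
        ) from exc
    return path
```

The caller would delete or overwrite the placeholder once the real files are written. A cheaper alternative is to only compare free space against the required size, at the cost of not protecting against other processes filling the disk in the meantime.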