OSError: [WinError 1455] The paging file is too small for this operation to complete. #23

Closed: alexhiggins732 closed this issue 2 years ago

alexhiggins732 commented 2 years ago

Loading the MNIST Classifier and attempting to build the model results in the following error.

    return _run_code(code, main_globals, None,
  File "C:\Users\WDAGUtilityAccount\TorchStudio\python\lib\", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\WDAGUtilityAccount\TorchStudio\torchstudio\", line 4, in <module>
    import torch
  File "C:\Users\WDAGUtilityAccount\TorchStudio\python\lib\site-packages\", line 124, in <module>
    raise err
OSError: [WinError 1455] The paging file is too small for this operation to complete. Error loading "C:\Users\WDAGUtilityAccount\TorchStudio\python\lib\site-packages\torch\lib\caffe2_detectron_ops_gpu.dll" or one of its dependencies.


Windows 10 Enterprise 2004, build 19041.1526 [image]

Hardware: [image]


23:17:41.862 INFO  starting TorchStudio 0.9.6 on "Windows 10 Version 2004"
23:30:59.915 INFO  python environment install:
 Downloading, installing and setting up a Python environment for TorchStudio.
This may take up to 15 minutes depending on your download speed, and up to 16GB.

Downloading Python installer (Miniconda3)...
Download complete: C:\Users\WDAGUtilityAccount\TorchStudio\python-miniconda3.exe

Installing Python in C:\Users\WDAGUtilityAccount\TorchStudio\python...
Installation complete.

Downloading and installing PyTorch and additional packages:
pytorch torchvision torchaudio cudatoolkit=11.3 datasets scipy pandas matplotlib-base python-graphviz paramiko pysoundfile

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

Python environment ready.

23:31:04.005 INFO  python check: "Checking Python version...\n\nChecking required packages...\n\nLoading PyTorch...\n\nListing devices...\n\n" "Functional environment (Python 3.9, PyTorch 1.10, Devices: CPU (cpu))\n"
23:31:04.005 WARNING  QIODevice::read (QFile, "C:\Users\WDAGUtilityAccount\TorchStudio\servers.json"): device not open
23:31:05.819 INFO  checking notifications...
23:31:05.819 INFO  inference: "Loading PyTorch...\n\n" ""
23:31:07.505 INFO  inference: "Traceback (most recent call last):\n  File \"C:\\Users\\WDAGUtilityAccount\\TorchStudio\\python\\lib\\\", line 197, in _run_module_as_main\n" ""
23:31:07.505 INFO  inference: "    return _run_code(code, main_globals, None,\n  File \"C:\\Users\\WDAGUtilityAccount\\TorchStudio\\python\\lib\\\", line 87, in _run_code\n" ""
23:31:07.505 INFO  inference: "    exec(code, run_globals)\n  File \"C:\\Users\\WDAGUtilityAccount\\TorchStudio\\torchstudio\\\", line 5, in <module>\n" ""
23:31:07.505 INFO  inference: "    import torch\n  File \"C:\\Users\\WDAGUtilityAccount\\TorchStudio\\python\\lib\\site-packages\\torch\\\", line 124, in <module>\n" ""
23:31:07.505 INFO  inference: "    raise err\nOSError" ""
23:31:07.505 INFO  inference: ": [WinError 1455] The paging file is too small for this operation to complete. Error loading \"C:\\Users\\WDAGUtilityAccount\\TorchStudio\\python\\lib\\site-packages\\torch\\lib\\cudnn_adv_train64_8.dll\" or one of its dependencies.\n" ""
23:31:07.598 INFO  inference end: "" ""
23:33:13.130 INFO  datasetload: "Loading PyTorch...\n\n" ""
23:33:15.083 INFO  datasetload: "Dataset script connected\n\n" ""
23:33:15.083 INFO  datasetload: "Loading dataset...\n\nTraceback (most recent call last):\n  File \"C:\\Users\\WDAGUtilityAccount\\TorchStudio\\python\\lib\\\", line 197, in _run_module_as_main\n    return _run_code(code, main_globals, None,\n  File \"C:\\Users\\WDAGUtilityAccount\\TorchStudio\\python\\lib\\\", line 87, in _run_code\n    exec(code, run_globals)\n  File \"C:\\Users\\WDAGUtilityAccount\\TorchStudio\\torchstudio\\\", line 141, in <module>\n    sample=meta_dataset.train()[0]\n  File \"C:\\Users\\WDAGUtilityAccount\\TorchStudio\\torchstudio\\\", line 79, in __getitem__\n    raise IndexError\nIndexError\n" ""
23:33:15.145 INFO  datasetload disconnected
23:33:15.270 INFO  datasetload end: "" ""
23:33:39.926 INFO  datasetload: "Loading PyTorch...\n\n" ""
23:33:42.661 INFO  datasetload: "Dataset script connected\n\n" ""
23:33:42.661 INFO  datasetload: "Loading dataset...\n\nTraceback (most recent call last):\n  File \"C:\\Users\\WDAGUtilityAccount\\TorchStudio\\python\\lib\\\", line 197, in _run_module_as_main\n    return _run_code(code, main_globals, None,\n  File \"C:\\Users\\WDAGUtilityAccount\\TorchStudio\\python\\lib\\\", line 87, in _run_code\n" ""
23:33:42.661 INFO  datasetload: "    exec(code, run_globals)\n  File \"C:\\Users\\WDAGUtilityAccount\\TorchStudio\\torchstudio\\\", line 141, in <module>\n    sample=meta_dataset.train()[0]\n  File \"C:\\Users\\WDAGUtilityAccount\\TorchStudio\\torchstudio\\\", line 79, in __getitem__\n    raise IndexError\nIndexError\n" ""
23:33:42.723 INFO  datasetload disconnected
23:33:42.864 INFO  datasetload end: "" ""
23:34:06.083 INFO  datasetload: "Loading PyTorch...\n\n" ""
23:34:07.661 INFO  datasetload: "Dataset script connected\n\n" ""
23:34:07.661 INFO  datasetload: "Loading dataset...\n\n" ""
23:34:07.739 INFO  datasetload: "" "Loading complete\n"
23:34:21.489 INFO  datasetanalyze: "Analyze script connected\n\n" ""
23:34:21.489 INFO  datasetanalyze: "Setting analyzer code...\n\n" ""
23:34:21.645 INFO  datasetanalyze: "File C:\\Users\\WDAGUtilityAccount\\TorchStudio\\python\\lib\\site-packages\\torch\\, line 124, in <module>\nraise err\nOSError: 22\n" ""
23:34:33.239 WARNING  QProcess: Destroyed while process ("C:\\Users\\WDAGUtilityAccount\\TorchStudio\\python\\python") is still running.
23:34:33.239 INFO  datasetanalyze end: "" ""
23:34:33.255 INFO  datasetanalyze disconnected
23:34:33.255 WARNING  QCoreApplication::postEvent: Unexpected null receiver
23:34:37.661 INFO  datasetanalyze: "Analyze script connected\n\n" ""
23:34:37.661 INFO  datasetanalyze: "Setting analyzer code...\n\n" ""
23:34:38.473 INFO  datasetanalyze: "Analyzing...\n\n" ""
23:34:38.473 INFO  datasetload: "Connecting to analyzer...\n\n" ""
23:34:38.473 INFO  datasetload: "\rSending samples to analyzer...:   0%|          | ? left\n\n" ""
23:34:38.473 INFO  datasetanalyze: "\rAnalyzing...:   0%|          | ? left\n\n" ""
23:34:38.536 INFO  datasetload: "\rSending samples to analyzer...: 100%|\xE2\x96\x88\xE2\x96\x88\xE2\x96\x88\xE2\x96\x88\xE2\x96\x88\xE2\x96\x88\xE2\x96\x88\xE2\x96\x88\xE2\x96\x88\xE2\x96\x88| 00:00 left\n\n\n" ""
23:34:38.552 INFO  datasetanalyze: "\rAnalyzing...: 100%|\xE2\x96\x88\xE2\x96\x88\xE2\x96\x88\xE2\x96\x88\xE2\x96\x88\xE2\x96\x88\xE2\x96\x88\xE2\x96\x88\xE2\x96\x88\xE2\x96\x88| 00:00 left\n\n\n" ""
23:34:38.552 INFO  datasetanalyze: "" "Analysis complete\n"
23:34:38.552 INFO  datasetload: "" "Samples transfer to analyzer completed\n"
23:35:25.551 WARNING  QProcess: Destroyed while process ("C:\\Users\\WDAGUtilityAccount\\TorchStudio\\python\\python") is still running.
23:35:25.567 INFO  code parser disconnected
23:35:25.567 WARNING  QCoreApplication::postEvent: Unexpected null receiver
23:35:28.723 WARNING  QProcess: Destroyed while process ("C:\\Users\\WDAGUtilityAccount\\TorchStudio\\python\\python") is still running.
23:35:28.723 INFO  datasetanalyze end: "" ""
23:35:28.723 INFO  datasetanalyze disconnected
23:35:28.723 WARNING  QCoreApplication::postEvent: Unexpected null receiver
23:35:29.005 INFO  datasetanalyze: "Analyze script connected\n\n" ""
23:35:29.005 INFO  datasetanalyze: "Setting analyzer code...\n\n" ""
23:35:29.239 INFO  datasetanalyze: "File C:\\Users\\WDAGUtilityAccount\\TorchStudio\\python\\lib\\site-packages\\torch\\, line 124, in <module>\nraise err\nOSError: 22\n" ""
23:36:03.509 WARNING  QProcess: Destroyed while process ("C:\\Users\\WDAGUtilityAccount\\TorchStudio\\python\\python") is still running.
23:36:03.509 INFO  datasetanalyze end: "" ""
23:36:03.509 INFO  datasetanalyze disconnected
23:36:03.509 WARNING  QCoreApplication::postEvent: Unexpected null receiver
23:36:04.708 INFO  datasetanalyze: "Analyze script connected\n\n" ""
23:36:04.708 INFO  datasetanalyze: "Setting analyzer code...\n\n" ""
23:36:04.864 INFO  datasetanalyze: "File C:\\Users\\WDAGUtilityAccount\\TorchStudio\\python\\lib\\site-packages\\torch\\, line 124, in <module>\nraise err\nOSError: 22\n" ""
23:36:53.052 WARNING  QProcess: Destroyed while process ("C:\\Users\\WDAGUtilityAccount\\TorchStudio\\python\\python") is still running.
23:36:53.052 INFO  code parser disconnected
23:36:53.052 WARNING  QCoreApplication::postEvent: Unexpected null receiver
23:37:55.677 INFO  build: "Model 1" "Loading PyTorch...\n\n" ""
23:37:55.911 INFO  build: "Model 1" "Traceback (most recent call last):\n  File \"C:\\Users\\WDAGUtilityAccount\\TorchStudio\\python\\lib\\\", line 197, in _run_module_as_main\n" ""
23:37:55.911 INFO  build: "Model 1" "    return _run_code(code, main_globals, None,\n  File \"C:\\Users\\WDAGUtilityAccount\\TorchStudio\\python\\lib\\\", line 87, in _run_code\n    exec(code, run_globals)\n  File \"C:\\Users\\WDAGUtilityAccount\\TorchStudio\\torchstudio\\\", line 4, in <module>\n    import torch\n  File \"C:\\Users\\WDAGUtilityAccount\\TorchStudio\\python\\lib\\site-packages\\torch\\\", line 124, in <module>\n    raise err\nOSError: [WinError 1455] The paging file is too small for this operation to complete. Error loading \"C:\\Users\\WDAGUtilityAccount\\TorchStudio\\python\\lib\\site-packages\\torch\\lib\\caffe2_detectron_ops_gpu.dll\" or one of its dependencies.\n" ""
23:37:55.926 INFO  build end: "Model 1" "" ""
divideconcept commented 2 years ago

Unfortunately, this is a known PyTorch + CUDA/GPU issue on Windows. On some configurations (like yours, with only 4GB of RAM), loading torch sometimes fails when more than one torch instance is loaded. There are some workarounds, though; see the link above. However, this is beyond the scope of TorchStudio.
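
For anyone scripting around this, here is a minimal sketch of how the failure could be detected at import time so the user gets a readable hint instead of a raw traceback. This is not how TorchStudio handles it. The helper names `is_paging_file_error` and `try_import_torch` are hypothetical, and the hint text lists commonly reported workarounds rather than anything confirmed in this thread.

```python
import importlib


def is_paging_file_error(err: OSError) -> bool:
    """Return True if err looks like WinError 1455 (paging file too small)."""
    # OSError.winerror is only populated on Windows, so use getattr with a
    # default and fall back to matching the message text.
    if getattr(err, "winerror", None) == 1455:
        return True
    return "WinError 1455" in str(err)


def try_import_torch():
    """Import torch; return (module, None), or (None, hint) on WinError 1455."""
    try:
        return importlib.import_module("torch"), None
    except OSError as err:
        if is_paging_file_error(err):
            # Assumption: these are generic suggestions, not an official fix.
            hint = (
                "Windows could not reserve enough virtual memory to load "
                "PyTorch's CUDA DLLs. Try closing other processes that have "
                "torch loaded, or increasing the Windows paging file size."
            )
            return None, hint
        # Any other OSError is unrelated to this issue, so re-raise it.
        raise


if __name__ == "__main__":
    torch_module, hint = try_import_torch()
    print(hint if torch_module is None else "torch loaded")
```

Other `OSError`s are deliberately re-raised, so only the specific failure reported in this issue is turned into a hint.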

divideconcept commented 2 years ago

This will be fixed in the next version of PyTorch (1.13), which will be compatible with the latest version of CUDA (11.7), where this issue is resolved.
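
As a rough sketch, a script could check whether an installed PyTorch version is at or above the release mentioned above. The 1.13 threshold comes from this comment only; the helper names `version_tuple` and `has_expected_fix` are hypothetical, and the parsing assumes version strings of the form `major.minor[...]`, such as `1.10.0+cu113`.

```python
import re


def version_tuple(version: str) -> tuple:
    """Parse the leading 'major.minor' from a string such as '1.10.0+cu113'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def has_expected_fix(torch_version: str) -> bool:
    """Return True if the version is at least 1.13 (threshold assumed from this thread)."""
    return version_tuple(torch_version) >= (1, 13)


if __name__ == "__main__":
    # The log in this issue reports PyTorch 1.10, which is below the threshold.
    print(has_expected_fix("1.10.0"))
```

When torch imports successfully, `torch.__version__` can be passed in directly. If the import itself fails with the error above, the version string has to come from somewhere else, for example the package list of the Python environment.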