minimaxir / aitextgen

A robust Python tool for text-based AI training and generation using GPT-2.
https://docs.aitextgen.io
MIT License
1.84k stars, 218 forks

[WinError 1455] The paging file is too small for this operation to complete. #188

Closed: GucciFlipFlops1917 closed this issue 2 years ago

GucciFlipFlops1917 commented 2 years ago

Hardware Specs:
CPU: AMD Ryzen 9 5900HX - 3.30 GHz
RAM: 16GB
GPU: NVIDIA RTX 3070
VRAM: 8GB

Full Error Prompt:
[WinError 1455] The paging file is too small for this operation to complete. Error loading “C:\Users\[user]\anaconda3\envs\[env name]\lib\site-packages\torch\lib\caffe2_detectron_ops_gpu.dll” or one of its dependencies.

I received this torch error when attempting to train even the smallest model with the lowest batch_size and block_size.

GucciFlipFlops1917 commented 2 years ago

Troubleshooting: Naturally, the first thing I did was check how my paging file was configured under Advanced Settings > Performance Settings > Advanced > Virtual Memory.

But lo and behold, it was already set to system managed, with access to ~40GB. Note: If yours is not set to system managed, or the drive holding the paging file is low on free space, changing that may be a potential fix.
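For the "storage space is too low" case, a quick way to see how much room the drive has is a few lines of Python. This is only a minimal sketch: the function names and the needed_gb threshold are my own placeholders, and it checks free disk space only, not how Windows has actually sized the paging file.

```python
import shutil


def free_space_gb(path):
    """Return the free space, in GiB, on the volume that contains `path`."""
    return shutil.disk_usage(path).free / 1024**3


def has_room(path, needed_gb):
    """True if the volume containing `path` has at least `needed_gb` GiB free.

    `needed_gb` is a hypothetical threshold; pick whatever size you expect
    the paging file to need on your machine.
    """
    return free_space_gb(path) >= needed_gb
```

For example, has_room("C:\\", 40) would tell you whether the C: drive could hold a paging file of roughly the size mentioned above.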

Solution for Excessively Large Paging Files: In my opinion, it's a bit ridiculous that torch requires in excess of 40GB of storage just to load the requisite files at startup. Thankfully, @cobryan05 comes to the rescue with this script: https://gist.github.com/cobryan05/7d1fe28dd370e110a372c4d268dcb2e5

Simply download it and run the following:

python fixNvPe.py --input=C:\Users\[user]\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\lib\*.dll

The path depends on where you have torch installed. Change --input=[location of torch\lib\*.dll] as needed.
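If you are not sure where torch lives in the active environment, something like the sketch below may help. It uses importlib.util.find_spec, which locates a top-level package on disk without importing it, so it should not trigger the DLL loading error itself. The helper names are hypothetical, and it assumes the DLLs sit in a lib subfolder of the package, as in the path above.

```python
import importlib.util
from pathlib import Path


def package_lib_glob(package="torch", subdir="lib", pattern="*.dll"):
    """Return a glob such as ...\\site-packages\\torch\\lib\\*.dll, or None.

    find_spec only locates a top-level package; it does not import it.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        return None  # package not installed in this environment
    return str(Path(spec.origin).parent / subdir / pattern)


def build_fix_command(package="torch"):
    """Build the fixNvPe.py command line for the located package, or None."""
    pattern = package_lib_glob(package)
    if pattern is None:
        return None
    return f"python fixNvPe.py --input={pattern}"
```

Run it with the same interpreter (or conda env) you train with, then print(build_fix_command()) and check that the path looks right before running the command.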

Possible locations:
C:\Users\[user]\anaconda3\pkgs\cudatoolkit-[version number]\Library\bin\*.dll
C:\Users\[user]\anaconda3\Lib\site-packages\torch\lib\*.dll
C:\Users\[user]\anaconda3\envs\[env name]\Lib\site-packages\torch\lib\*.dll
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v[version #]\bin\*.dll
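To see which of these folders actually exist on your machine and contain .dll files, a small check along these lines could be used. It is a rough sketch with a made-up helper name; you need to substitute your real user name, env name and version numbers first. The directory part is passed through glob.escape because square brackets are treated as wildcards by glob.

```python
import glob
import os


def count_dlls(directories, pattern="*.dll"):
    """Map each directory to the number of files in it matching `pattern`.

    Directories that do not exist simply report 0.
    """
    counts = {}
    for directory in directories:
        # Escape the directory so characters like [ ] are matched literally.
        safe_dir = glob.escape(directory)
        counts[directory] = len(glob.glob(os.path.join(safe_dir, pattern)))
    return counts
```

Any folder that reports a non-zero count is a candidate for the --input argument.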

Example .dll files in the location may include: asmjit.dll, c10.dll, caffe2_detectron_ops.dll, fbgemm.dll, libiomp5md.dll, shm.dll, torch.dll, uv.dll, cublas64_11.dll, cudadevrt.lib, cudart64_110.dll, cufft64_10.dll, cusparse64_11.dll, nppc64_11.dll, nppial64_11.dll, nvrtc-builtins64_110.dll