openvinotoolkit / training_extensions

Train, Evaluate, Optimize, Deploy Computer Vision Models via OpenVINO™
https://openvinotoolkit.github.io/training_extensions/
Apache License 2.0

CUDA out of memory? #1035

Closed raymondlo84 closed 2 years ago

raymondlo84 commented 2 years ago

What are the minimum hardware requirements for running the sample code?

https://github.com/openvinotoolkit/training_extensions/blob/develop/QUICK_START_GUIDE.md

image

RuntimeError: CUDA out of memory. Tried to allocate 40.00 MiB (GPU 0; 5.79 GiB total capacity; 4.19 GiB already allocated; 48.31 MiB free; 4.22 GiB reserved in total by PyTorch)
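To make the numbers in this message easier to read, here is a minimal Python sketch that parses the quoted sizes and does the arithmetic. The `parse_oom_message` helper is hypothetical (it is not part of PyTorch or OTE), and the breakdown is an inference from the rounded figures in the message, not a measurement. It suggests that roughly 1.5 GiB of the card was neither free nor reserved by PyTorch, so it was presumably held by something else (for example the CUDA context, another process, or the desktop session).

```python
import re

# Error text copied verbatim from the report above.
OOM_MESSAGE = (
    "RuntimeError: CUDA out of memory. Tried to allocate 40.00 MiB "
    "(GPU 0; 5.79 GiB total capacity; 4.19 GiB already allocated; "
    "48.31 MiB free; 4.22 GiB reserved in total by PyTorch)"
)

UNIT_TO_MIB = {"KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}
SIZE = r"([\d.]+) (KiB|MiB|GiB)"


def parse_oom_message(message):
    """Hypothetical helper: return the sizes quoted in the message, in MiB."""
    patterns = {
        "requested": r"Tried to allocate " + SIZE,
        "total": SIZE + r" total capacity",
        "allocated": SIZE + r" already allocated",
        "free": SIZE + r" free",
        "reserved": SIZE + r" reserved",
    }
    sizes = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match is None:
            raise ValueError(f"could not find '{name}' in message")
        value, unit = match.groups()
        sizes[name] = float(value) * UNIT_TO_MIB[unit]
    return sizes


sizes = parse_oom_message(OOM_MESSAGE)

# Memory that is neither free nor reserved by PyTorch's caching allocator.
outside_pytorch = sizes["total"] - sizes["reserved"] - sizes["free"]
# Memory PyTorch has reserved but is not currently using for tensors.
cached_unused = sizes["reserved"] - sizes["allocated"]

print(f"requested:            {sizes['requested']:.2f} MiB")
print(f"free on device:       {sizes['free']:.2f} MiB")
print(f"reserved but unused:  {cached_unused:.2f} MiB")
print(f"not held by PyTorch:  {outside_pytorch:.2f} MiB")
```

The 40 MiB request fails even though slightly more than that is reported free, which is consistent with fragmentation or with the figures being rounded; the message alone does not say which.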

Describe the bug

Steps to Reproduce

  1. Run the sample code and see the screenshot above

Environment:

raymond@raymond-850XBC:~/Documents/training_extensions/ote_cli/notebooks$ uname -ar
Linux raymond-850XBC 5.13.0-39-generic #44~20.04.1-Ubuntu SMP Thu Mar 24 16:43:35 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

raymond@raymond-850XBC:~/Documents/training_extensions/ote_cli/notebooks$ ~/Documents/cuda-samples/bin/x86_64/linux/release/deviceQuery
/home/raymond/Documents/cuda-samples/bin/x86_64/linux/release/deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "NVIDIA GeForce RTX 2060"
  CUDA Driver Version / Runtime Version: 11.6 / 11.1
  CUDA Capability Major/Minor version number: 7.5
  Total amount of global memory: 5926 MBytes (6214189056 bytes)
  (30) Multiprocessors, ( 64) CUDA Cores/MP: 1920 CUDA Cores
  GPU Max Clock rate: 1200 MHz (1.20 GHz)
  Memory Clock rate: 7001 Mhz
  Memory Bus Width: 192-bit
  L2 Cache Size: 3145728 bytes
  Maximum Texture Dimension Size (x,y,z): 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers: 1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers: 2D=(32768, 32768), 2048 layers
  Total amount of constant memory: 65536 bytes
  Total amount of shared memory per block: 49152 bytes
  Total shared memory per multiprocessor: 65536 bytes
  Total number of registers available per block: 65536
  Warp size: 32
  Maximum number of threads per multiprocessor: 1024
  Maximum number of threads per block: 1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch: 2147483647 bytes
  Texture alignment: 512 bytes
  Concurrent copy and kernel execution: Yes with 3 copy engine(s)
  Run time limit on kernels: Yes
  Integrated GPU sharing Host Memory: No
  Support host page-locked memory mapping: Yes
  Alignment requirement for Surfaces: Yes
  Device has ECC support: Disabled
  Device supports Unified Addressing (UVA): Yes
  Device supports Managed Memory: Yes
  Device supports Compute Preemption: Yes
  Supports Cooperative Kernel Launch: Yes
  Supports MultiDevice Co-op Kernel Launch: Yes
  Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
  Compute Mode: < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.6, CUDA Runtime Version = 11.1, NumDevs = 1
Result = PASS
raymond@raymond-850XBC:~/Documents/training_extensions/ote_cli/notebooks$
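As a quick cross-check, the byte count that deviceQuery reports for global memory can be converted to binary units and compared with the "5.79 GiB total capacity" quoted in the PyTorch error. This is only arithmetic on the numbers pasted above; it assumes nothing beyond 1 MiB = 2^20 bytes and 1 GiB = 2^30 bytes.

```python
# "Total amount of global memory" in bytes, as reported by deviceQuery above.
total_bytes = 6214189056

total_mib = total_bytes / 2**20  # binary mebibytes
total_gib = total_bytes / 2**30  # binary gibibytes

print(f"{total_mib:.0f} MiB, {total_gib:.2f} GiB")  # prints "5926 MiB, 5.79 GiB"
```

So the "5926 MBytes" figure from deviceQuery and the 5.79 GiB in the error message describe the same capacity: the full memory of this RTX 2060.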

ENVIRONMENT:
sys.platform: linux
Python: 3.8.10 (default, Mar 15 2022, 12:22:08) [GCC 9.4.0]
CUDA available: True
GPU 0: NVIDIA GeForce RTX 2060
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.1.TC455_06.29190527_0
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
PyTorch: 1.8.2+cu111
PyTorch compiling details: PyTorch built with:

TorchVision: 0.9.2+cu111
OpenCV: 4.5.3
MMCV: 1.3.14
MMCV Compiler: GCC 9.4
MMCV CUDA Compiler: 11.1
MMDetection: 2.9.0+606a172
MMDetection Compiler: GCC 9.4
MMDetection CUDA Compiler: 11.1
NNCF: 2.1.0
ONNX: 1.10.1
ONNXRuntime: 1.9.0
OpenVINO MO: None
OpenVINO IE: 2022.1.0-7019-cdb9bec7210-releases/2022/1
pip list:
WARNING: You are using pip version 21.2.1; however, version 22.0.4 is available.
You should consider upgrading via the '/home/raymond/Documents/training_extensions/cur_task_venv/bin/python3.8 -m pip install --upgrade pip' command.
2022-03-31 19:38:48,520 - mmdet - INFO - Scratch space created at /tmp/ote-det-scratch-_8u7wfh6
Package Version Location


absl-py 0.15.0
addict 2.4.0
albumentations 0.4.6
antlr4-python3-runtime 4.8
argon2-cffi 21.3.0
argon2-cffi-bindings 21.2.0
asynctest 0.13.0
attrs 21.2.0
backcall 0.2.0
bayesian-optimization 1.2.0
beautifulsoup4 4.10.0
bleach 4.1.0
cachetools 4.2.4
certifi 2021.10.8
cffi 1.15.0
charset-normalizer 2.0.12
click 8.0.3
codecov 2.1.12
colorama 0.4.4
coverage 6.0.2
cycler 0.10.0
Cython 0.29.24
debugpy 1.6.0
decorator 4.4.2
defusedxml 0.7.1
detection-tasks 0.0.0 /home/raymond/Documents/training_extensions/external/mmdetection
editdistance 0.6.0
entrypoints 0.4
fast_ctc_decode 0.3.0
filelock 3.6.0
flake8 4.0.1
flatbuffers 1.12
google-auth 1.35.0
google-auth-oauthlib 0.4.6
grpcio 1.32.0
hpopt 0.1.0
huggingface-hub 0.4.0
idna 2.10
imagecodecs 2022.2.22
imagecorruptions 1.1.2
imageio 2.9.0
imagesize 1.2.0
imgaug 0.4.0
iniconfig 1.1.1
interrogate 1.5.0
ipykernel 6.11.0
ipython 7.31.1
ipython-genutils 0.2.0
isort 4.3.21
jedi 0.18.1
Jinja2 3.1.1
joblib 1.1.0
jsonschema 3.2.0
jstyleson 0.0.2
jupyter-client 7.2.0
jupyter-core 4.9.2
jupyterlab-pygments 0.1.2
kiwisolver 1.3.2
kwarray 0.5.19
lmdb 1.3.0
Markdown 3.3.4
MarkupSafe 2.1.1
matplotlib 3.4.3
matplotlib-inline 0.1.3
mccabe 0.6.1
mistune 0.8.4
mmcv-full 1.3.14
mmdet 2.9.0 /home/raymond/Documents/training_extensions/external/mmdetection/submodule
mmpycocotools 12.0.3
natsort 7.1.1
nbclient 0.5.13
nbconvert 6.4.5
nbformat 5.2.0
nbmake 1.3.0
nest-asyncio 1.5.4
networkx 2.6.3
nibabel 3.2.1
ninja 1.10.2.2
nltk 3.6.5
nncf 2.1.0.dev0+46424420
notebook 6.4.10
numpy 1.19.5
oauthlib 3.1.1
omegaconf 2.1.1
onnx 1.10.1
onnxoptimizer 0.2.6
onnxruntime 1.9.0
opencv-python 4.5.3.56
openmodelzoo-modelapi 0.0.0
openvino 2022.1.0
openvino-dev 2022.1.0
openvino-telemetry 2022.1.1
ordered-set 4.0.2
ote-cli 0.2 /home/raymond/Documents/training_extensions/ote_cli
OTE-SDK 1.0 /home/raymond/Documents/training_extensions/ote_sdk
packaging 21.0
pandas 1.1.5
pandocfilters 1.5.0
parasail 1.2.4
parso 0.8.3
pexpect 4.8.0
pickleshare 0.7.5
Pillow 8.4.0
pip 21.2.1
pkg_resources 0.0.0
pluggy 1.0.0
progress 1.6
prometheus-client 0.13.1
prompt-toolkit 3.0.28
protobuf 3.14.0
psutil 5.9.0
ptyprocess 0.7.0
py 1.10.0
py-cpuinfo 8.0.0
pyasn1 0.4.8
pyasn1-modules 0.2.8
pyclipper 1.3.0.post2
pycodestyle 2.8.0
pycparser 2.21
pydantic 1.9.0
pydicom 2.2.2
pydot 1.4.2
pyflakes 2.4.0
Pygments 2.11.2
pymongo 3.12.0
pyparsing 2.4.7
pyrsistent 0.18.0
pytest 6.2.5
pytest-ordering 0.6
python-dateutil 2.8.1
pytorchcv 0.0.55
pytz 2021.3
PyWavelets 1.1.1
PyYAML 5.4.1
pyzmq 22.3.0
rawpy 0.16.0
regex 2021.10.21
requests 2.26.0
requests-oauthlib 1.3.0
rsa 4.7.2
sacremoses 0.0.49
scikit-image 0.17.2
scikit-learn 0.24.2
scipy 1.5.4
Send2Trash 1.8.0
sentencepiece 0.1.96
setuptools 61.3.0
Shapely 1.8.0
six 1.15.0
sklearn 0.0
soupsieve 2.3.1
sty 1.0.0b12
tabulate 0.8.9
tensorboard 2.7.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.0
terminado 0.13.3
terminaltables 3.1.0
testfixtures 6.18.3
testpath 0.6.0
texttable 1.6.4
threadpoolctl 3.0.0
tifffile 2021.10.12
tokenizers 0.10.3
toml 0.10.2
torch 1.8.2+cu111
torchvision 0.9.2+cu111
tornado 6.1
tqdm 4.62.3
traitlets 5.1.1
transformers 4.16.2
typing-extensions 3.7.4.3
ubelt 0.10.1
urllib3 1.26.7
wcwidth 0.2.5
webencodings 0.5.1
Werkzeug 2.0.2
wheel 0.37.1
xdoctest 0.15.10
yapf 0.31.0

raymondlo84 commented 2 years ago

I also tried running on the CPU only, and it crashed as well, even with 32 GB of RAM.

github-actions[bot] commented 2 years ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] commented 2 years ago

This issue was closed because it has been stalled for 30 days with no activity.