Closed. SolbiatiAlessandro closed this issue 1 year ago.
One possible solution that works for me is to use 4x RTX A6000 ($3 an hour). The setup succeeded there and fine-tuning started without issues.
It might also be possible to use a single RTX A6000 without running out of memory by choosing one of these other docker images:
- pytorch:latest
f5540ef1a1398b8499546edb53dae704
PyTorch is a deep learning framework that puts Python first.
- nvidia-glx-desktop:latest
f10187106abdbf2eeef2c4d8347aa56f
Ubuntu X desktop streaming using WebRTC and NVENC GPU video compression. Supports Vulkan/OpenGL for GPU rendering. Default username: user password: mypasswd
- stable-diffusion:web-automatic-2.1.16
b41a1cd115aeaa64f26ac806ab654d01
Stable Diffusion with Automatic1111 WebUI, Jupyter (for file browser & transfer), and SSH.
- Whisper ASR Webservice
e795f6239ba0236393d61d892c3f4152
GPU version of the whisper ASR webservice for podcast and video transcription.
- Bittensor 3.7.0 with cubit
325f5bb932cd700e11d7913fe32fad51
Uses the Bittensor 3.7.0 docker image and installs cubit in the onstart script. Once complete, the instance will be ready to run Bittensor on the finney network.
- tensorflow:latest-gpu
79a8d3bee306ada066bb42cb3bdef852
Official docker images for deep learning framework TensorFlow (http://www.tensorflow.org)
- cuda:12.0.1-runtime-ubuntu20.04
e64e8c759efb02fb5e156600354f4c96
CUDA and cuDNN images from gitlab.com/nvidia/cuda
Hi, I haven't tested with images other than the simple default CUDA image in the current setup. Using the PyTorch image might be the reason the disk runs out of space. Furthermore, the PyTorch image isn't necessary, since the environment already provides PyTorch. I think I have tested with cuda:12.0.1-runtime-ubuntu20.04.
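Since a full disk on the instance can surface as an "out of memory" failure when pulling a large docker image, it is worth ruling that out first. A minimal check using only the Python standard library (the `/` mount point is illustrative; on vast.ai the relevant volume may be mounted elsewhere):

```python
import shutil

# A full disk can masquerade as an "out of memory" error when pulling
# a large docker image such as pytorch:latest.
total, used, free = shutil.disk_usage("/")
print(f"free disk: {free / 2**30:.1f} GiB of {total / 2**30:.1f} GiB")
```

If free space is only a few GiB, a smaller base image like cuda:12.0.1-runtime-ubuntu20.04 (or a larger disk allocation when renting the instance) is the likely fix.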
Trying, as suggested in README.md, to run this on "1 RTX A6000" with docker image
pytorch:latest f5540ef1a1398b8499546edb53dae704
from https://cloud.vast.ai/. Returns an out-of-memory error.
Debugging memory