wpl427 opened 10 months ago
Or, how can I reduce video memory usage by modifying the configuration?
You can try loading the CLIP model in half precision by passing precision='fp16' to open_clip.create_model_and_transforms in train_t2i_custom_v2.py.
before:
promptmodel, _, _ = open_clip.create_model_and_transforms('ViT-bigG-14', 'laion2b_s39b_b160k')
after:
promptmodel, _, _ = open_clip.create_model_and_transforms('ViT-bigG-14', 'laion2b_s39b_b160k', precision='fp16')
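For a rough sense of how much fp16 helps: it halves the bytes per parameter for the model weights. A back-of-the-envelope sketch (the ~2.5B parameter count for ViT-bigG-14 is an approximation, and this ignores activations, gradients, and optimizer state):

```python
# Rough estimate of weight memory for a large model such as ViT-bigG-14.
# fp32 uses 4 bytes per parameter, fp16 uses 2 bytes per parameter.
def weight_memory_gib(num_params: int, bytes_per_param: int) -> float:
    """Return the memory needed for the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3

NUM_PARAMS = int(2.5e9)  # assumed parameter count for ViT-bigG-14

fp32_gib = weight_memory_gib(NUM_PARAMS, 4)
fp16_gib = weight_memory_gib(NUM_PARAMS, 2)
print(f"fp32: {fp32_gib:.1f} GiB, fp16: {fp16_gib:.1f} GiB")
```

On a ~15 GB card like the one in the traceback below (14.61 GiB total), the fp32 weights of a model this size leave very little headroom for anything else, which is why half precision can make the difference.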
error:
(faceswap) [root@prod-emr-gpu01 StyleDrop-PyTorch]# accelerate launch --num_processes 8 --mixed_precision fp16 train_t2i_custom_v2.py --config=configs/custom.py
2023-08-21 13:16:44.463755: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-08-21 13:16:45.329695: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
The following values were not passed to accelerate launch and had defaults used instead:
        --num_machines was set to a value of 1
        --num_cpu_threads_per_process was set to 8 to improve out-of-box performance
To avoid this warning pass in values for each of the problematic parameters or run accelerate config.
2023-08-21 13:16:49.788896: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2023-08-21 13:16:52.551 | INFO | main:train:63 - Process 0 using device: cuda
I0821 13:16:52.551910 140480274581312 factory.py:158] Loaded ViT-bigG-14 model config.
2023-08-21 13:16:52.578 | DEBUG | open_clip.transformer:init:314 - xattn in transformer of CLIP is True
2023-08-21 13:17:09.847 | DEBUG | open_clip.transformer:init:314 - xattn in transformer of CLIP is True
I0821 13:17:20.080452 140480274581312 factory.py:206] Loading pretrained ViT-bigG-14 weights (laion2b_s39b_b160k).
Traceback (most recent call last):
File "/data/miniconda3/envs/faceswap/bin/accelerate", line 8, in <module>
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 22.00 MiB (GPU 0; 14.61 GiB total capacity; 13.30 GiB already allocated; 9.19 MiB free; 13.78 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
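The error message itself also suggests tuning the caching allocator via PYTORCH_CUDA_ALLOC_CONF. A minimal sketch of setting it from Python before CUDA is initialized (the 128 MiB value is just an example, not a recommended setting):

```python
import os

# Must be set before the first CUDA allocation, so ideally before importing torch.
# max_split_size_mb caps the size of blocks the caching allocator will split,
# which can reduce fragmentation when reserved memory is much larger than
# allocated memory, as in the OOM message above.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

Equivalently, you can prefix the launch command with the environment variable, e.g. PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128 accelerate launch ...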
What is the minimum amount of video memory required?