threestudio-project / threestudio

A unified framework for 3D content generation.
Apache License 2.0

zero123 doesn't work #275

Open YJ-142150 opened 1 year ago

YJ-142150 commented 1 year ago

Thanks for the great work! I ran python launch.py --config configs/zero123.yaml --train --gpu 0 data.image_path=./load/images/dog1_rgba.png and got the error below. How can I fix it?

bin /home/lambdasix/anaconda3/envs/threestudio/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda117.so
/home/lambdasix/anaconda3/envs/threestudio/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:145: UserWarning: /home/lambdasix/anaconda3/envs/threestudio did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
  warn(msg)
CUDA SETUP: CUDA runtime path found: /usr/local/cuda-11.7/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 117
CUDA SETUP: Loading binary /home/lambdasix/anaconda3/envs/threestudio/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda117.so...
Traceback (most recent call last):
  File "/home/lambdasix/threestudio/launch.py", line 237, in <module>
    main(args, extras)
  File "/home/lambdasix/threestudio/launch.py", line 61, in main
    import pytorch_lightning as pl
  File "/home/lambdasix/anaconda3/envs/threestudio/lib/python3.10/site-packages/pytorch_lightning/__init__.py", line 20, in <module>
    from pytorch_lightning import metrics  # noqa: E402
  File "/home/lambdasix/anaconda3/envs/threestudio/lib/python3.10/site-packages/pytorch_lightning/metrics/__init__.py", line 15, in <module>
    from pytorch_lightning.metrics.classification import (  # noqa: F401
  File "/home/lambdasix/anaconda3/envs/threestudio/lib/python3.10/site-packages/pytorch_lightning/metrics/classification/__init__.py", line 14, in <module>
    from pytorch_lightning.metrics.classification.accuracy import Accuracy  # noqa: F401
  File "/home/lambdasix/anaconda3/envs/threestudio/lib/python3.10/site-packages/pytorch_lightning/metrics/classification/accuracy.py", line 18, in <module>
    from pytorch_lightning.metrics.utils import deprecated_metrics, void
  File "/home/lambdasix/anaconda3/envs/threestudio/lib/python3.10/site-packages/pytorch_lightning/metrics/utils.py", line 22, in <module>
    from torchmetrics.utilities.data import get_num_classes as _get_num_classes
ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data' (/home/lambdasix/anaconda3/envs/threestudio/lib/python3.10/site-packages/torchmetrics/utilities/data.py)

bennyguo commented 1 year ago

Which torchmetrics version are you using? Could you please try installing torchmetrics==0.11.4?
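To answer the version question, here is a minimal sketch that prints the installed versions and checks whether the import from the traceback succeeds. The helper names `installed_version` and `has_get_num_classes` are made up for illustration and are not part of threestudio.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def has_get_num_classes() -> bool:
    """Check whether the import that fails in the traceback works in this environment."""
    try:
        from torchmetrics.utilities.data import get_num_classes  # noqa: F401
    except ImportError:
        # Covers both a missing torchmetrics and a version without this helper.
        return False
    return True


if __name__ == "__main__":
    for name in ("torchmetrics", "pytorch-lightning", "torch"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
    print("get_num_classes importable:", has_get_num_classes())
```

If the last line prints False while torchmetrics is installed, the installed torchmetrics and pytorch_lightning versions are likely mismatched, which is what the ImportError above suggests.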

YJ-142150 commented 1 year ago

I just ran pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118 and now it works. However, I now get a CUDA OOM error; it seems to fall short by about 300 MB. I tried system.cleanup_after_validation_step=true and system.cleanup_after_test_step=true, but they don't seem to reduce memory usage. How much VRAM does cleanup_after_test_step=true save?
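For anyone who wants to quantify the shortfall, a rough sketch of how free and total VRAM can be read through PyTorch before launching a run. The function name `gpu_memory_mib` is hypothetical. It returns None when PyTorch or a CUDA device is unavailable, and it does not measure what threestudio itself allocates.

```python
from typing import Optional, Tuple


def gpu_memory_mib(device: int = 0) -> Optional[Tuple[float, float]]:
    """Return (free, total) memory in MiB for a CUDA device, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    # mem_get_info reports free and total device memory in bytes.
    free_bytes, total_bytes = torch.cuda.mem_get_info(device)
    return free_bytes / 2**20, total_bytes / 2**20


if __name__ == "__main__":
    info = gpu_memory_mib()
    if info is None:
        print("No CUDA device visible to PyTorch.")
    else:
        free, total = info
        print(f"free: {free:.0f} MiB / total: {total:.0f} MiB")
```

Comparing the free value with other processes stopped against the size in the OOM message shows whether the gap really is on the order of a few hundred MB.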

sidsunny commented 1 year ago

Hi! I am also facing a CUDA OOM error. Could you please share how much VRAM is required to run Zero-1-to-3? I am using a 24 GB A5000 card.