[Open] Jordain opened this issue 5 months ago
Hello @Jordain,
I am using an RTX 3060 GPU with 6 GB of VRAM. I was following your tutorial but got the following error:
(three_studio_venv) arghya@arghya-Pulse-GL66-12UEK:~/three_studio/threestudio$ python launch.py --config configs/stable-zero123.yaml --train --gpu 0 system.prompt_processor.prompt="a realistic thor hammer with default texture"
/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/controlnet_aux/mediapipe_face/mediapipe_face_common.py:7: UserWarning: The module 'mediapipe' is not installed. The package will have limited functionality. Please install it using the command: pip install 'mediapipe'
warnings.warn(
/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_5m_224 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_5m_224. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_11m_224 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_11m_224. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_21m_224 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_21m_224. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_21m_384 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_21m_384. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_21m_512 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_21m_512. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
Seed set to 0
[INFO] GPU available: True (cuda), used: True
[INFO] TPU available: False, using: 0 TPU cores
[INFO] IPU available: False, using: 0 IPUs
[INFO] HPU available: False, using: 0 HPUs
[INFO] You are using a CUDA device ('NVIDIA GeForce RTX 3060 Laptop GPU') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
/home/arghya/three_studio/threestudio/threestudio/data/image.py:93: UserWarning: Using torch.cross without specifying the dim arg is deprecated.
Please either pass the dim explicitly or simply use torch.linalg.cross.
The default value of dim will change to agree with that of linalg.cross in a future release. (Triggered internally at ../aten/src/ATen/native/Cross.cpp:63.)
right: Float[Tensor, "1 3"] = F.normalize(torch.cross(lookat, up), dim=-1)
[INFO] single image dataset: load image ./load/images/thorhammer_rgba.png torch.Size([1, 128, 128, 3])
[INFO] single image dataset: load image ./load/images/thorhammer_rgba.png torch.Size([1, 128, 128, 3])
[INFO] LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
[INFO]
| Name | Type | Params
-------------------------------------------------------------
0 | geometry | ImplicitVolume | 12.6 M
1 | material | DiffuseWithPointLightMaterial | 0
2 | background | SolidColorBackground | 0
3 | renderer | NeRFVolumeRenderer | 0
-------------------------------------------------------------
12.6 M Trainable params
0 Non-trainable params
12.6 M Total params
50.450 Total estimated model params size (MB)
[INFO] Validation results will be saved to outputs/zero123-sai/[64, 128, 256]_thorhammer_rgba.png@20240207-192339/save
[INFO] Loading Stable Zero123 ...
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.53 M params.
Keeping EMAs of 688.
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
[INFO] Loaded Stable Zero123!
/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:441: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=19` in the `DataLoader` to improve performance.
/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:441: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=19` in the `DataLoader` to improve performance.
Epoch 0: | | 0/? [00:00<?, ?it/s]Traceback (most recent call last):
File "launch.py", line 301, in <module>
main(args, extras)
File "launch.py", line 244, in main
trainer.fit(system, datamodule=dm, ckpt_path=cfg.resume)
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 544, in fit
call._call_and_handle_interrupt(
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 44, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 580, in _fit_impl
self._run(model, ckpt_path=ckpt_path)
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 989, in _run
results = self._run_stage()
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1035, in _run_stage
self.fit_loop.run()
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/loops/fit_loop.py", line 202, in run
self.advance()
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/loops/fit_loop.py", line 359, in advance
self.epoch_loop.run(self._data_fetcher)
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 136, in run
self.advance(data_fetcher)
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 240, in advance
batch_output = self.automatic_optimization.run(trainer.optimizers[0], batch_idx, kwargs)
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 187, in run
self._optimizer_step(batch_idx, closure)
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 265, in _optimizer_step
call._call_lightning_module_hook(
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 157, in _call_lightning_module_hook
output = fn(*args, **kwargs)
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/core/module.py", line 1291, in optimizer_step
optimizer.step(closure=optimizer_closure)
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/core/optimizer.py", line 151, in step
step_output = self._strategy.optimizer_step(self._optimizer, closure, **kwargs)
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 230, in optimizer_step
return self.precision_plugin.optimizer_step(optimizer, model=model, closure=closure, **kwargs)
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/plugins/precision/precision.py", line 117, in optimizer_step
return optimizer.step(closure=closure, **kwargs)
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/torch/optim/optimizer.py", line 385, in wrapper
out = func(*args, **kwargs)
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/torch/optim/optimizer.py", line 76, in _use_grad
ret = func(self, *args, **kwargs)
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/torch/optim/adam.py", line 146, in step
loss = closure()
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/plugins/precision/precision.py", line 104, in _wrap_closure
closure_result = closure()
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 140, in __call__
self._result = self.closure(*args, **kwargs)
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 126, in closure
step_output = self._step_fn()
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 315, in _training_step
training_step_output = call._call_strategy_hook(trainer, "training_step", *kwargs.values())
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 309, in _call_strategy_hook
output = fn(*args, **kwargs)
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 382, in training_step
return self.lightning_module.training_step(*args, **kwargs)
File "/home/arghya/three_studio/threestudio/threestudio/systems/zero123.py", line 236, in training_step
out = self.training_substep(batch, batch_idx, guidance="zero123")
File "/home/arghya/three_studio/threestudio/threestudio/systems/zero123.py", line 76, in training_substep
out = self(batch)
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/home/arghya/three_studio/threestudio/threestudio/systems/zero123.py", line 32, in forward
render_out = self.renderer(**batch)
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/home/arghya/three_studio/threestudio/threestudio/models/renderers/nerf_volume_renderer.py", line 407, in forward
normal_perturb = self.geometry(
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/home/arghya/three_studio/threestudio/threestudio/models/geometry/implicit_volume.py", line 182, in forward
normal = -torch.autograd.grad(
File "/home/arghya/three_studio/three_studio_venv/lib/python3.8/site-packages/torch/autograd/__init__.py", line 411, in grad
result = Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 52.00 MiB. GPU 0 has a total capacity of 5.80 GiB of which 10.94 MiB is free. Including non-PyTorch memory, this process has 5.77 GiB memory in use. Of the allocated memory 5.43 GiB is allocated by PyTorch, and 13.19 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
Here is my stable-zero123.yaml configuration file:
```yaml
name: "zero123-sai"
tag: "${data.random_camera.height}_${rmspace:${basename:${data.image_path}},_}"
exp_root_dir: "outputs"
seed: 0

data_type: "single-image-datamodule"
data: # threestudio/data/image.py -> SingleImageDataModuleConfig
  image_path: ./load/images/thorhammer_rgba.png
  height: [128, 256, 512]
  width: [128, 256, 512]
  resolution_milestones: [200, 300]
  default_elevation_deg: 5.0
  default_azimuth_deg: 0.0
  default_camera_distance: 3.8
  default_fovy_deg: 20.0
  requires_depth: ${cmaxgt0orcmaxgt0:${system.loss.lambda_depth},${system.loss.lambda_depth_rel}}
  requires_normal: ${cmaxgt0:${system.loss.lambda_normal}}
  random_camera: # threestudio/data/uncond.py -> RandomCameraDataModuleConfig
    height: [64, 128, 256]
    width: [64, 128, 256]
    # batch_size: [12, 8, 4] ~ 5 hours
    # batch_size: [8, 4, 2] ~ 50 mins
    # batch_size: [4, 2, 1] ~ 5 mins
    # batch_size: [1, 1, 1] ~ 3 mins
    batch_size: [1, 1, 1]
    resolution_milestones: [200, 300]
    eval_height: 512
    eval_width: 512
    eval_batch_size: 1
    elevation_range: [-10, 80]
    azimuth_range: [-180, 180]
    camera_distance_range: [3.8, 3.8]
    fovy_range: [20.0, 20.0] # Zero123 has fixed fovy
    progressive_until: 0
    camera_perturb: 0.0
    center_perturb: 0.0
    up_perturb: 0.0
    light_position_perturb: 1.0
    light_distance_range: [7.5, 10.0]
    eval_elevation_deg: ${data.default_elevation_deg}
    eval_camera_distance: ${data.default_camera_distance}
    eval_fovy_deg: ${data.default_fovy_deg}
    light_sample_strategy: "dreamfusion"
    batch_uniform_azimuth: False
    n_val_views: 30
    n_test_views: 120

system_type: "zero123-system"
system:
  geometry_type: "implicit-volume"
  geometry:
    radius: 2.0
    normal_type: "analytic"
    # use Magic3D density initialization instead
    density_bias: "blob_magic3d"
    density_activation: softplus
    density_blob_scale: 10.
    density_blob_std: 0.5
    # coarse to fine hash grid encoding
    # to ensure smooth analytic normals
    pos_encoding_config:
      otype: HashGrid
      n_levels: 16
      n_features_per_level: 2
      log2_hashmap_size: 19
      base_resolution: 16
      per_level_scale: 1.447269237440378 # max resolution 4096
    mlp_network_config:
      otype: "VanillaMLP"
      activation: "ReLU"
      output_activation: "none"
      n_neurons: 64
      n_hidden_layers: 2
  material_type: "diffuse-with-point-light-material"
  material:
    ambient_only_steps: 100000
    textureless_prob: 0.05
    albedo_activation: sigmoid
  background_type: "solid-color-background" # unused
  renderer_type: "nerf-volume-renderer"
  renderer:
    radius: ${system.geometry.radius}
    num_samples_per_ray: 512
    return_comp_normal: ${cmaxgt0:${system.loss.lambda_normal_smooth}}
    return_normal_perturb: ${cmaxgt0:${system.loss.lambda_3d_normal_smooth}}
  prompt_processor_type: "dummy-prompt-processor" # Zero123 doesn't use prompts
  prompt_processor:
    pretrained_model_name_or_path: ""
    prompt: ""
  guidance_type: "stable-zero123-guidance"
  guidance:
    pretrained_config: "./load/zero123/sd-objaverse-finetune-c_concat-256.yaml"
    pretrained_model_name_or_path: "./load/zero123/stable_zero123.ckpt"
    vram_O: ${not:${gt0:${system.freq.guidance_eval}}}
    cond_image_path: ${data.image_path}
    cond_elevation_deg: ${data.default_elevation_deg}
    cond_azimuth_deg: ${data.default_azimuth_deg}
    cond_camera_distance: ${data.default_camera_distance}
    guidance_scale: 3.0
    min_step_percent: [50, 0.7, 0.3, 200] # (start_iter, start_val, end_val, end_iter)
    max_step_percent: [50, 0.98, 0.8, 200]
  freq:
    ref_only_steps: 0
    guidance_eval: 0
  loggers:
    wandb:
      enable: false
      project: "threestudio"
      name: None
  loss:
    lambda_sds: 0.1
    lambda_rgb: [100, 500., 1000., 400]
    lambda_mask: 50.
    lambda_depth: 0. # 0.05
    lambda_depth_rel: 0. # [0, 0, 0.05, 100]
    lambda_normal: 0. # [0, 0, 0.05, 100]
    lambda_normal_smooth: [100, 7.0, 5.0, 150, 10.0, 200]
    lambda_3d_normal_smooth: [100, 7.0, 5.0, 150, 10.0, 200]
    lambda_orient: 1.0
    lambda_sparsity: 0.5 # should be tweaked for every model
    lambda_opaque: 0.5
  optimizer:
    name: Adam
    args:
      lr: 0.01
      betas: [0.9, 0.99]
      eps: 1.e-8

trainer:
  max_steps: 600
  log_every_n_steps: 1
  num_sanity_val_steps: 0
  val_check_interval: 100
  enable_progress_bar: true
  precision: 32

checkpoint:
  save_last: true # save at each validation time
  save_top_k: -1
  every_n_train_steps: 100 # ${trainer.max_steps}
```
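One aside on the config above: the `per_level_scale: 1.447269237440378 # max resolution 4096` line can be sanity-checked. Assuming the usual Instant-NGP convention (which the value appears to follow), the scale is the geometric ratio between successive hash-grid level resolutions:

```python
import math

# Hash-grid level resolutions grow geometrically:
#   res_l = base_resolution * per_level_scale ** l,  for l = 0 .. n_levels - 1
# Solving for the scale that reaches max_resolution at the last level:
base_resolution = 16
n_levels = 16
max_resolution = 4096  # per the comment in the config

per_level_scale = math.exp(
    math.log(max_resolution / base_resolution) / (n_levels - 1)
)
print(per_level_scale)  # ~1.447269237..., matching the value in the config
```

So the comment and the constant are consistent with each other; lowering `max_resolution` (and recomputing the scale) is one way to shrink the grid if memory is tight.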
No batch size works. The error above is from a batch size of [1, 1, 1]; I have tried the other batch sizes as well, but the result is the same. What should I do now?
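For reference, the settings in the config above that dominate VRAM use are the render resolutions and the samples per ray. A hedged sketch of lower-memory values (illustrative only, not verified on a 6 GB card, and quality will suffer):

```yaml
# Illustrative lower-VRAM variants of keys from the config above:
data:
  random_camera:
    height: [32, 64, 128]    # down from [64, 128, 256]
    width: [32, 64, 128]
system:
  renderer:
    num_samples_per_ray: 256 # down from 512
```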
Also, one thing I didn't understand: what is the difference between the command shown in your video and the default command in this repo?
python launch.py --config configs/dreamfusion-sd.yaml --train --gpu 0 system.prompt_processor.prompt="a zoomed out DSLR photo of a baby bunny sitting on top of a stack of pancakes"
I am a little new to this field and just want to understand the difference. Thanks in advance.
BTW, the default command in this repo works for me, but for some reason the one above from your tutorial does not.
You might not have enough VRAM to get it to run. Even with 24 GB of VRAM it was using a lot. Some people were able to get it running with 12 GB, but I haven't heard of anyone running it with 6 GB.
https://www.youtube.com/watch?v=jaRr5W80N8E
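Two low-effort things worth trying before giving up: the allocator hint printed in the OOM message itself, and command-line config overrides (launch.py accepts dotted overrides, the same syntax as the `system.prompt_processor.prompt=...` override above). The override values below are illustrative, not settings verified to fit in 6 GB:

```shell
# Allocator hint straight from the OOM message; it reduces fragmentation but
# does not add VRAM, so a 6 GB card may still run out of memory.
export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
echo "PYTORCH_CUDA_ALLOC_CONF=$PYTORCH_CUDA_ALLOC_CONF"

# Dotted overrides cut memory-heavy settings without editing the YAML.
# Run this from inside the threestudio checkout:
if [ -f launch.py ]; then
  python launch.py --config configs/stable-zero123.yaml --train --gpu 0 \
    system.renderer.num_samples_per_ray=256 \
    system.prompt_processor.prompt="a realistic thor hammer with default texture"
fi
```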