DSaurus / threestudio-dreamcraft3D

Error at Stage 1: **Predictions and targets are expected to have the same shape** #2

Open altava-sgp opened 6 months ago

altava-sgp commented 6 months ago

I'm using the Docker version of threestudio on an RTX 4090 (24 GB VRAM).

This is the error log.

$ python launch.py --config custom/threestudio-dreamcraft3D/configs/dreamcraft3d-coarse-nerf.yaml --train system.prompt_processor.prompt="a denim jacket with a vintage feel" data.image_path="load/images/denim-jacket-rgba.png"

Import times for custom modules:
   0.1 seconds: custom/threestudio-mvdream
   0.2 seconds: custom/threestudio-dreamcraft3D

Global seed set to 0
[INFO] Loading Deep Floyd ...

A mixture of fp16 and non-fp16 filenames will be loaded.
Loaded fp16 filenames:
[text_encoder/model.fp16-00001-of-00002.safetensors, safety_checker/model.fp16.safetensors, unet/diffusion_pytorch_model.fp16.safetensors, text_encoder/model.fp16-00002-of-00002.safetensors]
Loaded non-fp16 filenames:
[watermarker/diffusion_pytorch_model.safetensors
If this behavior is not expected, please check your folder structure.
Loading pipeline components...: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 12.90it/s]
[INFO] Loaded Deep Floyd!
[INFO] Loading Stable Zero123 ...
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.53 M params.
Keeping EMAs of 688.
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
[INFO] Loaded Stable Zero123!
[INFO] Using prompt [a denim jacket with a vintage feel] and negative prompt []
[INFO] Using view-dependent prompts [side]:[a denim jacket with a vintage feel, side view] [front]:[a denim jacket with a vintage feel, front view] [back]:[a denim jacket with a vintage feel, back view] [overhead]:[a denim jacket with a vintage feel, overhead view]
/home/dreamer/.local/lib/python3.10/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
/home/dreamer/.local/lib/python3.10/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=VGG16_Weights.IMAGENET1K_V1`. You can also use `weights=VGG16_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
Downloading: "https://download.pytorch.org/models/vgg16-397923af.pth" to /home/dreamer/.cache/torch/hub/checkpoints/vgg16-397923af.pth
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 528M/528M [00:46<00:00, 11.9MB/s]
Downloading vgg_lpips model from https://heibox.uni-heidelberg.de/f/607503859c864bc1b30b/?dl=1 to threestudio/utils/lpips/vgg.pth
8.19kB [00:00, 107kB/s]                                                                                                                                                                                                                     
loaded pretrained LPIPS loss from threestudio/utils/lpips/vgg.pth
[INFO] ModelCheckpoint(save_last=True, save_top_k=-1, monitor=None) will duplicate the last checkpoint saved.
[INFO] Using 16bit Automatic Mixed Precision (AMP)
[INFO] GPU available: True (cuda), used: True
[INFO] TPU available: False, using: 0 TPU cores
[INFO] IPU available: False, using: 0 IPUs
[INFO] HPU available: False, using: 0 HPUs
[INFO] You are using a CUDA device ('NVIDIA GeForce RTX 4090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
[INFO] single image dataset: load image load/images/denim-jacket-rgba.png torch.Size([1, 128, 128, 3])
[INFO] single image dataset: load depth load/images/denim-jacket-rgba.png torch.Size([1, 128, 128, 4])
[INFO] single image dataset: load image load/images/denim-jacket-rgba.png torch.Size([1, 128, 128, 3])
[INFO] single image dataset: load depth load/images/denim-jacket-rgba.png torch.Size([1, 128, 128, 4])
[INFO] LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
[INFO] 
  | Name       | Type                 | Params
----------------------------------------------------
0 | geometry   | ImplicitVolume       | 12.6 M
1 | material   | NoMaterial           | 0     
2 | background | SolidColorBackground | 0     
3 | renderer   | NeRFVolumeRenderer   | 0     
----------------------------------------------------
12.6 M    Trainable params
0         Non-trainable params
12.6 M    Total params
50.417    Total estimated model params size (MB)
[INFO] Validation results will be saved to outputs/dreamcraft3d-coarse-nerf/a_denim_jacket_with_a_vintage_feel@20231219-063523/save
/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:442: PossibleUserWarning: The dataloader, train_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 20 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  rank_zero_warn(
/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:442: PossibleUserWarning: The dataloader, val_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 20 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  rank_zero_warn(
Epoch 0: : 0it [00:00, ?it/s]Traceback (most recent call last):
  File "/home/dreamer/threestudio/launch.py", line 301, in <module>
    main(args, extras)
  File "/home/dreamer/threestudio/launch.py", line 244, in main
    trainer.fit(system, datamodule=dm, ckpt_path=cfg.resume)
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 532, in fit
    call._call_and_handle_interrupt(
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 43, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 571, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 980, in _run
    results = self._run_stage()
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1023, in _run_stage
    self.fit_loop.run()
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 202, in run
    self.advance()
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 355, in advance
    self.epoch_loop.run(self._data_fetcher)
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 133, in run
    self.advance(data_fetcher)
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 219, in advance
    batch_output = self.automatic_optimization.run(trainer.optimizers[0], kwargs)
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 188, in run
    self._optimizer_step(kwargs.get("batch_idx", 0), closure)
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 266, in _optimizer_step
    call._call_lightning_module_hook(
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 146, in _call_lightning_module_hook
    output = fn(*args, **kwargs)
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/core/module.py", line 1270, in optimizer_step
    optimizer.step(closure=optimizer_closure)
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/core/optimizer.py", line 161, in step
    step_output = self._strategy.optimizer_step(self._optimizer, closure, **kwargs)
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/strategies/strategy.py", line 231, in optimizer_step
    return self.precision_plugin.optimizer_step(optimizer, model=model, closure=closure, **kwargs)
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/plugins/precision/amp.py", line 76, in optimizer_step
    closure_result = closure()
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 142, in __call__
    self._result = self.closure(*args, **kwargs)
  File "/home/dreamer/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 128, in closure
    step_output = self._step_fn()
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 315, in _training_step
    training_step_output = call._call_strategy_hook(trainer, "training_step", *kwargs.values())
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 294, in _call_strategy_hook
    output = fn(*args, **kwargs)
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/strategies/strategy.py", line 380, in training_step
    return self.model.training_step(*args, **kwargs)
  File "/home/dreamer/threestudio/custom/threestudio-dreamcraft3D/system/dreamcraft3d.py", line 426, in training_step
    out = self.training_substep(
  File "/home/dreamer/threestudio/custom/threestudio-dreamcraft3D/system/dreamcraft3d.py", line 192, in training_substep
    "depth_rel", 1 - self.pearson(valid_pred_depth, valid_gt_depth)
  File "/home/dreamer/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/dreamer/.local/lib/python3.10/site-packages/torchmetrics/metric.py", line 296, in forward
    self._forward_cache = self._forward_full_state_update(*args, **kwargs)
  File "/home/dreamer/.local/lib/python3.10/site-packages/torchmetrics/metric.py", line 311, in _forward_full_state_update
    self.update(*args, **kwargs)
  File "/home/dreamer/.local/lib/python3.10/site-packages/torchmetrics/metric.py", line 467, in wrapped_func
    raise err
  File "/home/dreamer/.local/lib/python3.10/site-packages/torchmetrics/metric.py", line 457, in wrapped_func
    update(*args, **kwargs)
  File "/home/dreamer/.local/lib/python3.10/site-packages/torchmetrics/regression/pearson.py", line 146, in update
    self.mean_x, self.mean_y, self.var_x, self.var_y, self.corr_xy, self.n_total = _pearson_corrcoef_update(
  File "/home/dreamer/.local/lib/python3.10/site-packages/torchmetrics/functional/regression/pearson.py", line 53, in _pearson_corrcoef_update
    _check_same_shape(preds, target)
  File "/home/dreamer/.local/lib/python3.10/site-packages/torchmetrics/utilities/checks.py", line 42, in _check_same_shape
    raise RuntimeError(
RuntimeError: Predictions and targets are expected to have the same shape, but got torch.Size([6322]) and torch.Size([6322, 4]).

What can I do?

DSaurus commented 6 months ago

Hi @altava-sgp ,

It seems that you don't have the corresponding depth and normal maps. I have updated the preprocessing script so that it generates them.
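
A small sketch of why a missing depth map produces this particular error, using plain Python lists in place of tensors. The log shows the "depth" being loaded from the RGBA image itself with 4 channels, so each masked target pixel carries 4 values while each predicted depth is a single value. The helper names `shape`, `pearson`, and `depth_rel_loss` are hypothetical, and this is not the repository's code:

```python
from math import sqrt


def shape(values):
    """Return (N,) for a flat list or (N, C) for a list of equal-length rows."""
    if values and isinstance(values[0], (list, tuple)):
        return (len(values), len(values[0]))
    return (len(values),)


def pearson(x, y):
    """Pearson correlation coefficient of two flat, equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def depth_rel_loss(pred_depth, gt_depth):
    """1 - Pearson correlation, in the spirit of the 'depth_rel' term in the traceback."""
    if shape(pred_depth) != shape(gt_depth):
        raise ValueError(
            f"expected matching shapes, got {shape(pred_depth)} and {shape(gt_depth)}; "
            "the target looks like a multi-channel image rather than a depth map"
        )
    return 1.0 - pearson(pred_depth, gt_depth)


pred = [0.2, 0.5, 0.9]                      # one depth value per masked pixel
rgba_as_depth = [[0.1, 0.2, 0.3, 1.0]] * 3  # RGBA pixels loaded by mistake
real_depth = [0.4, 1.0, 1.8]                # values from a single-channel depth map
```

Calling `depth_rel_loss(pred, rgba_as_depth)` fails with a shape message analogous to the one in the traceback, while `depth_rel_loss(pred, real_depth)` returns a value close to zero because the two sequences are perfectly correlated.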

altava-sgp commented 6 months ago

We need to install this package: https://github.com/nadermx/backgroundremover

altava-sgp commented 6 months ago

This is the success log.

$ python image_preprocess.py "examples/denim-jacket.png" --size 1024 --border_ratio 0.0 --need_caption --recenter
usage: image_preprocess.py [-h] [--size SIZE] [--border_ratio BORDER_RATIO] [--recenter RECENTER] [--dont_recenter] [--need_caption] path
image_preprocess.py: error: argument --recenter: expected one argument
$ python image_preprocess.py "examples/denim-jacket.png" --size 1024 --border_ratio 0.0 --need_caption           
[INFO] loading image...
[INFO] background removal...
examples/denim-jacket_rgba.png
downloading model [u2net] to /home/dreamer/.u2net/u2net.pth ...
downloading part 1 of u2net
finished downloading part 1 of u2net
downloading part 2 of u2net
finished downloading part 2 of u2net
downloading part 3 of u2net
finished downloading part 3 of u2net
downloading part 4 of u2net
finished downloading part 4 of u2net
[INFO] depth estimation...
Downloading (…)ta_dpt_depth_v2.ckpt: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.95G/1.95G [02:53<00:00, 11.2MB/s]
/home/dreamer/.local/lib/python3.10/site-packages/timm/models/_factory.py:114: UserWarning: Mapping deprecated model name vit_base_resnet50_384 to current vit_base_r50_s16_384.orig_in21k_ft_in1k.
  model = create_fn(
Downloading model.safetensors: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 396M/396M [00:33<00:00, 11.7MB/s]
[INFO] normal estimation...
Downloading (…)a_dpt_normal_v2.ckpt: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.95G/1.95G [02:56<00:00, 11.0MB/s]
/home/dreamer/.local/lib/python3.10/site-packages/timm/models/_factory.py:114: UserWarning: Mapping deprecated model name vit_base_resnet50_384 to current vit_base_r50_s16_384.orig_in21k_ft_in1k.
  model = create_fn(
[INFO] recenter...
[INFO] captioning...
Downloading (…)rocessor_config.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 432/432 [00:00<00:00, 1.30MB/s]
Downloading tokenizer_config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 904/904 [00:00<00:00, 3.03MB/s]
Downloading vocab.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 798k/798k [00:00<00:00, 1.33MB/s]
Downloading merges.txt: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 456k/456k [00:00<00:00, 769kB/s]
Downloading tokenizer.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2.11M/2.11M [00:00<00:00, 2.12MB/s]
Downloading (…)cial_tokens_map.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 548/548 [00:00<00:00, 2.28MB/s]
Downloading config.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6.96k/6.96k [00:00<00:00, 18.9MB/s]
Downloading (…)model.bin.index.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 122k/122k [00:00<00:00, 9.50MB/s]
Downloading (…)l-00001-of-00002.bin: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10.0G/10.0G [14:42<00:00, 11.3MB/s]
Downloading (…)l-00002-of-00002.bin: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5.50G/5.50G [08:06<00:00, 11.3MB/s]
Downloading shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [22:49<00:00, 684.79s/it]
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:10<00:00,  5.24s/it]
a denim jacket with buttons and a brown and blue color scheme

DSaurus commented 6 months ago

Okay, I have added backgroundremover to the installation section.

knicholes commented 4 months ago

You probably used your .png instead of the post-processed _rgba.png.
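
One way to catch this before launching training is to check the file name and its companion files. A minimal sketch, assuming the preprocessing script writes its outputs next to the input with `_rgba.png`, `_depth.png`, and `_normal.png` suffixes. Only `_rgba.png` appears in the log above, so the other two suffixes and both helper names are assumptions:

```python
from pathlib import Path

RGBA_SUFFIX = "_rgba.png"  # suffix seen in the preprocessing log


def expected_maps(image_path: str) -> dict:
    """Return the depth/normal paths assumed to accompany an RGBA image.

    The "_depth.png" and "_normal.png" suffixes are assumptions,
    not confirmed by the thread.
    """
    if not image_path.endswith(RGBA_SUFFIX):
        raise ValueError(
            f"{image_path!r} does not end with {RGBA_SUFFIX!r}; "
            "run the preprocessing script and pass its output instead"
        )
    stem = image_path[: -len(RGBA_SUFFIX)]
    return {"depth": stem + "_depth.png", "normal": stem + "_normal.png"}


def missing_maps(image_path: str) -> list:
    """List the assumed companion maps that do not exist on disk."""
    return [
        path
        for path in expected_maps(image_path).values()
        if not Path(path).is_file()
    ]
```

Note that the path in the failing command, `load/images/denim-jacket-rgba.png`, ends in `-rgba.png` rather than `_rgba.png`, so `expected_maps` rejects it, whereas `examples/denim-jacket_rgba.png` from the success log passes the name check.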