Open Junyi42 opened 1 year ago
@ashawkey I tried the new version with stable-diffusion 2.0, but I get this error. The previous version was running fine, and I only did a "git pull". What am I doing wrong?
python main.py --text "a hamburger" --workspace trial2 -O
Namespace(text='a hamburger', negative='', O=True, O2=False, test=False, save_mesh=False, eval_interval=10, workspace='trial2', guidance='stable-diffusion', seed=0, iters=10000, lr=0.001, ckpt='latest', cuda_ray=True, max_steps=512, num_steps=64, upsample_steps=32, update_extra_interval=16, max_ray_batch=4096, albedo=False, albedo_iters=1000, uniform_sphere_rate=0.5, bg_radius=1.4, density_thresh=10, fp16=True, backbone='grid', sd_version='2.0', w=64, h=64, jitter_pose=False, bound=1, dt_gamma=0, min_near=0.1, radius_range=[1.0, 1.5], fovy_range=[40, 70], dir_text=True, suppress_face=False, angle_overhead=30, angle_front=60, lambda_entropy=0.0001, lambda_opacity=0, lambda_orient=0.01, lambda_smooth=0, gui=False, W=800, H=800, radius=3, fovy=60, light_theta=60, light_phi=0, max_spp=1)
NeRFNetwork(
(encoder): GridEncoder: input_dim=3 num_levels=16 level_dim=2 resolution=16 -> 2048 per_level_scale=1.3819 params=(903480, 2) gridtype=tiled align_corners=False interpolation=linear
(sigma_net): MLP(
(net): ModuleList(
(0): Linear(in_features=32, out_features=64, bias=True)
(1): Linear(in_features=64, out_features=64, bias=True)
(2): Linear(in_features=64, out_features=4, bias=True)
)
)
(encoder_bg): FreqEncoder: input_dim=3 degree=4 output_dim=27
(bg_net): MLP(
(net): ModuleList(
(0): Linear(in_features=27, out_features=64, bias=True)
(1): Linear(in_features=64, out_features=3, bias=True)
)
)
)
[INFO] try to load hugging face access token from the default place, make sure you have run `huggingface-cli login`.
[INFO] loading stable diffusion...
The config attributes {'dual_cross_attention': False, 'use_linear_projection': True} were passed to UNet2DConditionModel, but are not expected and will be ignored. Please verify your config.json configuration file.
Traceback (most recent call last):
File "C:\Users\SuperUserName\git\stable-dreamfusion\main.py", line 141, in <module>
guidance = StableDiffusion(device, opt.sd_version)
File "C:\Users\SuperUserName\git\stable-dreamfusion\nerf\sd.py", line 47, in __init__
self.unet = UNet2DConditionModel.from_pretrained(model_key, subfolder="unet", use_auth_token=self.token).to(self.device)
File "C:\Users\SuperUserName\anaconda3\lib\site-packages\diffusers\modeling_utils.py", line 412, in from_pretrained
model, unused_kwargs = cls.from_config(
File "C:\Users\SuperUserName\anaconda3\lib\site-packages\diffusers\configuration_utils.py", line 169, in from_config
model = cls(**init_dict)
File "C:\Users\SuperUserName\anaconda3\lib\site-packages\diffusers\configuration_utils.py", line 406, in inner_init
init(self, *args, **init_kwargs)
File "C:\Users\SuperUserName\anaconda3\lib\site-packages\diffusers\models\unet_2d_condition.py", line 135, in __init__
down_block = get_down_block(
File "C:\Users\SuperUserName\anaconda3\lib\site-packages\diffusers\models\unet_blocks.py", line 65, in get_down_block
return CrossAttnDownBlock2D(
File "C:\Users\SuperUserName\anaconda3\lib\site-packages\diffusers\models\unet_blocks.py", line 508, in __init__
out_channels // attn_num_head_channels,
TypeError: unsupported operand type(s) for //: 'int' and 'list'
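A note on the traceback above: the warning that 'use_linear_projection' is "not expected" and the failure on `out_channels // attn_num_head_channels` are both consistent with an installed diffusers that predates the SD 2.0 UNet config. A quick way to check is to compare the installed version against a minimum. The sketch below is only that; the helper names are hypothetical, and the threshold `MIN_DIFFUSERS = "0.9.0"` is my assumption, so please confirm it against the diffusers release notes.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def version_tuple(v):
    """Turn '0.10.2' into (0, 10, 2); suffixes such as 'rc1' or '.dev0' are ignored."""
    parts = []
    for piece in v.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:  # pad '0.9' to (0, 9, 0)
        parts.append(0)
    return tuple(parts)


def is_at_least(installed, minimum):
    """True if the installed version is the same as or newer than the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Assumption: hypothetical threshold, not taken from this thread.
MIN_DIFFUSERS = "0.9.0"

found = installed_version("diffusers")
if found is None:
    print("diffusers is not installed")
elif not is_at_least(found, MIN_DIFFUSERS):
    print(f"diffusers {found} is older than {MIN_DIFFUSERS}; consider upgrading")
else:
    print(f"diffusers {found} looks recent enough")
```

Comparing integer tuples instead of raw strings matters here, because as strings "0.10.0" would sort before "0.9.0".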
I ran "pip install --upgrade diffusers[torch]" inside the Anaconda prompt. Then it complained about a missing tensorboard, which I installed with "pip install tensorboard". Now it returns:
python main.py --text "a hamburger" --workspace trial2 -O
Namespace(text='a hamburger', negative='', O=True, O2=False, test=False, save_mesh=False, eval_interval=10, workspace='trial2', guidance='stable-diffusion', seed=0, iters=10000, lr=0.001, ckpt='latest', cuda_ray=True, max_steps=512, num_steps=64, upsample_steps=32, update_extra_interval=16, max_ray_batch=4096, albedo=False, albedo_iters=1000, uniform_sphere_rate=0.5, bg_radius=1.4, density_thresh=10, fp16=True, backbone='grid', sd_version='2.0', w=64, h=64, jitter_pose=False, bound=1, dt_gamma=0, min_near=0.1, radius_range=[1.0, 1.5], fovy_range=[40, 70], dir_text=True, suppress_face=False, angle_overhead=30, angle_front=60, lambda_entropy=0.0001, lambda_opacity=0, lambda_orient=0.01, lambda_smooth=0, gui=False, W=800, H=800, radius=3, fovy=60, light_theta=60, light_phi=0, max_spp=1)
NeRFNetwork(
(encoder): GridEncoder: input_dim=3 num_levels=16 level_dim=2 resolution=16 -> 2048 per_level_scale=1.3819 params=(903480, 2) gridtype=tiled align_corners=False interpolation=linear
(sigma_net): MLP(
(net): ModuleList(
(0): Linear(in_features=32, out_features=64, bias=True)
(1): Linear(in_features=64, out_features=64, bias=True)
(2): Linear(in_features=64, out_features=4, bias=True)
)
)
(encoder_bg): FreqEncoder: input_dim=3 degree=4 output_dim=27
(bg_net): MLP(
(net): ModuleList(
(0): Linear(in_features=27, out_features=64, bias=True)
(1): Linear(in_features=64, out_features=3, bias=True)
)
)
)
[INFO] try to load hugging face access token from the default place, make sure you have run `huggingface-cli login`.
[INFO] loading stable diffusion...
C:\Users\SuperUserName\anaconda3\lib\site-packages\diffusers\utils\deprecation_utils.py:35: FutureWarning: It is deprecated to pass a pretrained model name or path to `from_config`.If you were trying to load a scheduler, please use <class 'diffusers.schedulers.scheduling_ddim.DDIMScheduler'>.from_pretrained(...) instead. Otherwise, please make sure to pass a configuration dictionary instead. This functionality will be removed in v1.0.0.
warnings.warn(warning + message, FutureWarning)
Downloading: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 308/308 [00:00<00:00, 309kB/s]
C:\Users\SuperUserName\anaconda3\lib\site-packages\huggingface_hub\file_download.py:123: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\SuperUserName\.cache\huggingface\diffusers. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to see activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
warnings.warn(message)
[INFO] loaded stable diffusion!
[INFO] Trainer: df | 2022-12-03_17-13-08 | cuda | fp16 | trial2
[INFO] #parameters: 1815479
[INFO] Loading latest checkpoint ...
[WARN] No checkpoint found, model randomly initialized.
==> Start Training trial2 Epoch 1, lr=0.010000 ...
0% 0/100 [00:00<?, ?it/s]╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ C:\Users\SuperUserName\git\stable-dreamfusion\main.py:160 in <module> │
│ │
│ 157 │ │ │ valid_loader = NeRFDataset(opt, device=device, type='val', H=opt.H, W=opt.W, │
│ 158 │ │ │ │
│ 159 │ │ │ max_epoch = np.ceil(opt.iters / len(train_loader)).astype(np.int32) │
│ ❱ 160 │ │ │ trainer.train(train_loader, valid_loader, max_epoch) │
│ │
│ C:\Users\SuperUserName\git\stable-dreamfusion\nerf\utils.py:486 in train │
│ │
│ 483 │ │ for epoch in range(self.epoch + 1, max_epochs + 1): │
│ 484 │ │ │ self.epoch = epoch │
│ 485 │ │ │ │
│ ❱ 486 │ │ │ self.train_one_epoch(train_loader) │
│ 487 │ │ │ │
│ 488 │ │ │ if self.workspace is not None and self.local_rank == 0: │
│ 489 │ │ │ │ self.save_checkpoint(full=True, best=False) │
│ │
│ C:\Users\SuperUserName\git\stable-dreamfusion\nerf\utils.py:698 in train_one_epoch │
│ │
│ 695 │ │ │ # update grid every 16 steps │
│ 696 │ │ │ if self.model.cuda_ray and self.global_step % self.opt.update_extra_interval │
│ 697 │ │ │ │ with torch.cuda.amp.autocast(enabled=self.fp16): │
│ ❱ 698 │ │ │ │ │ self.model.update_extra_state() │
│ 699 │ │ │ │
│ 700 │ │ │ self.local_step += 1 │
│ 701 │ │ │ self.global_step += 1 │
│ │
│ C:\Users\SuperUserName\anaconda3\lib\site-packages\torch\autograd\grad_mode.py:27 in decorate_context │
│ │
│ 24 │ │ @functools.wraps(func) │
│ 25 │ │ def decorate_context(*args, **kwargs): │
│ 26 │ │ │ with self.clone(): │
│ ❱ 27 │ │ │ │ return func(*args, **kwargs) │
│ 28 │ │ return cast(F, decorate_context) │
│ 29 │ │
│ 30 │ def _wrap_generator(self, func): │
│ │
│ C:\Users\SuperUserName\git\stable-dreamfusion\nerf\renderer.py:625 in update_extra_state │
│ │
│ 622 │ │ │ │ │ │ # add noise in [-hgs, hgs] │
│ 623 │ │ │ │ │ │ cas_xyzs += (torch.rand_like(cas_xyzs) * 2 - 1) * half_grid_size │
│ 624 │ │ │ │ │ │ # query density │
│ ❱ 625 │ │ │ │ │ │ sigmas = self.density(cas_xyzs)['sigma'].reshape(-1).detach() │
│ 626 │ │ │ │ │ │ # assign │
│ 627 │ │ │ │ │ │ tmp_grid[cas, indices] = sigmas │
│ 628 │
│ │
│ C:\Users\SuperUserName\git\stable-dreamfusion\nerf\network_grid.py:150 in density │
│ │
│ 147 │ def density(self, x): │
│ 148 │ │ # x: [N, 3], in [-bound, bound] │
│ 149 │ │ │
│ ❱ 150 │ │ sigma, albedo = self.common_forward(x) │
│ 151 │ │ │
│ 152 │ │ return { │
│ 153 │ │ │ 'sigma': sigma, │
│ │
│ C:\Users\SuperUserName\git\stable-dreamfusion\nerf\network_grid.py:80 in common_forward │
│ │
│ 77 │ │ # x: [N, 3], in [-bound, bound] │
│ 78 │ │ │
│ 79 │ │ # sigma │
│ ❱ 80 │ │ h = self.encoder(x, bound=self.bound) │
│ 81 │ │ │
│ 82 │ │ h = self.sigma_net(h) │
│ 83 │
│ │
│ C:\Users\SuperUserName\anaconda3\lib\site-packages\torch\nn\modules\module.py:1130 in _call_impl │
│ │
│ 1127 │ │ # this function, and just call forward. │
│ 1128 │ │ if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks o │
│ 1129 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1130 │ │ │ return forward_call(*input, **kwargs) │
│ 1131 │ │ # Do not call functions when jit is used │
│ 1132 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1133 │ │ if self._backward_hooks or _global_backward_hooks: │
│ │
│ C:\Users\SuperUserName\git\stable-dreamfusion\gridencoder\grid.py:156 in forward │
│ │
│ 153 │ │ prefix_shape = list(inputs.shape[:-1]) │
│ 154 │ │ inputs = inputs.view(-1, self.input_dim) │
│ 155 │ │ │
│ ❱ 156 │ │ outputs = grid_encode(inputs, self.embeddings, self.offsets, self.per_level_scal │
│ 157 │ │ outputs = outputs.view(prefix_shape + [self.output_dim]) │
│ 158 │ │ │
│ 159 │ │ #print('outputs', outputs.shape, outputs.dtype, outputs.min().item(), outputs.ma │
│ │
│ C:\Users\SuperUserName\anaconda3\lib\site-packages\torch\cuda\amp\autocast_mode.py:110 in decorate_fwd │
│ │
│ 107 │ def decorate_fwd(*args, **kwargs): │
│ 108 │ │ if cast_inputs is None: │
│ 109 │ │ │ args[0]._fwd_used_autocast = torch.is_autocast_enabled() │
│ ❱ 110 │ │ │ return fwd(*args, **kwargs) │
│ 111 │ │ else: │
│ 112 │ │ │ autocast_context = torch.is_autocast_enabled() │
│ 113 │ │ │ args[0]._fwd_used_autocast = False │
│ │
│ C:\Users\SuperUserName\git\stable-dreamfusion\gridencoder\grid.py:54 in forward │
│ │
│ 51 │ │ else: │
│ 52 │ │ │ dy_dx = None │
│ 53 │ │ │
│ ❱ 54 │ │ _backend.grid_encode_forward(inputs, embeddings, offsets, outputs, B, D, C, L, S │
│ 55 │ │ │
│ 56 │ │ # permute back to [B, L * C] │
│ 57 │ │ outputs = outputs.permute(1, 0, 2).reshape(B, L * C) │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: grid_encode_forward(): incompatible function arguments. The following argument types are supported:
1. (arg0: at::Tensor, arg1: at::Tensor, arg2: at::Tensor, arg3: at::Tensor, arg4: int, arg5: int, arg6: int, arg7: int, arg8: float, arg9: int, arg10: Optional[at::Tensor], arg11: int, arg12: bool) -> None
Invoked with: tensor([[0.0062, 0.0011, 0.0017],
[0.0064, 0.0054, 0.0135],
[0.0018, 0.0071, 0.0187],
...,
[0.9993, 0.9997, 0.9817],
[0.9962, 0.9957, 0.9886],
[0.9980, 0.9975, 0.9924]], device='cuda:0'), tensor([[-7.7486e-07, 5.3644e-05],
[-8.2314e-05, -7.3612e-05],
[-3.8505e-05, 2.6822e-05],
...,
[-6.2644e-05, -2.3842e-06],
[-7.7724e-05, -8.1122e-05],
[-1.8597e-05, -7.2241e-05]], device='cuda:0', dtype=torch.float16), tensor([ 0, 4920, 18744, 51512, 117048, 182584, 248120, 313656, 379192,
444728, 510264, 575800, 641336, 706872, 772408, 837944, 903480],
device='cuda:0', dtype=torch.int32), tensor([[[0., 0.],
[0., 0.],
[0., 0.],
...,
[0., 0.],
[0., 0.],
[0., 0.]],
[[0., 0.],
[0., 0.],
[0., 0.],
...,
[0., 0.],
[0., 0.],
[0., 0.]],
[[0., 0.],
[0., 0.],
[0., 0.],
...,
[0., 0.],
[0., 0.],
[0., 0.]],
...,
[[0., 0.],
[0., 0.],
[0., 0.],
...,
[0., 0.],
[0., 0.],
[0., 0.]],
[[0., 0.],
[0., 0.],
[0., 0.],
...,
[0., 0.],
[0., 0.],
[0., 0.]],
[[0., 0.],
[0., 0.],
[0., 0.],
...,
[0., 0.],
[0., 0.],
[0., 0.]]], device='cuda:0', dtype=torch.float16), 2097152, 3, 2, 16, 0.46666666666666684, 16, None, 1, False, 0
0% 0/100 [00:00<?, ?it/s]
@flobotics Hi, you should rebuild gridencoder too: pip install ./gridencoder
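For anyone wondering why a rebuild is the fix: in the traceback above, the compiled `grid_encode_forward` lists 13 supported arguments (arg0 to arg12), while the "Invoked with" dump shows 14 values being passed. That looks like the Python wrapper was updated by the pull while the compiled extension was still the old build. The sketch below just counts the arguments in the signature string copied from the log; `count_pybind_args` is a hypothetical helper, and the value 14 was counted by hand from the log, not measured.

```python
import re


def count_pybind_args(signature):
    """Count the positional arguments named arg0, arg1, ... in a signature string."""
    return len(re.findall(r"\barg\d+:", signature))


# Signature copied from the TypeError message in the log above.
SUPPORTED = (
    "(arg0: at::Tensor, arg1: at::Tensor, arg2: at::Tensor, arg3: at::Tensor, "
    "arg4: int, arg5: int, arg6: int, arg7: int, arg8: float, arg9: int, "
    "arg10: Optional[at::Tensor], arg11: int, arg12: bool) -> None"
)

n_supported = count_pybind_args(SUPPORTED)

# Counted by hand from the "Invoked with:" dump: 4 tensors, then
# 2097152, 3, 2, 16, 0.4666..., 16, None, 1, False, 0.
n_passed = 14

if n_passed != n_supported:
    print(f"wrapper passes {n_passed} args, compiled extension accepts {n_supported}: rebuild needed")
```

If the two counts differ after a pull, reinstalling the extension as suggested above is the first thing to try.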
@ashawkey Thanks, it works.
I don't know yet whether the results are better or faster :) (still interested in cloud-GPU usage :))
Good work!
@Junyi42 Hi, thanks for the effort!
- I'm trying 2.0-base too. Which prompts are you using that generate worse results compared to 1.5?
- I think the submodule should work too, and for 2.0 this is the only choice.
Thanks for the reply!
--albedo, --lambda_entropy 1e-5 were set to avoid empty scenes. I think these settings may affect the results, and I am trying the other backbone too (I'll update once I find something). Thanks again for the wonderful work!
Hey, I was trying the most recent Stable Diffusion v2 and found that only the changes below are needed to make it run well.
Describe alternatives you've considered
In sd.py, from:
change to:
The two points that really confuse me are:
Any help will be greatly appreciated!