Error only when running with DeepFloyd-IF guidance

Description

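Hi, thank you for this excellent software. I have stable-dreamfusion working, except when I run it with DeepFloyd-IF guidance (--IF). It starts up and loads the IF-I-XL model, but then fails while computing the text embeddings with "RuntimeError: expected scalar type Float but found Half". It is probably a problem with my environment, but I'm not sure how to fix it.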

Thank you in advance,

root@d8ac8ab95a3b:/workspace# python main.py --text "a hamburger" --workspace trial -O --IF
NOTE! Installing ujson may make loading annotations faster.
Namespace(file=None, text='a hamburger', negative='', O=True, O2=False, test=False, six_views=False, eval_interval=1, test_interval=100, workspace='trial', seed=None, image=None, image_config=None, known_view_interval=4, IF=True, guidance=['IF'], guidance_scale=100, save_mesh=False, mcubes_resolution=256, decimate_target=50000.0, dmtet=False, tet_grid_size=128, init_with='', lock_geo=False, perpneg=False, negative_w=-2, front_decay_factor=2, side_decay_factor=10, iters=10000, lr=0.001, ckpt='latest', cuda_ray=True, taichi_ray=False, max_steps=1024, num_steps=64, upsample_steps=32, update_extra_interval=16, max_ray_batch=4096, latent_iter_ratio=0, albedo_iter_ratio=0, min_ambient_ratio=0.1, textureless_ratio=0.2, jitter_pose=False, jitter_center=0.2, jitter_target=0.2, jitter_up=0.02, uniform_sphere_rate=0, grad_clip=-1, grad_clip_rgb=-1, bg_radius=1.4, density_activation='exp', density_thresh=10, blob_density=5, blob_radius=0.2, backbone='grid', optim='adan', sd_version='2.1', hf_key=None, fp16=True, vram_O=False, w=64, h=64, known_view_scale=1.5, known_view_noise_scale=0.002, dmtet_reso_scale=8, batch_size=1, bound=1, dt_gamma=0, min_near=0.01, radius_range=[3.0, 3.5], theta_range=[45, 105], phi_range=[-180, 180], fovy_range=[10, 30], default_radius=3.2, default_polar=90, default_azimuth=0, default_fovy=20, progressive_view=False, progressive_view_init_ratio=0.2, progressive_level=False, angle_overhead=30, angle_front=60, t_range=[0.02, 0.98], dont_override_stuff=False, lambda_entropy=0.001, lambda_opacity=0, lambda_orient=0.01, lambda_tv=0, lambda_wd=0, lambda_mesh_normal=0.5, lambda_mesh_laplacian=0.5, lambda_guidance=1, lambda_rgb=1000, lambda_mask=500, lambda_normal=0, lambda_depth=10, lambda_2d_normal_smooth=0, lambda_3d_normal_smooth=0, save_guidance=False, save_guidance_interval=10, gui=False, W=800, H=800, radius=5, fovy=20, light_theta=60, light_phi=0, max_spp=1, zero123_config='./pretrained/zero123/sd-objaverse-finetune-c_concat-256.yaml', zero123_ckpt='pretrained/zero123/zero123-xl.ckpt', zero123_grad_scale='angle', dataset_size_train=100, dataset_size_valid=8, dataset_size_test=100, exp_start_iter=0, exp_end_iter=10000, images=None, ref_radii=[], ref_polars=[], ref_azimuths=[], zero123_ws=[], default_zero123_w=1)
NeRFNetwork(
  (encoder): GridEncoder: input_dim=3 num_levels=16 level_dim=2 resolution=16 -> 2048 per_level_scale=1.3819 params=(6098120, 2) gridtype=hash align_corners=False interpolation=smoothstep
  (sigma_net): MLP(
    (net): ModuleList(
      (0): Linear(in_features=32, out_features=64, bias=True)
      (1): Linear(in_features=64, out_features=64, bias=True)
      (2): Linear(in_features=64, out_features=4, bias=True)
    )
  )
  (encoder_bg): FreqEncoder: input_dim=3 degree=6 output_dim=39
  (bg_net): MLP(
    (net): ModuleList(
      (0): Linear(in_features=39, out_features=32, bias=True)
      (1): Linear(in_features=32, out_features=3, bias=True)
    )
  )
)
[INFO] loading DeepFloyd IF-I-XL...

A mixture of fp16 and non-fp16 filenames will be loaded.
Loaded fp16 filenames:
[safety_checker/model.fp16.safetensors, text_encoder/model.fp16-00001-of-00002.safetensors, text_encoder/model.fp16-00002-of-00002.safetensors, unet/diffusion_pytorch_model.fp16.safetensors]
Loaded non-fp16 filenames:
[watermarker/diffusion_pytorch_model.safetensors
If this behavior is not expected, please check your folder structure.
Loading checkpoint shards: 100%|████████████████████| 2/2 [00:00<00:00, 3.76it/s]
Loading pipeline components...: 100%|████████████████████| 7/7 [00:01<00:00, 3.91it/s]
[INFO] loaded DeepFloyd IF-I-XL!
Traceback (most recent call last):
  File "/workspace/main.py", line 396, in <module>
    trainer = Trainer(' '.join(sys.argv), 'df', opt, model, guidance, device=device, workspace=opt.workspace, optimizer=optimizer, ema_decay=0.95, fp16=opt.fp16, lr_scheduler=scheduler, use_checkpoint=opt.ckpt, scheduler_update_every_step=True)
  File "/workspace/nerf/utils.py", line 263, in __init__
    self.prepare_embeddings()
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/workspace/nerf/utils.py", line 366, in prepare_embeddings
    self.embeddings['IF']['default'] = self.guidance['IF'].get_text_embeds([self.opt.text])
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/workspace/guidance/if_utils.py", line 68, in get_text_embeds
    embeddings = self.text_encoder(inputs.input_ids.to(self.device))[0]
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/t5/modeling_t5.py", line 1964, in forward
    encoder_outputs = self.encoder(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/t5/modeling_t5.py", line 1123, in forward
    layer_outputs = layer_module(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/t5/modeling_t5.py", line 695, in forward
    self_attention_outputs = self.layer[0](
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/t5/modeling_t5.py", line 601, in forward
    normed_hidden_states = self.layer_norm(hidden_states)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/apex/normalization/fused_layer_norm.py", line 386, in forward
    return fused_rms_norm_affine(input, self.weight, self.normalized_shape, self.eps)
  File "/usr/local/lib/python3.10/dist-packages/apex/normalization/fused_layer_norm.py", line 189, in fused_rms_norm_affine
    return FusedRMSNormAffineFunction.apply(*args)
  File "/usr/local/lib/python3.10/dist-packages/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/usr/local/lib/python3.10/dist-packages/apex/normalization/fused_layer_norm.py", line 69, in forward
    output, invvar = fused_layer_norm_cuda.rms_forward_affine(
RuntimeError: expected scalar type Float but found Half
Exception ignored in: <function Trainer.__del__ at 0x7f236c60acb0>
Traceback (most recent call last):
  File "/workspace/nerf/utils.py", line 424, in __del__
    if self.log_ptr:
AttributeError: 'Trainer' object has no attribute 'log_ptr'
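
In case it helps with diagnosis: the RuntimeError is raised inside the T5 text encoder's layer norm, which in this container goes through apex's fused RMS norm, so one thing that could be checked is whether the encoder's parameters are all the same dtype. Below is a minimal sketch of such a check. The helper names summarize_param_dtypes and params_with_dtype are hypothetical (they are not part of stable-dreamfusion), and the sketch only assumes the object passed in exposes named_parameters() the way a torch.nn.Module does. I have not confirmed that a dtype mix is the actual cause.

```python
from collections import Counter


def summarize_param_dtypes(module):
    """Count parameters per dtype.

    `module` is assumed to expose named_parameters() yielding
    (name, parameter) pairs, as torch.nn.Module does.
    """
    counts = Counter()
    for _name, param in module.named_parameters():
        counts[str(param.dtype)] += 1
    return dict(counts)


def params_with_dtype(module, dtype_name):
    """List parameter names whose dtype prints as `dtype_name`,
    e.g. "torch.float32"."""
    return [
        name
        for name, param in module.named_parameters()
        if str(param.dtype) == dtype_name
    ]
```

Assuming the IF guidance object exposes text_encoder, as the traceback's self.text_encoder call in guidance/if_utils.py suggests, this could be called as summarize_param_dtypes(guidance['IF'].text_encoder) just before the Trainer is constructed in main.py, and params_with_dtype(..., "torch.float32") would then show which weights, if any, stayed in full precision.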

Steps to Reproduce

python main.py --text "a hamburger" --workspace trial -O --IF

Expected Behavior

Training should start without errors, as it does when I run without --IF.

Environment

Running from a Docker container: FROM nvcr.io/nvidia/pytorch:23.06-py3
Python 3.10.6
PyTorch 2.1.0
CUDA Version: 12.2
GPU: NVIDIA GeForce RTX 3090

ashawkey / stable-dreamfusion