Michaelwhite34 opened this issue 2 years ago
Another out-of-memory error; I had to stop the process:
```
^Z
[1]+  Stopped                 python render_surface.py --data_dir ./data_flashlight/${SCENE}/train --out_dir ./exp_iron_stage2/${SCENE} --neus_ckpt_fpath ./exp_iron_stage1/${SCENE}/checkpoints/ckpt_100000.pth --num_iters 50001 --gamma_pred
ic| args: Namespace(data_dir='./data_flashlight/drv/rabbit/test', eik_weight=0.1, export_all=False, gamma_pred=True, init_light_scale=8.0, inv_gamma_gt=False, is_metal=False, neus_ckpt_fpath='./exp_iron_stage1/drv/rabbit/checkpoints/ckpt_100000.pth', no_edgesample=False, num_iters=50001, out_dir='./exp_iron_stage2/drv/rabbit', patch_size=128, plot_image_name=None, render_all=True, roughrange_weight=0.1, ssim_weight=1.0)
Wrote config file to ./exp_iron_stage2/drv/rabbit/args.txt
Traceback (most recent call last):
  File "render_surface.py", line 136, in <module>
```
Note at least 12GB GPU memory is needed for the default settings. You can try decreasing the rendered patch size if you have less memory.
I decreased batch_size and n_samples in womask_iron, and now it only errors at the final mesh-and-UV export stage. Can you tell me exactly which parameter and file I should be modifying?
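(For what it's worth, the only stage-2 knob visible in the Namespace dump above is `patch_size=128`, i.e. the `--patch_size` flag of `render_surface.py`. If that is the right one, a lower-memory rerun would presumably look like this, with 64 as an arbitrary example value, not a tuned recommendation:)

```
python render_surface.py --data_dir ./data_flashlight/${SCENE}/train --out_dir ./exp_iron_stage2/${SCENE} --neus_ckpt_fpath ./exp_iron_stage1/${SCENE}/checkpoints/ckpt_100000.pth --num_iters 50001 --gamma_pred --patch_size 64
```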
```
100%|███████████████████████████████████| 50001/50001 [5:30:39<00:00, 2.52it/s]
ic| f"Exporting mesh and materials to: {export_out_dir}": ('Exporting mesh and materials to: '
                                                           './exp_iron_stage2/drv/rabbit/mesh_and_materials_50000')
ic| 'Exporting mesh and uv...'
face_normals incorrect shape, ignoring!
/home/michael/iron/models/export_mesh.py:82: UserWarning: torch.eig is deprecated in favor of torch.linalg.eig and will be removed in a future PyTorch release.
torch.linalg.eig returns complex tensors of dtype cfloat or cdouble rather than real tensors mimicking complex tensors.
L, _ = torch.eig(A)
should be replaced with
L_complex = torch.linalg.eigvals(A)
and
L, V = torch.eig(A, eigenvectors=True)
should be replaced with
L_complex, V_complex = torch.linalg.eig(A) (Triggered internally at ../aten/src/ATen/native/BatchLinearAlgebra.cpp:2910.)
  vecs = torch.eig(s_cov, True)[1].transpose(0, 1)
```
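Side note: that deprecated call at models/export_mesh.py:82 can be ported exactly the way the warning suggests. A minimal sketch, assuming s_cov is a small real symmetric covariance matrix as in the export code (the random input here is only for illustration):

```python
import torch

# illustrative stand-in for the covariance matrix used in export_mesh.py
s_cov = torch.randn(3, 3)
s_cov = s_cov @ s_cov.T  # symmetric, so its eigenpairs are real

# deprecated form from export_mesh.py:82:
#   vecs = torch.eig(s_cov, True)[1].transpose(0, 1)
# replacement per the warning; torch.linalg.eig returns complex tensors,
# but for a real symmetric input the imaginary parts are zero
_, eigvecs = torch.linalg.eig(s_cov)
vecs = eigvecs.real.transpose(0, 1)  # eigenvectors as rows, matching the old layout
```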
```
Traceback (most recent call last):
  File "render_surface.py", line 549, in <module>
DeprecationWarning: Starting with ImageIO v3 the behavior of this function will switch to that of iio.v3.imread. To keep the current behavior (and make this warning dissapear) use
    import imageio.v2 as imageio
or call imageio.v2.imread directly.
  im = imageio.imread(fpath).astype(np.float32) / 255.0
```
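The imageio DeprecationWarning in there is unrelated to the crash, and the fix it names is a one-line change. A sketch, with fpath as a hypothetical stand-in for the path variable used in render_surface.py:

```python
import numpy as np
import imageio.v2 as imageio  # pin the v2 API so the ImageIO v3 behavior switch does not apply

fpath = "some_image.png"  # hypothetical stand-in path
im = imageio.imread(fpath).astype(np.float32) / 255.0  # same call as in render_surface.py
```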
```
ic| len(image_fpaths): 82
    gt_images.shape: torch.Size([82, 768, 1024, 3])
    Ks.shape: torch.Size([82, 4, 4])
    W2Cs.shape: torch.Size([82, 4, 4])
    len(cameras): 82
ic| args.neus_ckpt_fpath: './exp_iron_stage1/drv/rabbit/checkpoints/ckpt_100000.pth'
ic| f"Loading from neus checkpoint: {args.neus_ckpt_fpath}": ('Loading from neus checkpoint: '
                                                              './exp_iron_stage1/drv/rabbit/checkpoints/ckpt_100000.pth')
ic| "Reloading from checkpoint: ": 'Reloading from checkpoint: '
    ckpt_fpath: './exp_iron_stage2/drv/rabbit/ckpt_50000.pth'
ic| dist: 0.8803050220012665
    color_network_dict["point_light_network"].light.data: tensor(1.7133, device='cuda:0')
ic| start_step: 50000
ic| f"Rendering images to: {render_out_dir}": 'Rendering images to: ./exp_iron_stage2/drv/rabbit/render_test_50000'
  2%|█                                   | 2/82 [00:23<15:21, 11.52s/it]
Traceback (most recent call last):
  File "render_surface.py", line 367, in <module>
```
And another question: after preparing my own images, do I just need to run colmap_runner to get kai_cameras_normalized.json and rename it to cam_dict_norm.json?
@Kai-46 Yeah, I met the same problem when training the superman dataset.
I found this line in models/export_mesh.py:

```python
grid_points = torch.tensor(np.vstack([xx.ravel(), yy.ravel(), zz.ravel()]).T, dtype=torch.float).cuda()
```

The xx, yy, zz arrays are huge. How do I change the default settings?
You can use a lower resolution to voxelize the neural SDF, at a potential sacrifice of final mesh accuracy: https://github.com/Kai-46/IRON/blob/8e9a7c172542afd52b8e6ef28bc96ad52b5ffd5a/models/export_mesh.py#L50 .
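A minimal sketch of that change, mirroring the grid_points line quoted above; resolution, bound, and the value 256 are illustrative names and numbers, not necessarily what the linked line uses:

```python
import numpy as np
import torch

resolution = 256  # lower than the default; the grid has resolution**3 points
bound = 1.0       # half-extent of the voxelization cube (illustrative)

xs = np.linspace(-bound, bound, resolution)
xx, yy, zz = np.meshgrid(xs, xs, xs, indexing="ij")

# same construction as the quoted line: one (resolution**3, 3) query tensor on the GPU
grid_points = torch.tensor(
    np.vstack([xx.ravel(), yy.ravel(), zz.ravel()]).T, dtype=torch.float
).cuda()
```

At 512^3 this tensor alone is about 1.6 GB of float32 on the GPU; at 256^3 it is about 0.2 GB, before any SDF evaluations on top of it.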
```
train_scene.sh drv/rabbit
Hello Wooden
Load data: Begin
Not using masks
image shape, mask shape: torch.Size([324, 768, 1024, 3]) torch.Size([324, 768, 1024, 3])
image pixel range: 0.0 1.0
Load data: End
  0%|          | 0/100001 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "render_volume.py", line 449, in <module>
    runner.train()
  File "render_volume.py", line 127, in train
    render_out = self.renderer.render(
  File "/home/michael/iron/models/renderer.py", line 374, in render
    ret_fine = self.render_core(
  File "/home/michael/iron/models/renderer.py", line 233, in render_core
    gradients = sdf_network.gradient(pts)
  File "/home/michael/iron/models/fields.py", line 110, in gradient
    gradients = torch.autograd.grad(
  File "/home/michael/anaconda3/envs/iron/lib/python3.8/site-packages/torch/autograd/__init__.py", line 275, in grad
    return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: CUDA out of memory. Tried to allocate 64.00 MiB (GPU 0; 5.80 GiB total capacity; 4.03 GiB already allocated; 118.56 MiB free; 4.08 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
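For the fragmentation hint at the end of that error: the allocator option only takes effect if it is set before PyTorch makes its first CUDA allocation. A sketch, with 128 MB as an arbitrary example split size, not a tuned value:

```python
import os

# must be set before the first CUDA allocation; easiest is before importing torch
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch

x = torch.zeros(1, device="cuda")  # the caching allocator now uses the configured split size
```

On a ~6 GB card, though, lowering batch_size and n_samples in the config (as tried above) is likely to matter more than the allocator setting.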
```
Wrote config file to ./exp_iron_stage2/drv/rabbit/args.txt
render_surface.py:256: DeprecationWarning: Starting with ImageIO v3 the behavior of this function will switch to that of iio.v3.imread. To keep the current behavior (and make this warning dissapear) use
    import imageio.v2 as imageio
or call imageio.v2.imread directly.
  im = imageio.imread(fpath).astype(np.float32) / 255.0
ic| fill_holes: False
    handle_edges: True
    is_training: True
    args.inv_gamma_gt: False
  0%|          | 0/50001 [00:00<?, ?it/s]
ic| args.out_dir: './exp_iron_stage2/drv/rabbit'
    global_step: 0
    loss.item(): 0.00573146715760231
    img_loss.item(): 0.0
    img_l2_loss.item(): 0.0
    img_ssim_loss.item(): 0.0
    eik_loss.item(): 0.00573146715760231
    roughrange_loss.item(): 0.0
    color_network_dict["point_light_network"].get_light().item(): 5.6220927238464355
  1%|▎         | 499/50001 [01:35<3:20:37, 4.11it/s]
ic| args.out_dir: './exp_iron_stage2/drv/rabbit'
    global_step: 500
    loss.item(): 0.014144735410809517
    img_loss.item(): 0.0
    img_l2_loss.item(): 0.0
    img_ssim_loss.item(): 0.0
    eik_loss.item(): 0.014144735410809517
    roughrange_loss.item(): 0.0
    color_network_dict["point_light_network"].get_light().item(): 5.224419593811035
```