google / samurai

SAMURAI: Shape And Material from Unconstrained Real-world Arbitrary Image collections - NeurIPS2022
Apache License 2.0

tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name, tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 21504 values, but the requested shape has 23256 [Op:Reshape] #10

Closed (monajalal closed this issue 10 months ago)

monajalal commented 1 year ago

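When I run the following command, training completes the first epoch (2000/2000 steps) and then fails with an InvalidArgumentError in tf.reshape while rendering the last datapoint. Is this expected? All earlier steps ran successfully.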

(samurai) mona@ard-gpu-01:~/samurai$ python train_samurai.py --config configs/samurai/samurai.txt --datadir data/duck/ --basedir . --expname duck_test --gpu 0
Namespace(config=None, basedir='.', expname='duck_test', batch_size=1024, learning_rate=0.0001, epochs=150, steps_per_epoch=2000, gpu='0', tpu=None, debug=False, profile=False, perturb=1.0, raw_noise_std=0.0, coarse_samples=64, linear_disparity_sampling=False, fine_samples=128, fourier_frequency=10, direction_fourier_frequency=4, random_encoding_offsets=True, fine_net_width=128, fine_net_depth=8, coarse_net_width=128, coarse_net_depth=6, appearance_latent_dim=32, diffuse_latent_dim=24, fix_diffuse=True, camera_distribution='sphere', use_fully_random_cameras=False, random_cameras_per_view=4, min_softmax_scaler=1.0, max_softmax_scaler=10.0, camera_weight_update_lr=0.3, camera_weight_update_momentum=0.75, bounding_size=0.5, resolution_factor=4, advanced_loss_done=80000, network_gradient_norm_clipping=0.1, camera_gradient_norm_clipping=-1, not_learn_r=False, not_learn_t=False, not_learn_f=False, edge_align_step=200, num_edge_align_steps=50, pretrained_camera_poses_folder=None, start_f_optimization=90000, start_fourier_anneal=0, finish_fourier_anneal=50000, slow_scheduler_decay=100000, brdf_schedule_decay=40000, lambda_smoothness=0.01, smoothness_bound_dividier=200, coarse_distortion_lambda=0.001, fine_distortion_lambda=0, normal_direction_lambda=0.005, mlp_normal_direction_lambda=0.0003, disable_posterior_scaling=False, disable_mask_uncertainty=True, lambda_brdf_decoder_smoothness=0.1, lambda_brdf_decoder_sparsity=0.01, camera_lr=0.003, camera_lr_decay=70, camera_regularization=0.1, aim_center_regularization=10.0, camera_rotation='lookat', learn_camera_offsets=True, basecolor_metallic=True, skip_decomposition=False, compose_on_white=True, rotating_object=False, single_env=False, brdf_preintegration_path='data/neural_pil/BRDFLut.hdr', illumination_network_path='data/neural_pil/illumination-network', datadir='data/duck/', max_resolution_dimension=400, test_holdout=16, dataset='samurai', load_gt_poses=False, canonical_pose=0, log_step=100, weights_epoch=5, 
validation_epoch=5, testset_epoch=150, video_epoch=50, lrate_decay=300, render_only=False)
2023-06-21 14:54:53.957871: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-06-21 14:54:53.994892: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-06-21 14:54:53.995021: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
Utilizing 1 GPUs for training.
2023-06-21 14:54:55.124354: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-06-21 14:54:55.125403: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-06-21 14:54:55.125582: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-06-21 14:54:55.125650: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-06-21 14:54:55.566934: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-06-21 14:54:55.567056: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-06-21 14:54:55.567131: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-06-21 14:54:55.567189: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2023-06-21 14:54:55.567365: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14117 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3080 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.6
(70, 3)
Model: "sequential_12"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 MappingNetwork/Layer_0 (Den  (None, 128)              16512     
 se)                                                             

 MappingNetwork/Layer_1 (Den  (None, 128)              16512     
 se)                                                             

 MappingNetwork/Final (Dense  (None, 768)              99072     
 )                                                               

 reshape_1 (Reshape)         (None, 2, 3, 128)         0         

=================================================================
Total params: 132,096
Trainable params: 132,096
Non-trainable params: 0
_________________________________________________________________
Model: "sequential_13"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 ConditionalNetwork/Dense1 (  (None, 32)               192       
 Dense)                                                          

 ConditionalNetwork/DenseFin  (None, 256)              8448      
 al (Dense)                                                      

 reshape_2 (Reshape)         (None, 2, 128)            0         

=================================================================
Total params: 8,640
Trainable params: 8,640
Non-trainable params: 0
_________________________________________________________________
2023-06-21 14:55:14.488595: I tensorflow/core/util/cuda_solvers.cc:179] Creating GpuSolver handles for stream 0x564eca5ad750
*******************************************************************************************************
batch size is:  1024
2023-06-21 14:55:15.434935: I tensorflow/stream_executor/cuda/cuda_blas.cc:1786] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
Found ckpts []
Starting training in epoch 0 at step 0
Start Training...
/home/mona/anaconda3/envs/samurai/lib/python3.9/site-packages/tensorflow/python/framework/indexed_slices.py:444: UserWarning: Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/interpolate_bilinear/gather-bottom_right/GatherV2_grad/Reshape_1:0", shape=(1024,), dtype=int32), values=Tensor("gradients/interpolate_bilinear/gather-bottom_right/GatherV2_grad/Reshape:0", shape=(1024, 1), dtype=float32), dense_shape=Tensor("gradients/interpolate_bilinear/gather-bottom_right/GatherV2_grad/Cast:0", shape=(2,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.
  warnings.warn(
/home/mona/anaconda3/envs/samurai/lib/python3.9/site-packages/tensorflow/python/framework/indexed_slices.py:444: UserWarning: Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/interpolate_bilinear/gather-bottom_left/GatherV2_grad/Reshape_1:0", shape=(1024,), dtype=int32), values=Tensor("gradients/interpolate_bilinear/gather-bottom_left/GatherV2_grad/Reshape:0", shape=(1024, 1), dtype=float32), dense_shape=Tensor("gradients/interpolate_bilinear/gather-bottom_left/GatherV2_grad/Cast:0", shape=(2,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.
  warnings.warn(
/home/mona/anaconda3/envs/samurai/lib/python3.9/site-packages/tensorflow/python/framework/indexed_slices.py:444: UserWarning: Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/interpolate_bilinear/gather-top_right/GatherV2_grad/Reshape_1:0", shape=(1024,), dtype=int32), values=Tensor("gradients/interpolate_bilinear/gather-top_right/GatherV2_grad/Reshape:0", shape=(1024, 1), dtype=float32), dense_shape=Tensor("gradients/interpolate_bilinear/gather-top_right/GatherV2_grad/Cast:0", shape=(2,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.
  warnings.warn(
/home/mona/anaconda3/envs/samurai/lib/python3.9/site-packages/tensorflow/python/framework/indexed_slices.py:444: UserWarning: Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/interpolate_bilinear/gather-top_left/GatherV2_grad/Reshape_1:0", shape=(1024,), dtype=int32), values=Tensor("gradients/interpolate_bilinear/gather-top_left/GatherV2_grad/Reshape:0", shape=(1024, 1), dtype=float32), dense_shape=Tensor("gradients/interpolate_bilinear/gather-top_left/GatherV2_grad/Cast:0", shape=(2,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.
  warnings.warn(
2023-06-21 14:55:49.242434: I tensorflow/core/grappler/optimizers/generic_layout_optimizer.cc:345] Cancel Transpose nodes around Pad: transpose_before=gradient_tape/gradient_tape/sequential/dense_3/ReluGrad_1-0-0-TransposeNCDHWToNDHWC-LayoutOptimizer pad=gradient_tape/gradient_tape/Pad_3 transpose_after=AddN_115-1-TransposeNDHWCToNCDHW-LayoutOptimizer
2023-06-21 14:55:49.242517: I tensorflow/core/grappler/optimizers/generic_layout_optimizer.cc:345] Cancel Transpose nodes around Pad: transpose_before=gradient_tape/gradient_tape/sequential/dense_3/ReluGrad_2-0-0-TransposeNCDHWToNDHWC-LayoutOptimizer pad=gradient_tape/gradient_tape/Pad_5 transpose_after=AddN_129-1-TransposeNDHWCToNCDHW-LayoutOptimizer
  35/2000 [..............................] - ETA: 5:34 - loss: 1.5692 - loss_camera: 7.2616 - fine_loss: 1.8154
2023-06-21 14:56:18.246087: I tensorflow/core/grappler/optimizers/generic_layout_optimizer.cc:345] Cancel Transpose nodes around Pad: transpose_before=gradient_tape/gradient_tape/sequential/dense_3/ReluGrad_1-0-0-TransposeNCDHWToNDHWC-LayoutOptimizer pad=gradient_tape/gradient_tape/Pad_3 transpose_after=AddN_115-1-TransposeNDHWCToNCDHW-LayoutOptimizer
2023-06-21 14:56:18.246178: I tensorflow/core/grappler/optimizers/generic_layout_optimizer.cc:345] Cancel Transpose nodes around Pad: transpose_before=gradient_tape/gradient_tape/sequential/dense_3/ReluGrad_2-0-0-TransposeNCDHWToNDHWC-LayoutOptimizer pad=gradient_tape/gradient_tape/Pad_5 transpose_after=AddN_129-1-TransposeNDHWCToNCDHW-LayoutOptimizer
2000/2000 [==============================] - 591s 276ms/step - loss: 0.5750 - loss_camera: 2.7521 - fine_loss: 0.6058
Rendering last datapoint
Traceback (most recent call last):
  File "/home/mona/samurai/train_samurai.py", line 1442, in <module>
    main(args)
  File "/home/mona/samurai/train_samurai.py", line 536, in main
    render_full_datapoint(
  File "/home/mona/samurai/train_samurai.py", line 836, in render_full_datapoint
    tf.reshape(fine_result["direct_rgb"], (1, H, W, 3)),
  File "/home/mona/anaconda3/envs/samurai/lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/mona/anaconda3/envs/samurai/lib/python3.9/site-packages/tensorflow/python/eager/execute.py", line 54, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 21504 values, but the requested shape has 23256 [Op:Reshape]
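
As a sanity check on the numbers in the error, the sketch below redoes the arithmetic in plain Python. The values 21504 and 23256 and the batch size of 1024 come from the log above. The idea that rendering kept only full batches of rays is an assumption that happens to fit the numbers; the traceback does not confirm it. H and W are not printed in the log, so only their product is derived.

```python
# Numbers copied from the error message and the training log.
got_values = 21504      # elements actually in fine_result["direct_rgb"]
wanted_values = 23256   # elements required by the shape (1, H, W, 3)
batch_size = 1024       # "batch size is:  1024" in the log
channels = 3            # RGB

got_pixels = got_values // channels        # pixels that were rendered
wanted_pixels = wanted_values // channels  # H * W expected by the reshape

# Split the expected pixel count into full batches plus a remainder.
full_batches, remainder = divmod(wanted_pixels, batch_size)

print("rendered pixels:", got_pixels)
print("expected pixels (H * W):", wanted_pixels)
print("full batches:", full_batches, "leftover rays:", remainder)

# Assumption (not confirmed by the source): if only full batches of rays
# were rendered, the output would hold exactly full_batches * batch_size pixels.
matches_full_batches_only = got_pixels == full_batches * batch_size
print("consistent with a dropped partial batch:", matches_full_batches_only)
```

If that reading is right, the missing pixels would correspond to a final partial batch of rays that never reached the reshape, so the ray batching feeding render_full_datapoint may be worth checking.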