nv-tlabs / NKSR

[CVPR 2023 Highlight] Neural Kernel Surface Reconstruction
https://research.nvidia.com/labs/toronto-ai/NKSR

ValueError("Matrix must be orthogonal, i.e. its transpose should be its inverse") #31

Closed anonymouslosty closed 1 year ago

anonymouslosty commented 1 year ago

Hello, I just ran the script "python train.py configs/points2surf/train.yaml" and it raised the following error:

Epoch 0: 0%| | 0/5045 [00:00<?, ?it/s]
/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pycg/isometry.py:336: RuntimeWarning: invalid value encountered in divide
  z_dir /= np.linalg.norm(z_dir)
Traceback (most recent call last):
  File "/home/dev05/main/NKSR/train.py", line 279, in <module>
    trainer.fit(net_model, ckpt_path=last_ckpt_path)
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 608, in fit
    call._call_and_handle_interrupt(
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 38, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 650, in _fit_impl
    self._run(model, ckpt_path=self.ckpt_path)
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1112, in _run
    results = self._run_stage()
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1191, in _run_stage
    self._run_train()
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1214, in _run_train
    self.fit_loop.run()
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
    self.advance(*args, **kwargs)
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 267, in advance
    self._outputs = self.epoch_loop.run(self._data_fetcher)
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
    self.advance(*args, **kwargs)
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 213, in advance
    batch_output = self.batch_loop.run(kwargs)
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
    self.advance(*args, **kwargs)
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 88, in advance
    outputs = self.optimizer_loop.run(optimizers, kwargs)
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
    self.advance(*args, **kwargs)
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 202, in advance
    result = self._run_optimization(kwargs, self._optimizers[self.optim_progress.optimizer_position])
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 241, in _run_optimization
    closure()
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 149, in __call__
    self._result = self.closure(*args, **kwargs)
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 135, in closure
    step_output = self._step_fn()
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 419, in _training_step
    training_step_output = self.trainer._call_strategy_hook("training_step", *kwargs.values())
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1494, in _call_strategy_hook
    output = fn(*args, **kwargs)
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pytorch_lightning/strategies/strategy.py", line 378, in training_step
    return self.model.training_step(*args, **kwargs)
  File "/home/dev05/main/NKSR/models/base_model.py", line 139, in training_step
    return self.train_val_step(is_val=False, *args, **kwargs)
  File "/home/dev05/main/NKSR/models/nksr_net.py", line 250, in train_val_step
    self.log_visualizations(batch, out, batch_idx)
  File "/home/dev05/main/NKSR/models/nksr_net.py", line 216, in log_visualizations
    self.log_geometry("pd_mesh", mesh)
  File "/home/dev05/main/NKSR/models/base_model.py", line 323, in log_geometry
    mv_img = render.multiview_image(
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pycg/render.py", line 1831, in multiview_image
    scene.quick_camera(w=width, h=height, fov=45.0, up_axis=up_axis,
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pycg/render.py", line 1188, in quick_camera
    self.camera_pose = Isometry.look_at(np.asarray(pos), np.asarray(look_at), up_axis).validified()
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pycg/isometry.py", line 347, in look_at
    return Isometry(q=Quaternion(matrix=R, rtol=1.0, atol=1.0), t=source)
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pyquaternion/quaternion.py", line 101, in __init__
    self.q = Quaternion._from_matrix(kwargs["matrix"], **optional_args).q
  File "/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pyquaternion/quaternion.py", line 181, in _from_matrix
    raise ValueError("Matrix must be orthogonal, i.e. its transpose should be its inverse")
ValueError: Matrix must be orthogonal, i.e. its transpose should be its inverse

/home/dev05/.conda/envs/NKS/lib/python3.10/site-packages/pyquaternion/quaternion.py(181)_from_matrix()
-> raise ValueError("Matrix must be orthogonal, i.e. its transpose should be its inverse")

I wonder whether some of the packages I installed have mismatched versions, or whether something else might be causing this issue. I hope to hear from you soon.

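Hello, and thank you for providing the source code. I ran into a problem when trying to run the points2surf example. I don't know where the problem lies; perhaps my library versions are wrong. Judging from the error, the cause is a matrix operation, but I downloaded the data from the link you provided. Could you suggest a way to resolve this?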

heiwang1997 commented 1 year ago

Hey, thanks for reporting this mistake. This is indeed due to some inconsistencies in the packages that we've used! The error occurs because the mesh is possibly empty at the beginning of training, so it cannot be rendered. I've now pushed a fix for this error. Please pull the latest version of the codebase and try running again. Thanks!
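The traceback is consistent with this explanation: the RuntimeWarning at `z_dir /= np.linalg.norm(z_dir)` suggests a zero-length view direction, which would fill the rotation matrix with NaNs and fail an orthogonality check. Below is a minimal sketch of that failure mode and of a guard that skips rendering an empty mesh, assuming NumPy only. `look_at_rotation`, `is_orthogonal`, and `can_render` are hypothetical helpers written for illustration; they are not the pycg API and not necessarily the fix that was pushed.

```python
import numpy as np


def look_at_rotation(source, target, up=(0.0, 0.0, 1.0)):
    """Build a 3x3 rotation whose z axis points from source to target.

    Hypothetical helper for illustration; not pycg's implementation.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    z_dir = target - source
    with np.errstate(invalid="ignore", divide="ignore"):
        # If source == target the norm is 0, and 0 / 0 yields NaN.
        z_dir = z_dir / np.linalg.norm(z_dir)
        x_dir = np.cross(z_dir, np.asarray(up, dtype=float))
        x_dir = x_dir / np.linalg.norm(x_dir)
    y_dir = np.cross(z_dir, x_dir)
    # Columns are the camera axes expressed in world coordinates.
    return np.stack([x_dir, y_dir, z_dir], axis=1)


def is_orthogonal(R, atol=1e-6):
    """True if R @ R.T is close to the identity; any NaN entry makes this False."""
    return bool(np.allclose(R @ R.T, np.eye(3), atol=atol))


def can_render(vertices, triangles):
    """Guard: only render when the mesh has finite vertices and at least one face."""
    vertices = np.asarray(vertices)
    triangles = np.asarray(triangles)
    if vertices.size == 0 or triangles.size == 0:
        return False
    return bool(np.isfinite(vertices).all())


# A well-posed camera gives an orthogonal matrix.
good = look_at_rotation((0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
# A camera placed exactly at its look-at point does not.
bad = look_at_rotation((1.0, 1.0, 1.0), (1.0, 1.0, 1.0))
```

With a guard like `can_render`, a visualization step could simply be skipped for the early iterations in which the predicted mesh has no vertices or faces.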

anonymouslosty commented 1 year ago

It worked! Thank you.

I am sorry to say that I have run into another problem. I ran recons_waymo.py in the example folder and it worked well. But when I replaced the demo data, which uses chunks, with my own .ply data, the result was poor.

I acquired the data from NeRF, and the coordinates were in the range [-1, 1], so I enlarged the coordinates in MeshLab. Here are the parameters I set in the script and the .ply file I used in the test. I have no idea why the result was split into separate parts. My environment is Python 3.10, torch 2.0.0+cu118, Ubuntu 20.04, and an RTX 2080 Ti. I hope to hear from you.

Best.

[Attached images not preserved]

anonymouslosty commented 1 year ago

It's solved! I found that the scale of the point cloud may have been too small for the chunk size. I will close this issue. Thanks!
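For anyone hitting the same symptom, a quick way to compare a point cloud's extent with a chunk size is sketched below, assuming NumPy only. `chunks_per_axis` and `rescale_to_extent` are hypothetical helpers, and the chunk size of 50.0 and target extent of 200.0 are illustrative values, not settings taken from this thread or from NKSR.

```python
import numpy as np


def chunks_per_axis(points, chunk_size):
    """Number of cubic chunks of side chunk_size needed to cover the bounding box."""
    points = np.asarray(points, dtype=float)
    extent = points.max(axis=0) - points.min(axis=0)
    return np.maximum(np.ceil(extent / chunk_size), 1).astype(int)


def rescale_to_extent(points, target_extent):
    """Uniformly scale points about the bounding-box center so that the
    longest side of the bounding box equals target_extent."""
    points = np.asarray(points, dtype=float)
    lo, hi = points.min(axis=0), points.max(axis=0)
    center = (lo + hi) / 2.0
    longest = float((hi - lo).max())
    if longest == 0.0:
        raise ValueError("point cloud has zero extent")
    return (points - center) * (target_extent / longest) + center


# A cloud in [-1, 1]^3, as produced by many NeRF pipelines.
pts = np.array([[-1.0, -1.0, -1.0], [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]])
before = chunks_per_axis(pts, chunk_size=50.0)   # everything falls in one chunk
scaled = rescale_to_extent(pts, target_extent=200.0)
after = chunks_per_axis(scaled, chunk_size=50.0)
```

Because the scaling is uniform, surface normals stay valid after rescaling; only positions (and any distance-based parameters) change.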