brabbitdousha / MIRReS-ReSTIR_Nerf_mesh

Source Code for the Paper "MIRReS: Multi-bounce Inverse Rendering using Reservoir Sampling"
MIT License

Please give me some advice #1

Closed MiracleMountain closed 4 months ago

MiracleMountain commented 4 months ago

I tried

#stage0:
python main.py data/tensoir_syn/tensoir_train/tensoir_lego --workspace ir_lego/ -O --bound 1 --scale 0.8 --dt_gamma 0 --stage 0 --lambda_tv 1e-8 --iters 50000

it returns

...
++> Evaluate epoch 500 Finished, loss = 0.808583
==> Start Test, save results to ir_lego/results, 
 brdf to None
100% 200/200 [00:13<00:00, 15.38it/s]
==> Finished Test.
==> Saving mesh to ir_lego/mesh_stage0
[F glutil.cpp:332] eglGetDisplay() failed

I downloaded mesh_stage0.rar, extracted it into ir_lego/, and ran the stage 1 command, but I get the same error:

(mirres0) ubuntu@ubuntu-Precision-3660:~/workfile/nerf/MIRReS$ python main.py data/tensoir_syn/tensoir_train/tensoir_lego --workspace ir_lego/ -O --bound 1 --scale 0.8 --dt_gamma 0 --stage 1 --use_brdf --use_restir --lambda_kd 0.017 --lambda_ks 0.0001 --lambda_normal 0.0001 --lambda_edgelen 0.1 --lambda_nrm 0.00035 --lambda_rgb_brdf 0.05 --lambda_brdf_diffuse 0.002 --lambda_brdf_specular 0.00003
[F glutil.cpp:332] eglGetDisplay() failed
Aborted (core dumped)

I found this code in glutil.cpp in nvdiffrast:

 if (!display)
 {
     display = eglGetDisplay(EGL_DEFAULT_DISPLAY);
     if (display == EGL_NO_DISPLAY)
         LOG(FATAL) << "eglGetDisplay() failed";
 }

Copilot told me to test whether OpenGL works. I ran glxgears and it does work (it shows some rotating gears).

I also tried reinstalling nvdiffrast, but it didn't help.

My environment: Ubuntu 20.04, Python 3.8, PyTorch 2.0.1, CUDA 11.8, nvdiffrast 0.3.1.

Can you give me some advice, please?

Another question: how do I switch to debug mode? (I noticed the note "If something went wrong with slangpy or other cpp extensions, switch to the debug mode".) I'm a beginner and don't know how to debug this Python/C++/CUDA program. I only know how to debug pure C++ or Python files in VS Code and PyCharm.

Thank you!

brabbitdousha commented 4 months ago

Hi, I usually use debug mode by selecting this project's interpreter in VS Code (bottom right), then creating a launch.json and putting the command in it. However, C++ extension issues usually occur before training starts, and this looks like an issue with your OpenGL setup. You can check this, and try glxinfo -v | grep OpenGL to check your OpenGL.
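
As a rough illustration of the launch.json approach described above, the sketch below turns a training command into a VS Code debug configuration. The helper name launch_config, the configuration name, and the "debugpy" type (older versions of the VS Code Python extension used "python") are assumptions for illustration; none of them come from this repository.

```python
import json
import shlex


def launch_config(command: str, name: str = "MIRReS debug") -> dict:
    """Build a VS Code launch.json dict from a `python script.py ...` command.

    Hypothetical helper: it only splits the command line into program and args.
    """
    tokens = shlex.split(command)
    # Drop the leading interpreter name; VS Code supplies the selected interpreter.
    if tokens and tokens[0].startswith("python"):
        tokens = tokens[1:]
    program, args = tokens[0], tokens[1:]
    return {
        "version": "0.2.0",
        "configurations": [
            {
                "name": name,
                "type": "debugpy",  # assumption: older extensions use "python"
                "request": "launch",
                "program": "${workspaceFolder}/" + program,
                "args": args,
                "console": "integratedTerminal",
                "justMyCode": False,  # also step into library code
            }
        ],
    }


# The stage 0 command quoted earlier in this thread.
cmd = (
    "python main.py data/tensoir_syn/tensoir_train/tensoir_lego "
    "--workspace ir_lego/ -O --bound 1 --scale 0.8 --dt_gamma 0 "
    "--stage 0 --lambda_tv 1e-8 --iters 50000"
)
print(json.dumps(launch_config(cmd), indent=4))
```

The printed JSON can be saved as .vscode/launch.json. Note that this only debugs the Python side; breakpoints inside the C++/CUDA extensions would need a native debugger.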

MiracleMountain commented 4 months ago

Thanks for your reply!

I ran glxinfo -v | grep OpenGL and got this:

$ glxinfo -v | grep OpenGL
OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: NVIDIA GeForce RTX 4090/PCIe/SSE2
OpenGL core profile version string: 4.6.0 NVIDIA 535.171.04
OpenGL core profile shading language version string: 4.60 NVIDIA
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 4.6.0 NVIDIA 535.171.04
OpenGL shading language version string: 4.60 NVIDIA
OpenGL context flags: (none)
OpenGL profile mask: (none)
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.2 NVIDIA 535.171.04
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20
OpenGL ES profile extensions:

It is almost identical to the correct output in your link, https://github.com/ashawkey/nerf2mesh/issues/16. This seems to mean that there is no problem with my OpenGL.

However, when I run the nvdiffrast sample python triangle.py --cuda, it executes correctly, but when I run python triangle.py --opengl, the same error occurs.

(mirres0) ubuntu@ubuntu-Precision-3660:~/workfile/nerf/MIRReS/nvdiffrast/samples/torch$ python triangle.py --cuda
Saving to 'tri.png'.# success
(mirres0) ubuntu@ubuntu-Precision-3660:~/workfile/nerf/MIRReS/nvdiffrast/samples/torch$ python triangle.py --opengl
[F glutil.cpp:332] eglGetDisplay() failed
Aborted (core dumped)

So I guess there is an issue with the OpenGL part of the NVIDIA driver, but not with the CUDA part. I have decided to reinstall my NVIDIA driver and will try it later.
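
A side note on diagnosing this: because the failure is a LOG(FATAL) that aborts the whole process, a Python try/except around the context creation cannot catch it. A minimal sketch of one workaround is to probe the OpenGL rasterizer in a child process and look at its exit status. The names snippet_survives and pick_rasterizer_name are hypothetical; dr.RasterizeGLContext is the nvdiffrast call being probed, and whether MIRReS exposes a switch to select the CUDA rasterizer is not stated in this thread.

```python
import subprocess
import sys


def snippet_survives(snippet: str, timeout: float = 600.0) -> bool:
    """Run `snippet` in a fresh interpreter; True only if it exits with status 0.

    A fatal abort (such as the eglGetDisplay() failure) kills only the child,
    so the caller keeps running and sees a non-zero status.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", snippet],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


# Creating the GL context is the step that aborted in the logs above.
GL_PROBE = "import nvdiffrast.torch as dr; dr.RasterizeGLContext()"


def pick_rasterizer_name() -> str:
    """Return "opengl" if the GL context can be created, otherwise "cuda"."""
    return "opengl" if snippet_survives(GL_PROBE) else "cuda"
```

Calling pick_rasterizer_name() on the machine above would have reported "cuda" before the driver reinstall, matching the triangle.py results. The probe can be slow the first time, since nvdiffrast may compile its plugin on first use.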

Thanks for your help!

MiracleMountain commented 4 months ago

I reinstalled my NVIDIA driver and the previous error is solved.

But unfortunately I hit another problem:

[INFO] mesh mask trigs: (1982692, 3) --> (1113740, 3), (3940324, 3) --> (2209372, 3)
Traceback (most recent call last):
  File "main.py", line 317, in <module>
    trainer.save_mesh(resolution=opt.mcubes_reso, decimate_target=opt.decimate_target, dataset=train_loader._data if opt.mesh_visibility_culling else None)
 ...
  File "/home/ubuntu/workfile/nerf/MIRReS/meshutils.py", line 198, in clean_mesh
    ms.meshing_merge_close_vertices(threshold=pml.Percentage(v_pct)) # 1/10000 of bounding box diagonal
AttributeError: module 'pymeshlab' has no attribute 'Percentage'

I tried pip install pymeshlab==0.2 (https://github.com/3DTopia/LGM/issues/2) (I found you in that issue, lol),

but it caused a bunch of other new errors, so I upgraded pymeshlab back to 2023.12.post1.

According to https://pymeshlab.readthedocs.io/en/latest/classes/percentage_value.html#percentagevalue

I changed every Percentage to PercentageValue in meshutils.py:

    if v_pct > 0:
        # 1/10000 of bounding box diagonal (Percentage changed to PercentageValue)
        ms.meshing_merge_close_vertices(threshold=pml.PercentageValue(v_pct))

    ms.meshing_remove_duplicate_faces() # faces defined by the same verts
    ms.meshing_remove_null_faces() # faces with area == 0

    if min_d > 0:
        # Percentage changed to PercentageValue here as well
        ms.meshing_remove_connected_component_by_diameter(mincomponentdiag=pml.PercentageValue(min_d))
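
Instead of editing each call site by hand, the rename could also be absorbed by a small compatibility shim. This is only a sketch: resolve_percentage_cls is a hypothetical helper, and the stand-in modules below exist purely to demonstrate the lookup without importing pymeshlab.

```python
from types import SimpleNamespace


def resolve_percentage_cls(pml_module):
    """Return the percentage wrapper class under whichever name the module has.

    Newer pymeshlab releases expose PercentageValue; older ones expose Percentage.
    """
    for name in ("PercentageValue", "Percentage"):
        cls = getattr(pml_module, name, None)
        if cls is not None:
            return cls
    raise AttributeError("pymeshlab exposes neither PercentageValue nor Percentage")


# Stand-in modules (not real pymeshlab) to show the lookup order.
fake_new = SimpleNamespace(PercentageValue=float)
fake_old = SimpleNamespace(Percentage=int)
print(resolve_percentage_cls(fake_new).__name__)
print(resolve_percentage_cls(fake_old).__name__)

# Intended use in meshutils.py (assumes `import pymeshlab as pml`):
#     Pct = resolve_percentage_cls(pml)
#     ms.meshing_merge_close_vertices(threshold=Pct(v_pct))
```

With a shim like this, the same meshutils.py would run on both the older and the newer pymeshlab releases mentioned in this thread.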

With that change, stage 0 now seems to run successfully:

LPIPS brdf (vgg) = 0.484467
++> Evaluate epoch 500 Finished, loss = 0.808583
==> Start Test, save results to ir_lego/results, 
 brdf to None
100% 200/200 [00:06<00:00, 33.59it/s]==> Finished Test.
100% 200/200 [00:07<00:00, 25.39it/s]
==> Saving mesh to ir_lego/mesh_stage0
100%|████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:00<00:00, 1033.67it/s]
[mark unseen trigs] 2529631 from 3940324
[INFO] mesh mask trigs: (1982692, 3) --> (1113740, 3), (3940324, 3) --> (2209372, 3)
[INFO] mesh cleaning: (1113740, 3) --> (1035367, 3), (2209372, 3) --> (2054884, 3)
[INFO] mesh decimation: (1035367, 3) --> (151206, 3), (2054884, 3) --> (300000, 3)
==> Finished saving mesh.
(mirres0) ubuntu@ubuntu-Precision-3660:~/workfile/nerf/MIRReS$ 

Thanks!

brabbitdousha commented 4 months ago

Yes, I remember running into that problem with pymeshlab. I used pymeshlab==2022.2.post4, and everything worked fine without changing the code. In any case, I have added this version to requirements.txt.