Xharlie / pointnerf

Point-NeRF: Point-based Neural Radiance Fields

FileNotFoundError of pointcloud #7

Closed. Kathygg90 closed this issue 2 years ago.

Kathygg90 commented 2 years ago

Hi, thanks for your great work! I hit the same problem when I run "bash dev_scripts/w_n360/ship_test.sh": it fails with "FileNotFoundError: [Errno 2] No such file or directory" at line 118 of load_blender.py. My data path is pointnerf/data_src/nerf/nerf_synthetic/ship, and it contains 3 .json files and 3 folders of .png images. Meanwhile, the checkpoints folder only includes some .pth files; it doesn't seem to contain the saved point cloud.

Could you please tell me where the "point_path" is? Thank you~

Kathygg90 commented 2 years ago

I saw your earlier reply, but I still haven't solved this problem: https://github.com/Xharlie/pointnerf/issues/3#issuecomment-1051281065

Xharlie commented 2 years ago

The point cloud is stored in the checkpoint file: since each point has optimizable features, the points are treated as network parameters. If you have followed the steps in the README exactly, you should have checkpoint files in your checkpoints/nerf_synthetic/hotdog/ or ship folder that provide the MLP and the points with their point features (in .pth files), and execution should not reach line 248 of neural_points.py or line 118 of load_blender.py.

Please let me know if you changed anything in the original script or if your .pth files are not in place at all. It took me quite a lot of effort to find a new machine and re-deploy everything, but I haven't encountered any such error.
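
If you want to confirm that the points and their features really live inside the .pth file, here is a minimal sketch of how to peek at a checkpoint. The path, iteration number, and key names below are assumptions based on this thread, not the exact layout used by the code:

import torch

# Hypothetical inspection of a downloaded Point-NeRF checkpoint: the point
# positions and per-point features are stored as ordinary parameter tensors.
ckpt = torch.load("checkpoints/nerf_synthetic/ship/200000_net_ray_marching.pth", map_location="cpu")
# The checkpoint may be a raw state dict or a wrapper dict; handle both (an assumption).
state = ckpt.get("network_state_dict", ckpt) if isinstance(ckpt, dict) else ckpt
for key, value in state.items():
    if hasattr(value, "shape") and ("point" in key.lower() or "xyz" in key.lower()):
        print(key, tuple(value.shape))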

Kathygg90 commented 2 years ago

Thanks a lot for your prompt reply ~

Combining your reply with the paper, I figured out how the point cloud is generated. Thank you again.

After I re-downloaded checkpoints and dev_scripts and replaced the old versions, the error went away. Maybe when I changed gpu_ids in the scripts I accidentally changed something else? I compared the old and new .sh files line by line but couldn't find the difference. That's odd.

However, another error has occurred:

200 images computed
psnr: 30.971331
ssim: 0.942010
lpips: 0.070386
vgglpips: 0.124295
rmse: 0.028785
--------------------------------Finish Evaluation--------------------------------
end loading
end loading
-------------------------------------------------------------------
PyCUDA ERROR: The context stack was not empty upon module cleanup.
-------------------------------------------------------------------
A context was still active when the context stack was being
cleaned up. At this point in our execution, CUDA may already
have been deinitialized, so there is no way we can finish
cleanly. The program will be aborted now.
Use Context.pop() to avoid this problem.
-------------------------------------------------------------------
Aborted (core dumped)

So I only got the images results, an empty vids folder, and no points folder (in the directory ./checkpoints/nerfsynth/ship/test_200000/... ).

Looking at Visualizer.py, I think the test results should include images, points, and vids:

import os

class Visualizer:
    def __init__(self, opt):
        self.opt = opt
        # All outputs go under <checkpoints_dir>/<name>/
        self.log_dir = os.path.join(opt.checkpoints_dir, opt.name)
        self.image_dir = os.path.join(opt.checkpoints_dir, opt.name, 'images')
        self.point_dir = os.path.join(opt.checkpoints_dir, opt.name, 'points')
        self.vid_dir = os.path.join(opt.checkpoints_dir, opt.name, 'vids')
        # Note: only the vids directory is created up front here
        os.makedirs(self.vid_dir, exist_ok=True)

I tried to find out whether Context.pop() is missing in the code, but I didn't find it. My environment is built on a 2080 Ti, and the key dependencies are:

python          3.6.13
torch           1.8.1+cu102
torch-scatter   2.0.8
torchaudio      0.8.1
torchvision     0.9.1+cu102
pycuda          2021.1
scikit-image    0.17.2
scikit-learn    0.24.2
imageio         2.9.0
h5py            3.1.0

I installed the libraries according to the versions in the README. Could you please tell me the reason for this PyCUDA error?

Xharlie commented 2 years ago

Hi, the PyCUDA error is not fixable for now; the integration of PyCUDA and PyTorch is tricky, and although we pop the context, the cleanup is somehow not clean.
In test_ft.py, line 134, there is a gen_vid flag; set it to True and it will generate a video from your image results.
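
If the built-in flag still runs into the PyCUDA teardown issue, the video can also be stitched together from the saved frames afterwards. A minimal sketch, assuming the output layout mentioned earlier in this thread (the exact paths and fps are assumptions, and writing .mp4 with imageio requires the imageio-ffmpeg backend):

import glob
import os
import imageio

# Hypothetical paths based on the directory mentioned above; adjust as needed.
img_dir = "checkpoints/nerfsynth/ship/test_200000/images"
vid_dir = "checkpoints/nerfsynth/ship/test_200000/vids"
os.makedirs(vid_dir, exist_ok=True)

# Collect the rendered frames in order and write them out as one video.
frames = [imageio.imread(p) for p in sorted(glob.glob(os.path.join(img_dir, "*.png")))]
imageio.mimwrite(os.path.join(vid_dir, "render.mp4"), frames, fps=30)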

Gardlin commented 2 years ago

Hi, I'm wondering how you get "0_net_ray_marching.pth", because after I move all "*_net_ray_marching.pth" files away, it fails with a FileNotFoundError. I didn't find any instructions about this in the readme.md.

Xharlie commented 2 years ago

I'm not sure why you removed all the checkpoint files, but if you want to get 0_net_ray_marching.pth, you can start training from scratch by following "Per-scene optimize from scatch" in the README.

Gardlin commented 2 years ago

I followed your README and ran "Per-scene optimize from scatch" to get 0_net_ray_marching.pth, but it fails with the error "No such file or directory: ''". It goes to https://github.com/Xharlie/pointnerf/blob/39602fed3686a5881120546db37468480fd262ab/models/neural_points/neural_points.py#L248
I don't know which setting to change. I would be very grateful for your response.

Xharlie commented 2 years ago

Hi, you have the exact same error discussed previously in this thread. Can you re-download the checkpoints and datasets and make sure they are in place? This exact error has been resolved by following the README instructions step by step.

Gardlin commented 2 years ago

Hi, when I re-download the code repo and run from scratch, I still encounter the "No such file or directory" problem when I run scene101.sh. However, it works fine if I run from scratch on the NeRF Synthetic dataset. Is it possible that I need to change some setting for the ScanNet scene?

xxmd1132 commented 2 years ago

@Xharlie Hi, thanks for your great work! I have the exact same "No such file or directory" error. I checked test_ft.py: line 316 of test_ft.py sets opt.load_points=1. I'm very confused, based on your answer.

xxmd1132 commented 2 years ago

Should I set opt.load_points=0 when I run bash dev_scripts/w_n360/ficus_test.sh? BTW, I have checked that nrCheckpoint and nrDataRoot are both correct.

Xharlie commented 2 years ago

You should not change anything; all the scripts are runnable if you follow all the steps correctly.

xxmd1132 commented 2 years ago

@Xharlie I solved the error. In dev_scripts/w_n360/ficus_test.sh, resume_iter=200000, but in the checkpoints folder I downloaded from Google Drive the only .pth files are 0_net_ray_marching.pth and 20000_net_ray_marching.pth. Because 200000_net_ray_marching.pth does not exist, checkpoint_path becomes None in get_additional_network_params of neural_points_volumetric_model.py, which causes the error.

xxmd1132 commented 2 years ago

So just set resume_iter=20000 in dev_scripts/w_n360/ficus_test.sh. I guess the other bash files in the dev_scripts folder can be fixed in a similar way.
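
A quick way to see which iterations your local checkpoint folder actually contains, so resume_iter can be set to one that exists (a throwaway sketch; the directory is assumed from this thread):

import glob
import os

# Hypothetical check: list the checkpoint iterations available for a scene.
ckpt_dir = "checkpoints/nerfsynth/ficus"  # assumed location
pths = glob.glob(os.path.join(ckpt_dir, "*_net_ray_marching.pth"))
iters = sorted(int(os.path.basename(p).split("_")[0]) for p in pths)
print("available resume_iter values:", iters)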

xxmd1132 commented 2 years ago

Thanks again for your amazing paper and for generously releasing the source code~~~ :)

Xharlie commented 2 years ago

Hi, if you check Google Drive, there are files like 200000_net_ray_marching.pth for all nerfsynth objects. Sometimes Google Drive will split the contents of a folder into separate zips during download, so you have to manually unzip them and move the files back together. Just make sure your local content matches the Google Drive contents and there will not be any problem running any of my scripts.
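
One way to verify the merge worked is a quick sanity check like the sketch below; the scene names and the 200000 iteration are assumptions based on this thread, not a script from the repo:

import os

# Hypothetical check that each NeRF-Synthetic scene has the fine-tuned
# checkpoint the test scripts expect after merging the downloaded zips.
scenes = ["chair", "drums", "ficus", "hotdog", "lego", "materials", "mic", "ship"]
for scene in scenes:
    path = os.path.join("checkpoints", "nerfsynth", scene, "200000_net_ray_marching.pth")
    print(path, "OK" if os.path.isfile(path) else "MISSING")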

xxmd1132 commented 2 years ago

OK, I'll try what you said. Thank you~~

argosdh commented 2 years ago

I think the reason they all encountered the same problem is that you placed two identical .sh files named scene101.sh and scene101_test.sh. If you check the w_scannet_etf directory, you will find that both execute "test_ft.py"; however, based on scene241.sh and scene241_test.sh, they should run "python train_ft.py" and "python test_ft.py" respectively. https://github.com/Xharlie/pointnerf/blob/39602fed3686a5881120546db37468480fd262ab/dev_scripts/w_scannet_etf/scene101.sh#L101
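
A quick way to confirm which entry point each script launches (a throwaway sketch, assuming all four scripts live in dev_scripts/w_scannet_etf as suggested above):

from pathlib import Path

# Hypothetical check: report whether each ScanNet script runs training or testing.
for name in ("scene101.sh", "scene101_test.sh", "scene241.sh", "scene241_test.sh"):
    text = Path("dev_scripts/w_scannet_etf", name).read_text()
    if "train_ft.py" in text:
        entry = "train_ft.py"
    elif "test_ft.py" in text:
        entry = "test_ft.py"
    else:
        entry = "unknown"
    print(f"{name}: {entry}")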

LongruiDong commented 2 years ago

I also think so.

Clearly, test_ft.py cannot run per-scene optimization (training) at all.