Closed phongnhhn92 closed 4 years ago
Actually the first data I tried was a real forward-facing scene, but due to coronavirus I could only capture my messy desktop, so I didn't post it haha... It works quite well except for some flickering frames, which may be due to bad lighting in the room.
Concerning your data, the photos look good; one concern is that they may cover too wide a range for colmap to handle. I will take a look. If you are using a local PC, you can also run the colmap gui instead of imgs2poses.py to see what the reconstruction looks like.
Strange, it works perfectly using the colmap gui. Are you able to run the colmap gui? I can recover the poses correctly.
Edit: although it reconstructs, the poses don't seem to be correct. So I recommend two attempts:
Hi, it is weird that the script imgs2poses.py cannot estimate the poses but the colmap gui is able to do it. I have tested my images using the colmap gui and the sparse reconstruction works. I am now running dense reconstruction to see the differences.
Btw, how can you tell that the poses don't seem to be correct? I have a similar sparse reconstruction to yours but no idea how to evaluate this. Can you clarify?
Actually, I am curious how this NeRF works on large-scale scenes. For example, can we test it on large datasets such as ScanNet, Matterport or DTU?
In fact, I have captured a new set of images with lateral movement (not much camera rotation) and this is my result. As you can see, the printer looks good but the background does not. My initial thought is that this NeRF model doesn't work that well with far objects (like hallways). I don't know if there are any quick parameter changes that would improve training.
Another issue: if I am using the colmap gui, how can I compute the poses_bounds.npy file? I guess this file is necessary for both training and testing.
For example, 001 and 035 are rotated by almost 90 degrees, but in the reconstruction there appears to be no rotation... maybe I'm wrong, it's just my personal estimate.
Currently the constraint is on the world space; you can only have two kinds:
So complex structures like Matterport won't work, since they are more like 360° outward-facing, which doesn't satisfy the above constraint. At the limit, DTU still works (I tried one scene) since it satisfies constraint 2.
For your new data, I reckon the result is reasonable. As I mentioned above, the world space must be fully behind a certain plane, so anything behind that plane won't be correct, which explains the result on the left and right parts (it might also be due to scarce data at the edges).
To make it work on 360° outward-facing or even more complex scenes, although I think the concept still holds, it'll be a lot of work:
These are just some thoughts. Anyway, I think it's a totally new research topic, so there's no easy way to do it.
Finally, for poses_bounds.npy, you can use the colmap gui to generate the sparse reconstruction first, then call imgs2poses.py with the same argument. It will skip the colmap part and only extract the bounds.
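For reference, the poses_bounds.npy that imgs2poses.py writes follows (as far as I know; verify against your own file) the LLFF convention: one row per image, 17 floats. A minimal inspection sketch, with a dummy array standing in for np.load('poses_bounds.npy'):

```python
import numpy as np

# Assumed LLFF layout: 15 floats = flattened 3x5 pose matrix
# (3x3 rotation | translation | [H, W, focal] column),
# plus 2 floats = near/far depth bounds.
N = 4  # stand-in for the number of images
poses_bounds = np.zeros((N, 17))  # a real file: np.load('poses_bounds.npy')

poses = poses_bounds[:, :15].reshape(-1, 3, 5)  # (N, 3, 5) camera matrices
bounds = poses_bounds[:, 15:]                   # (N, 2) near/far depths
print(poses.shape, bounds.shape)
```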
Thanks for your clarification! It makes sense now.
I trained a model with 32 inward-facing images.
Any advice on what might be going wrong? I used full-resolution images for COLMAP and added the --spheric argument for training. During training, some epochs (1, 9, 11, 13, 18, 26) did not complete. Also, I let the model train until epoch 30, but checkpoints were saved only for epochs 17, 20, 22, 23 and 25. I used the epoch 25 checkpoint to render novel poses with eval.py.
I am not sure why the training fails exactly, but I guess your training images have a complicated background. This NeRF model doesn't do well with cluttered backgrounds. Why don't you try putting the pan on a white table and moving the camera closer to it? I suspect it will work this time.
Yeah started training again with a white background. Let's see now..
@3ventHoriz0n What do you mean by "didn't complete training"? By default I only save the best 5 epochs, that's why you only have 5 ckpts at the end. Every epoch should finish normally. You can change the number here: https://github.com/kwea123/nerf_pl/blob/f02913b8cec85ee1e65813064224270dfa9d60e1/train.py#L160
Oh missed that part.
Well, usually when an epoch finishes, its progress bar is replaced by the next epoch's; at any given time I only see one progress bar on the screen. But for the epochs I mentioned, the progress bar got stuck midway and a new one loaded for the next epoch, so I ended up with 7 progress bars on the screen: 6 stuck midway and the last one for the current epoch. Don't know what that means though.
Sometimes if you accidentally perturb the terminal (like pressing a key), it interrupts the progress bar, so a new one appears and the old one is left on the terminal looking stuck. It's just a visual bug; it doesn't affect training.
Can you share the sparse folder generated by colmap, the poses_bounds.npy file, and the training log files?
This is what I see from your training log, the center image is the prediction, I didn't see anything wrong. Also the poses seem correct. How did you generate that noisy image?
I ran eval.py using checkpoint epoch=28 and dataset_name llff.
Can you also tell me how you are using TensorBoard to visualize predictions?
Maybe you forgot to add --spheric_poses in evaluation? It is indeed not mentioned in the readme; I will add that.
yes I did forget to add that. I'll try again.
can you share the checkpoint?
You might need to modify these two lines to get a good visual result: https://github.com/kwea123/nerf_pl/blob/d41ae302dd3d186f2f12fb411d8874a1d004e00d/datasets/llff.py#L130-L131 They control where the virtual camera is placed. This part is currently hard-coded; I'm still finding a way to make it adapt to various scenes. For your data I find
```python
trans_t = lambda t: np.array([
    [1, 0, 0, 0],
    [0, 1, 0, -0.6*t],
    [0, 0, 1, 0.7*t],
    [0, 0, 0, 1],
])
```
is good. This is what I get after the above modification
Can you please explain what exactly is happening here? Also, I uploaded all the necessary files to my Google Drive because I wanted to run the exact-mesh notebook, but when I run the cell that searches for tight bounds, the runtime restarts.
Adding --spheric_poses generated this gif. Looks good, except the top part has been cut off and there's a strange cloud of white dust in one location.
This is the translation w.r.t. the poses' center: the second row controls the height offset and the third row controls the distance offset.
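A quick numpy check of what those two rows do to the camera position, using the -0.6/0.7 values suggested above (a toy sketch, not the repo's full pose composition):

```python
import numpy as np

# Toy check of the modified trans_t: the second row shifts the camera
# in height, the third row in distance, both scaled by t.
trans_t = lambda t: np.array([
    [1, 0, 0, 0],
    [0, 1, 0, -0.6 * t],  # height offset
    [0, 0, 1, 0.7 * t],   # distance offset
    [0, 0, 0, 1],
])

cam_pos = trans_t(1.0) @ np.array([0, 0, 0, 1.0])  # move the origin
print(cam_pos)  # [ 0.  -0.6  0.7  1. ]
```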
Yes, as I said, you currently need to tune the position manually as mentioned above, but it is only for visualization; for mesh extraction this code has no effect.
ok will try the modification now.
With the camera above, colorless mesh extraction worked perfectly. However, when I tried to extract a colored mesh, this was the result
I found tight bounds at x,y: -0.4, 0.3 and z: -1.25, -0.55. I tried sigma threshold values from 5 to 45 in increments of 5. I tried occlusion threshold values from 0.05 to 0.2 in increments of 0.05.
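The threshold sweep can be illustrated with a toy numpy stand-in (a synthetic Gaussian density in place of NeRF's predicted sigma; the real notebook would feed the occupied voxels to marching cubes):

```python
import numpy as np

# Synthetic density field on a 64^3 grid: a Gaussian blob standing in
# for NeRF's predicted sigma. Raising the sigma threshold keeps fewer
# voxels, i.e. a tighter (and eventually vanishing) mesh.
xs = np.linspace(-1, 1, 64)
x, y, z = np.meshgrid(xs, xs, xs, indexing="ij")
sigma = 50.0 * np.exp(-4.0 * (x**2 + y**2 + z**2))

counts = [(sigma > t).sum() for t in (5, 25, 45)]
print(counts)  # occupied-voxel count shrinks as the threshold grows
```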
What do you think is going wrong?
Looks like the images are rotated by 90 degrees... can you try manually rotating them by +90 (or -90) degrees and then feeding them to the program?
Alright, will do that. I have faced this before when reading iPhone-captured images with Pillow and OpenCV, smh..
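One way to bake the rotation into the pixels, assuming Pillow is available: ImageOps.exif_transpose applies the EXIF Orientation tag and strips it, so readers that ignore EXIF see the photo upright (the file names here are placeholders):

```python
from PIL import Image, ImageOps

# Create a tiny JPEG as a placeholder for an iPhone photo.
Image.new("RGB", (20, 10), "white").save("photo.jpg")

# Apply the EXIF Orientation tag to the pixel data and remove the tag,
# so COLMAP / OpenCV (which may ignore EXIF) see the image upright.
img = Image.open("photo.jpg")
upright = ImageOps.exif_transpose(img)
upright.save("photo_fixed.jpg")
print(upright.size)
```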
I fixed the EXIF data of images and ran the experiment again. The resulting colored mesh is still not satisfactory. Any advice?
1. Data folder (includes images and LLFF output files)
2. eval.py output
3. .ply file
4. Checkpoint
5. Colored mesh video
@3ventHoriz0n sorry, I mis-updated the master code. I reverted it just now; please re-pull the code and retry extract_mesh with the same parameters, and it should give good results.
Done. Here's the final result.
Hi @kwea123, thanks for your work!
it controls where the virtual camera is placed. This part is actually hard-coded currently, I'm still finding a way to let it adapt to various scenes.
I was wondering, have you found a good way to generate render poses adaptively? Currently I find it quite hard to set correct poses manually, so I'm using c2ws interpolated from the training set. It works, but the camera movement is not satisfactory (shaky, jittering speed, etc.). Do you have any suggestions?
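For what it's worth, one simple trick is to smooth only the translation track of the interpolated poses. A minimal numpy sketch, assuming the c2ws are (N, 3, 4) matrices (rotations would need proper interpolation such as quaternion slerp, omitted here):

```python
import numpy as np

# Hypothetical helper: moving-average the translation column of
# (N, 3, 4) c2w matrices to reduce jitter. `window` should be odd.
def smooth_translations(c2ws, window=5):
    c2ws = c2ws.copy()
    t = c2ws[:, :, 3]                    # (N, 3) camera centers
    kernel = np.ones(window) / window
    pad = window // 2
    t_pad = np.pad(t, ((pad, pad), (0, 0)), mode="edge")
    t_smooth = np.stack(
        [np.convolve(t_pad[:, i], kernel, mode="valid") for i in range(3)],
        axis=1)
    c2ws[:, :, 3] = t_smooth
    return c2ws
```

A constant path stays unchanged, while a noisy one is averaged over its neighbors, which should even out the jittering speed.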
Hello @kwea123! Have you read the pixelNeRF paper? I just can't understand the hard-coded render-pose generation for the DTU dataset.
Hello, I have followed your example to train NeRF on my own data. I have seen that you and others have had some success with single-object scenes (the silica model). How about real scenes (the fern or orchids datasets)?
I have captured a video of my office (link). However, I can't use colmap to estimate poses to train the NeRF model. Since you are more experienced with this project, can you give me some suggestions? It would be interesting to see if this method works on real data like this.
This is the error from the colmap: