henry123-boy / Level-S2FM_official

This is the official code for "Level-S2fM: Structure from Motion on Neural Level Set of Implicit Surfaces", accepted at CVPR 2023.
https://henry123-boy.github.io/level-s2fm/

custom dataset & nan error #17

Closed hdzmtsssw closed 9 months ago

hdzmtsssw commented 1 year ago

Hi, thanks for the great work! I tried to use your method to get better camera poses on my own dataset, but failed. It raised a NaN error in sphere_tracing. After investigation, I found that embed_fn and SDF_MLP output NaN.

My dataset only contains 30 images and can be reconstructed well with NeRF, so I assume the camera poses from colmap are generally correct. I followed preparation/README.md to create my dataset, and saved intrinsics.txt and pose (is it necessary, or is an identity matrix enough?) from the colmap output.

I wonder what might be wrong and how I should create my own dataset. My other question is: what is the meaning of output/mesh/cam000xxx, and how can I output the camera poses, or convert the optimized camera poses to colmap style?

henry123-boy commented 1 year ago

Hi, thanks sincerely for following our work! For the problems you met, here are some potential solutions.

  1. The NaN output from embed_fn and SDF_MLP is likely caused by the mixed precision of the hash table used in our work. We met a similar problem before and solved it by fixing the precision of the hash table to fp32: https://github.com/henry123-boy/Level-S2FM_official/blob/main/models/base.py#L17 You can try adding dtype=torch.float32 in this line (see the first sketch after this list).

  2. For custom data preprocessing: for convenience of implementation, and to stay focused on our core contribution, we just use colmap to extract the keypoints and compute the matches, which means poses are not necessary. However, our code outputs a pose evaluation after each registration (where GT poses are needed), so you need to comment that out if no GT poses are provided.

  3. For the camera pose output: we organize the optimized camera poses as follows: https://github.com/henry123-boy/Level-S2FM_official/blob/main/pipelines/base.py#L161 These are for visualizing the cameras in Open3D. To convert camera poses into colmap format, you can refer to https://colmap.github.io/faq.html And please note that the poses in output/mesh/cam000xxx are organized as W2C while colmap's are C2W; the conversion between the two is just a rigid-transform inversion (see the second sketch after this list).
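For reference, forcing fp32 looks roughly like the sketch below, assuming the hash encoding in models/base.py is built with tiny-cuda-nn's torch bindings (the config values here are illustrative, not the repo's exact settings):

```python
import torch
import tinycudann as tcnn

# Illustrative HashGrid config; the repo's actual values may differ.
encoding = tcnn.Encoding(
    n_input_dims=3,
    encoding_config={
        "otype": "HashGrid",
        "n_levels": 16,
        "n_features_per_level": 2,
        "log2_hashmap_size": 19,
        "base_resolution": 16,
        "per_level_scale": 1.5,
    },
    dtype=torch.float32,  # force full-precision hash-table params and outputs
)
```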
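And a minimal sketch of the W2C/C2W conversion, assuming 4x4 homogeneous pose matrices (the inverse of a rigid transform [R | t] is [R^T | -R^T t]):

```python
import numpy as np

def invert_pose(T):
    """Invert a 4x4 rigid transform [R | t] -> [R^T | -R^T t]."""
    R, t = T[:3, :3], T[:3, 3]
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T
    T_inv[:3, 3] = -R.T @ t
    return T_inv

w2c = np.eye(4)          # e.g. a pose loaded from output/mesh/cam000xxx
c2w = invert_pose(w2c)   # the same function converts in either direction
```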

cherubicXN commented 1 year ago

Hi, please check out https://github.com/henry123-boy/Level-S2FM_official/tree/main/preparation for the customized scene.

hdzmtsssw commented 1 year ago

Thanks for the reply.

  1. Thanks for your suggestion; I will try fp32.
  2. I followed https://github.com/henry123-boy/Level-S2FM_official/tree/main/preparation, but it does not save intrinsics.txt. I added some code to save K from colmap as the intrinsics (roughly the first sketch after this list). Is that right?
  3. I noticed that you use prealign_cameras to align pose_gt to pose_est. What does that mean? I just want to export pose_est to images.bin; will it make a difference? Or should I align pose_est to pose_gt first before exporting?
  4. What is the meaning of cam_i.id in https://github.com/henry123-boy/Level-S2FM_official/blob/main/pipelines/base.py#L161? Does it correspond to the id in images.bin from colmap? If not, how can I obtain the correspondence?
  5. In https://github.com/kwea123/ngp_pl/blob/HEAD/datasets/colmap.py#L65, I think immeta read from colmap's images.bin is W2C, though I'm not very sure. And is the difference between pose_est and the colmap format just W2C vs. C2W? (Roughly what I have in mind for the export is the second sketch after this list.)
  6. Is dataset/pose/xxx.txt also W2C? Does it correspond to "W2C" in output/mesh/cam000xxx_gt.json under the same key in the json file?
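For item 2, the code I added to save K is roughly this sketch, assuming COLMAP's scripts/python/read_write_model.py helper is importable and a single shared camera:

```python
import numpy as np
from read_write_model import read_cameras_binary  # from COLMAP's scripts/python

cameras = read_cameras_binary("sparse/0/cameras.bin")
cam = next(iter(cameras.values()))  # assume one shared camera

# COLMAP packs params per model: PINHOLE = (fx, fy, cx, cy);
# SIMPLE_PINHOLE / SIMPLE_RADIAL start with (f, cx, cy).
if cam.model == "PINHOLE":
    fx, fy, cx, cy = cam.params
elif cam.model in ("SIMPLE_PINHOLE", "SIMPLE_RADIAL"):
    fx = fy = cam.params[0]
    cx, cy = cam.params[1], cam.params[2]
else:
    raise ValueError(f"unhandled camera model: {cam.model}")

K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])
np.savetxt("intrinsics.txt", K)
```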
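And for item 5, the export I have in mind is roughly this hypothetical helper, which writes one line of COLMAP's images.txt (colmap model_converter can then produce images.bin). It assumes the pose is already in whichever world-to-camera convention the export needs, and it reorders scipy's (x, y, z, w) quaternion to COLMAP's QW QX QY QZ:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def colmap_image_line(image_id, pose, camera_id, name):
    """Format one 4x4 pose as an images.txt line: ID QW QX QY QZ TX TY TZ CAM NAME."""
    R, t = pose[:3, :3], pose[:3, 3]
    x, y, z, w = Rotation.from_matrix(R).as_quat()  # scipy returns (x, y, z, w)
    fields = [image_id, w, x, y, z, *t, camera_id, name]
    return " ".join(str(f) for f in fields)
```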
BruceLz commented 1 year ago

I tried a lot of parameters and code changes on my own dataset, but to no avail. So how do I train on my own dataset?

henry123-boy commented 1 year ago

> I tried a lot of parameters and code changes on my own dataset, but to no avail. So how do I train on my own dataset?

Hi, thank you for trying our work. Could you please provide more details about your dataset and how it failed in your case? Did you run into a problem running our code, or other issues like registration failure or a NaN BA loss?

BruceLz commented 1 year ago

> > I tried a lot of parameters and code changes on my own dataset, but to no avail. So how do I train on my own dataset?
>
> Hi, thank you for trying our work. Could you please provide more details about your dataset and how it failed in your case? Did you run into a problem running our code, or other issues like registration failure or a NaN BA loss?

I want to know the relationship between the --group, --name and --dataset, --scene parameters. If I have a "person1"/"act1" dataset, how do I set them? --group=person1 --name=act1 --dataset=person1 --scene=act1? And what about the --yaml parameter?

BruceLz commented 1 year ago

Finally! I modified preparation/main.py to save intrinsics.txt and then kept the parameters in train.py, setting "--data.dataset=BlendedMVS", "--data.scene=Character"; the program is now running!

BruceLz commented 1 year ago

Sorry to bother you again. I found that I could not get a good rgb_render.jpg in my scene; adjusting bgcolor resulted in a picture that was either pure black or barely gray (RGB channels equal). Also, {:08d}.ply does not correspond to {view_ord}_pointcloud.ply in MeshLab. Are there any special requirements for the parameters opt.SDF, bound_max, bound_min, scale_init, and rad_init for datasets from different scenarios?

henry123-boy commented 9 months ago

Sorry for not being able to reply in time; I was occupied with other projects over these past months. As for your question, the problem may be caused by an inappropriate initialization of the camera-pose scale and the bound. The initialization details are discussed in the supplementary material of our paper: https://openaccess.thecvf.com/content/CVPR2023/supplemental/Xiao_Level-S2fM_Structure_From_CVPR_2023_supplemental.pdf You may try adjusting the initial scale and bound first; a minimal sketch of the idea follows.
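This sketch (not the exact recipe from the supplementary) assumes 4x4 W2C poses from colmap and recenters/rescales the world so all camera centers fall inside a chosen bound:

```python
import numpy as np

def normalize_poses(w2c_list, target_radius=1.0):
    """Recenter and rescale the world so camera centers fit in target_radius."""
    # Camera center in world coordinates: c = -R^T t for a W2C pose [R | t].
    centers = np.stack([-T[:3, :3].T @ T[:3, 3] for T in w2c_list])
    offset = centers.mean(axis=0)
    scale = target_radius / np.max(np.linalg.norm(centers - offset, axis=1))

    normalized = []
    for T in w2c_list:
        c2w = np.linalg.inv(T)
        c2w[:3, 3] = (c2w[:3, 3] - offset) * scale  # move/scale the camera center
        normalized.append(np.linalg.inv(c2w))
    return normalized, offset, scale
```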