maybeLx / MVSFormerPlusPlus

Codes of MVSFormer++: Revealing the Devil in Transformer’s Details for Multi-View Stereo (ICLR2024)
Apache License 2.0

question about training on custom dataset #13

Open fariba87 opened 5 months ago

fariba87 commented 5 months ago

I appreciate your great work, and I have some questions. I want to train the model on my custom dataset. Here are my steps:

1. I captured pictures with an iPhone 14 while moving around a stationary object.
2. I use foreground masks in COLMAP's feature extraction step to restrict the sparse point cloud to the foreground object.
3. I run a refined bundle adjustment to refine the focal length and principal point.
4. I bitwise_and the images and masks, so the inputs are also masked images (without background).
5. I use colmap2mvsnet to create the cam.txt and pair.txt files.
6. For the ground-truth depth maps, I use depth maps rendered from Agisoft Metashape (gray images in [0, 255]), then use MinMaxScaler from sklearn to scale each one to [depth_min, depth_max] per image (taken from the cam.txt files); see the sketch after this list. [I wanted to use the geometric depth map files from COLMAP instead, but their depth values vary from negative to positive over a large range, and I don't know whether I should use that depth_min and depth_max or the ones from the cam.txt file.]

Are these steps correct? I tried to train the network; the loss decreases, but not by much. And after saving a checkpoint, when I evaluate it, I don't get a good reconstructed output. Can you help me get a good result for my problem? Thank you so much
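For reference, the rescaling step I apply looks roughly like this (a minimal sketch; `depth_min`/`depth_max` are read from the cam.txt files, and I assume 0 = near and 255 = far in the rendered image, so flip it if your renderer uses the opposite convention):

```python
import cv2
import numpy as np

def rescale_rendered_depth(gray_path, depth_min, depth_max):
    """Map an 8-bit rendered depth map in [0, 255] to metric [depth_min, depth_max].

    Assumes 0 = near and 255 = far; depth_min/depth_max come from the
    cam.txt files produced by colmap2mvsnet.
    """
    gray = cv2.imread(gray_path, cv2.IMREAD_GRAYSCALE).astype(np.float32)
    return depth_min + (gray / 255.0) * (depth_max - depth_min)
```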

maybeLx commented 4 months ago

The depth range is quite important for the performance of MVS. Usually, we use colmap2mvsnet.py to calculate the depth range of each image (the resulting range is always positive). For rendered depth maps, if you want to train your own MVS model, you need to choose the depth range carefully. If you find errors in the depth map (e.g., negative values), you can clip the rendered depth image.
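For example, something like this (a rough NumPy sketch; `depth_min`/`depth_max` are assumed to be the per-view values from cam.txt, and 0 is used as the invalid-pixel marker that MVSNet-style losses commonly mask out):

```python
import numpy as np

def clip_depth(depth, depth_min, depth_max):
    """Clip a rendered depth map to the valid per-view range.

    Pixels outside [depth_min, depth_max] (e.g., negative depths) are
    set to 0, the usual 'invalid' marker masked out during training.
    """
    depth = depth.astype(np.float32)
    valid = (depth >= depth_min) & (depth <= depth_max)
    depth = np.clip(depth, depth_min, depth_max)
    depth[~valid] = 0.0
    return depth
```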

maybeLx commented 4 months ago

There is another way to determine the depth range: you can use COLMAP to perform a sparse reconstruction of the scene and visualize it in MeshLab (use some code from colmap2mvsnet.py to load the points3D.bin file and save the point cloud in .ply format). In MeshLab you can also remove any outlier points, keeping only the object. You can then project the point cloud onto each camera view to calculate the object's depth range.
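The projection step could look like this (a minimal sketch, assuming COLMAP's world-to-camera convention with per-image rotation `R` (3×3) and translation `t` (3,), and `points` as an (N, 3) array loaded from points3D.bin after outlier removal):

```python
import numpy as np

def depth_range_for_view(points, R, t, margin=0.05):
    """Depth range of the sparse points as seen from one camera.

    points: (N, 3) world-space points (e.g., from points3D.bin).
    R, t:   COLMAP world-to-camera rotation (3, 3) and translation (3,).
    margin: small relative margin so the range fully covers the object.
    """
    cam_pts = points @ R.T + t      # world -> camera coordinates
    z = cam_pts[:, 2]
    z = z[z > 0]                    # keep only points in front of the camera
    return z.min() * (1.0 - margin), z.max() * (1.0 + margin)
```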

fariba87 commented 3 months ago

Thank you for your advice. I am using your provided checkpoint to evaluate on my dataset. My object is a thin 4 cm × 0.2 cm × 7 cm piece turning on a turntable. The object's surface is not smooth and has some truncated cones on it, with depth at millimeter scale. I used colmap2mvsnet to get depth_min and depth_max. While the model reconstructs the final point cloud generally well, there is some ambiguity in the holes: the holes get filled with points where the actual depth cannot be determined on the cones. What is your advice for removing this ambiguity? Thank you in advance