GDAOSU / MCT_NERF

Apache License 2.0

preprocess dataset #1

Open illuSION-crypto opened 11 months ago

illuSION-crypto commented 11 months ago

Hello, thanks for your great work. I'm trying to train an MCT_NERF model, and I found that step1_preprocess.py requires sparse/scene_bbox.txt and sparse/ground_range.txt in the dataset directory, but COLMAP doesn't generate these two files. Could you describe the format of these two files and what information should go in them?

ninglixu commented 11 months ago

Hi, thanks for trying our method. The sample datasets will be released soon. For now: scene_bbox.txt contains the bounding box of the input 3D point cloud, specified as xmin, ymin, zmin, xmax, ymax, zmax on six lines, and ground_range.txt contains the zmin and zmax of the rough ground height range.
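Based on that description, the two files could be written like this (a minimal sketch; the numeric values are placeholders for a geo-referenced scene, not real data, and the files would go in the dataset's sparse/ directory):

```python
# Sketch: write scene_bbox.txt (xmin, ymin, zmin, xmax, ymax, zmax,
# one value per line) and ground_range.txt (ground zmin, zmax).
# All values below are placeholders -- substitute your own scene extents.

bbox = [400000.0, 5000000.0, 100.0,   # xmin, ymin, zmin
        400500.0, 5000500.0, 300.0]   # xmax, ymax, zmax
ground_range = [100.0, 120.0]          # rough ground zmin, zmax

with open("scene_bbox.txt", "w") as f:
    f.write("\n".join(f"{v:.3f}" for v in bbox) + "\n")

with open("ground_range.txt", "w") as f:
    f.write("\n".join(f"{v:.3f}" for v in ground_range) + "\n")
```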

illuSION-crypto commented 11 months ago

Thanks a lot for your reply. I created these files from the dense point cloud generated by COLMAP, but after training, the point cloud generated by MCT_NERF is very strange; it looks like this: [image] Have you ever encountered this?

ninglixu commented 11 months ago

It is hard to debug with only this information. What do the camera poses look like, and can you check whether the scene_bbox is in the same coordinate system as the camera poses? If you are working with aerial datasets, running COLMAP (without model_aligner) cannot provide geo-referenced camera poses.
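One way to run that check (a sketch, not part of the repo; it assumes a COLMAP text-format images.txt, and the helper names are mine) is to recover the camera centers C = -Rᵀt and compare their extent against scene_bbox.txt:

```python
import numpy as np

def quat_to_rot(qw, qx, qy, qz):
    """Rotation matrix from a COLMAP (w, x, y, z) quaternion."""
    return np.array([
        [1 - 2*(qy*qy + qz*qz), 2*(qx*qy - qw*qz),     2*(qx*qz + qw*qy)],
        [2*(qx*qy + qw*qz),     1 - 2*(qx*qx + qz*qz), 2*(qy*qz - qw*qx)],
        [2*(qx*qz - qw*qy),     2*(qy*qz + qw*qx),     1 - 2*(qx*qx + qy*qy)],
    ])

def camera_centers(images_txt):
    """Camera centers C = -R^T t from a COLMAP text-format images.txt.
    In that file, each image has one pose line (IMAGE_ID QW QX QY QZ
    TX TY TZ CAMERA_ID NAME) followed by one 2D-points line."""
    with open(images_txt) as f:
        lines = [l.rstrip("\n") for l in f if not l.startswith("#")]
    centers = []
    for pose_line in lines[::2]:  # every other line is a pose line
        vals = pose_line.split()
        qw, qx, qy, qz, tx, ty, tz = map(float, vals[1:8])
        R = quat_to_rot(qw, qx, qy, qz)
        centers.append(-R.T @ np.array([tx, ty, tz]))
    return np.array(centers)
```

If `camera_centers(...).min(0)` / `.max(0)` land far outside (or at a wildly different scale than) the values in scene_bbox.txt, the two are in different coordinate systems.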

apii commented 11 months ago

Hi! I'm also interested in this. Are the bounding box and ground height range somehow calculated from the colmap outputs?

ninglixu commented 11 months ago

I would say the accepted camera poses (with a sparse/dense point cloud) are in real-world coordinates (i.e. geo-referenced, z-axis up) and formatted as COLMAP output. The bounding box and ground height range are then calculated from that sparse/dense point cloud. The camera poses can be computed by any SfM software (e.g. COLMAP, OpenDroneMap, Agisoft Metashape, ContextCapture) and converted into COLMAP format. One confusing thing is that directly running COLMAP only generates up-to-scale output, which means it is not geo-referenced. If you want to transform it into world coordinates, please use model_aligner. The camera pose conversion is indeed annoying. Please let me know if you have any questions.
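For reference, a model_aligner invocation might look roughly like this (a sketch; the reference-positions file path is a placeholder, and flag names vary across COLMAP versions, so check `colmap model_aligner --help` for your build):

```shell
# Geo-reference an up-to-scale COLMAP model with model_aligner.
# ref_positions.txt: one "image_name X Y Z" line per image, giving
# camera positions in the target world coordinate system (e.g. GPS/UTM).
colmap model_aligner \
    --input_path sparse/0 \
    --output_path sparse/aligned \
    --ref_images_path ref_positions.txt \
    --robust_alignment_max_error 3.0
```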

illuSION-crypto commented 11 months ago

> Hi! I'm also interested in this. Are the bounding box and ground height range somehow calculated from the colmap outputs?

> I would say the accepted camera poses (with a sparse/dense point cloud) are in real-world coordinates (i.e. geo-referenced, z-axis up) and formatted as COLMAP output. The bounding box and ground height range are then calculated from that sparse/dense point cloud. [...]

That's right. I ran a dense reconstruction with COLMAP to get the dense point cloud, removed the wrong points, and then got the scene bbox coordinates by reading the PLY file and computing the per-axis min/max. If the camera poses are known, you can refer to COLMAP's documentation on running a reconstruction with known poses.
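The read-the-PLY-and-take-min/max step could be sketched like this (my own minimal parser, assuming an ASCII PLY whose first three vertex properties are x, y, z; COLMAP's fused output is binary by default, so a real pipeline would use a PLY library instead):

```python
import numpy as np

def bbox_from_ply(path):
    """Per-axis (min, max) of an ASCII PLY point cloud.
    Minimal parser: reads 'element vertex N' from the header, then
    takes the first three columns of each vertex line as x, y, z."""
    with open(path) as f:
        n_vertices = 0
        for line in f:
            line = line.strip()
            if line.startswith("element vertex"):
                n_vertices = int(line.split()[-1])
            elif line == "end_header":
                break
        pts = np.array([[float(v) for v in next(f).split()[:3]]
                        for _ in range(n_vertices)])
    return pts.min(axis=0), pts.max(axis=0)
```

The six values (lo followed by hi) would then be written one per line into sparse/scene_bbox.txt, as described earlier in the thread.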

apii commented 11 months ago

Ok, do you mean that in order to calculate the scene_bbox I would have to generate a mesh (= PLY file) from the dense point cloud? Is there a way to calculate this from an aligned sparse model? I tried taking the min and max values for each axis from points3D.txt (tried aligned and non-aligned), but something apparently goes wrong, since the partitioner seems to put all the images in my dataset into every block.
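For the points3D.txt variant, a sketch (assuming COLMAP's documented text format, where each line is POINT3D_ID X Y Z R G B ERROR followed by the track):

```python
import numpy as np

def bbox_from_points3d(path):
    """Per-axis (min, max) from a COLMAP text-format points3D.txt.
    Columns: POINT3D_ID X Y Z R G B ERROR TRACK..."""
    xyz = []
    with open(path) as f:
        for line in f:
            if line.startswith("#") or not line.strip():
                continue
            vals = line.split()
            xyz.append([float(v) for v in vals[1:4]])
    pts = np.array(xyz)
    return pts.min(axis=0), pts.max(axis=0)
```

One caveat worth checking: sparse clouds often contain a few far-away outlier triangulations, and a raw min/max over them yields a hugely inflated bbox, which could plausibly make a partitioner assign every image to every block. Taking e.g. `np.percentile(pts, [1, 99], axis=0)` instead of the raw extremes is one way to make the estimate robust.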