MulinYu opened this issue 1 year ago
Here are the results:
Hi @MulinYu
Thank you for your interest in the project!
We recently found a bug in the T&T preprocessing. It was fixed in https://github.com/NVlabs/neuralangelo/commit/62483febb00c9ad4b8a17884abf9191acc6cafa9. Can you check if this fixes your issue?
Hi @mli0603
I don't think the fix works, as bounding_box only affects the aabb_range in transforms.json, which is not used during training.
With the latest commit I tried to train the Meetingroom scene and got a similar result to the one above (from left to right: GT, render, normal, depth). This is the evaluation result at iteration 80k.
I followed #14 to set up the hyperparameters using grad_accum_iter, which should match the experiment in the paper:
```yaml
_parent_: projects/neuralangelo/configs/base.yaml
max_iter: 4000000
wandb_scalar_iter: 800
wandb_image_iter: 80000
validation_iter: 40000
model:
    object:
        sdf:
            mlp:
                inside_out: True  # True for Meetingroom.
            encoding:
                coarse2fine:
                    init_active_level: 8
                    step: 40000
    appear_embed:
        enabled: True
        dim: 8
data:
    type: projects.neuralangelo.data
    root: .../data/TNT/Meetingroom
    num_images: 371  # The number of training images.
    train:
        image_size: [835,1500]
        batch_size: 16
        subset:
    val:
        image_size: [300,540]
        batch_size: 16
        subset: 16
        max_viz_samples: 16
trainer:
    grad_accum_iter: 8
optim:
    sched:
        warm_up_end: 40000
        two_steps: [2400000,3200000]
```
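For reference, grad_accum_iter averages gradients over several mini-batches before each optimizer step, so the effective batch size is batch_size × grad_accum_iter. A minimal sketch of the idea in plain Python (an illustration of the concept only, not the project's trainer code):

```python
# Gradient accumulation sketch for a 1-D least-squares model y = w * x:
# averaging per-mini-batch gradients reproduces the gradient of one
# large batch (exactly so when the mini-batches are equal-sized).

def grad(w, xs, ys):
    """Gradient d/dw of mean((w*x - y)^2) over one batch."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def accumulated_grad(w, batches, num_accum):
    """Average the per-batch gradients, as grad_accum_iter effectively does."""
    return sum(grad(w, xs, ys) for xs, ys in batches) / num_accum
```

With batch_size 16 and grad_accum_iter 8, each optimizer step therefore sees gradients equivalent to a batch of 128 rays/images, at the memory cost of a batch of 16.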
Do you have any intuition on what the result should look like at this iteration, and what could be wrong with my experiment?
It seems that the bounding sphere has the wrong size in my case. Will try to fix that and update with the result. Thanks for the hint in the latest README :)
Hi @DerKleineLi
Good news! Glad you have found the problem.
However, I wonder if you used preprocess_tnt.sh to generate the bounding regions. I would expect the bounding region to work fairly well for T&T; otherwise it is a bug and I need to fix it.
Thanks for the comment @mli0603!
For Meetingroom I manually downloaded the images and the COLMAP reconstruction from the official TNT release, placed the files as described in DATA_PROCESSING.md, and ran
python projects/neuralangelo/scripts/convert_tnt_to_json.py --tnt_path .../data/TNT
For the visualization I ran projects/neuralangelo/scripts/visualize_colmap.ipynb with
colmap_path = ".../data/TNT/Meetingroom"
json_fname = f"{colmap_path}/transforms.json"
Hope it helps!
Hi @DerKleineLi
If you look at the notebook, there is a read_scale=0.25 which changes the visualization. If you set read_scale=1.0, then it will use the default bounding radius.
There is no bug in terms of pre-processing and I think your issue is somewhere else.
Hi, I can confirm the observed degraded results.
So far I have trained the Neuralangelo model on the Barn and Courthouse scenes. Training logs on Wandb can be accessed here: https://api.wandb.ai/links/willyzw/whf9tz8o
I mostly followed the default configuration, with a minor change to the batch size (1 for Courthouse and 2 for Barn), as I thought this may have some impact. Any insights would be helpful!
Hello,
I've used the updated script to produce a new JSON file for Barn and retrained with a batch size of 16. However, the outcome is still unsatisfactory after 70,000 iterations. Could you please share your JSON files for TNT? If I still get bad results using your JSON files, it would indicate that the preprocessing stage is not the source of the issue.
Best, Mulin
Thanks for this info. We are looking into this issue. We will update once we pin down the error.
@DerKleineLi @mli0603 Some updates on the Meeting Room training results at 60K steps:
I used BlenderNeuralangelo to manually regenerate the bounding_box and sphere. I have also increased ray_num from 512 to 2048 to accelerate the training.
@hugoycj Amazing result! Could you share more details about your pipeline? Did you use the official colmap reconstruction of TNT? How large is the bounding box? Have you modified other hyperparameters? Thank you!
Yep. I followed the same steps in the provided convert_tnt_to_json.py script, then calculated the bounding box using Blender. One interesting thing I found for the Meetingroom scene is that the sparse point cloud is quite noisy, which makes the auto-generated bounding box around 2–3 times larger than the real scene; I guess that may make it difficult to converge.
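One common workaround for outlier-inflated boxes (a sketch, not part of the repo) is to take per-axis percentiles of the sparse points instead of the raw min/max:

```python
def robust_aabb(points, lo_pct=2.0, hi_pct=98.0):
    """Per-axis bounding range using percentiles instead of min/max,
    so a few noisy outlier points cannot inflate the box.
    `points` is a list of (x, y, z) tuples."""
    ranges = []
    for axis in range(3):
        vals = sorted(p[axis] for p in points)
        n = len(vals)
        lo = vals[int(n * lo_pct / 100)]
        hi = vals[min(n - 1, int(n * hi_pct / 100))]
        ranges.append([lo, hi])
    return ranges
```

The percentile cutoffs would need tuning per scene; a manually drawn box in Blender, as done above, remains the more reliable option.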
Here are the updated params in transforms.json under the datasets/tanks_and_temples/Meetingroom dir. I think you could replace these params directly in your JSON file and visualize it again.
"aabb_scale": 16.0,
"aabb_range": [
[
-7.6214215713955715,
6.522051862701991
],
[
-3.6237776073426597,
4.891977370864412
],
[
-11.059166537834281,
9.311011724588386
]
],
"sphere_center": [
-0.5496848543467903,
0.6340998817608761,
-0.8740774066229475
],
"sphere_radius": 11.449168401589803
I have also tried the COLMAP scripts on my own data. The results are still far from good, maybe due to the short training time, but we can observe that some fine-detailed regions are generated.
One simple suggestion for training on custom data: try to capture video in ultrawide mode; this could make the pose estimation more accurate and may be beneficial to Neuralangelo training.
@mli0603 I am curious about the dramatic drop in PSNR between 20k and 30k iterations when training on MeetingRoom. Do you have any idea why this happens?
@mli0603 @hugoycj Thank you for your help. I think I found the issue: the transforms.json generated by convert_tnt_to_json.py points the file paths to the uncalibrated images. Instead of "file_path": "images/xxx.jpg", it should be "file_path": "dense/images/xxx.jpg".
Having this fixed, I'm getting a good result at iteration 15k with batch size 16 and grad_accum_iter=1 on 1 GPU:
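For anyone with an already-generated transforms.json, the path fix can be applied with a small script (a sketch; it assumes each entry under the frames key carries a file_path field, as in the transforms.json format):

```python
def fix_image_paths(meta, prefix="dense/"):
    """Prepend `prefix` to every frame's file_path so the dataloader reads
    the undistorted images (e.g. images/0001.jpg -> dense/images/0001.jpg).
    Skips entries that already carry the prefix, so it is safe to re-run."""
    for frame in meta["frames"]:
        if not frame["file_path"].startswith(prefix):
            frame["file_path"] = prefix + frame["file_path"]
    return meta
```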
@DerKleineLi thanks for this! It's indeed a bug on our side. We will fix it in the coming days.
Dear @DerKleineLi,
You are right: the generated JSON file makes the dataloader read the images before undistortion. I also get better results on Barn after changing the image path to 'dense/images'.
Thanks a lot. Best, Mulin
Hi @hugoycj,
May I ask if you get good results on your own data? I also tried with one sequence randomly captured with an iPhone but the results seem to be worse than what I can get from MonoSDF.
Update: we have updated the data structure of the COLMAP artifacts in the latest main branch to avoid confusion. We now store the raw images in images_raw and the undistorted images in images (the final images for Neuralangelo). The dense folder has also been removed. Please see the new data preparation guide for details on the new structure.
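A quick sanity check for the new layout might look like this (a sketch; the folder names follow the update above, and the helper is hypothetical, not part of the repo):

```python
import os

def check_layout(scene_dir):
    """Return the expected subfolders that are missing under the new
    data structure: raw images in images_raw, undistorted (training)
    images in images. An empty list means the layout looks correct."""
    expected = ["images_raw", "images"]
    return [d for d in expected
            if not os.path.isdir(os.path.join(scene_dir, d))]
```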
The Tanks and Temples preprocessing script has been updated (see https://github.com/NVlabs/neuralangelo/commit/54172deb11660fd6c64d727e600f0201b002347a) to fix the above bug and reflect the latest data preprocessing changes.
Hi all, I still need some help with the TNT dataset. Here's the result I get using the latest commit: it's much blurrier than the paper's result. If anyone has reached a better result, I would appreciate it if you shared your settings and results. Thanks in advance!
@hugoycj I wonder if you have trained Meetingroom till the end; how was the result? I also followed your pipeline, and here's what I got at iteration 60k. Here's the config:
```yaml
_parent_: projects/neuralangelo/configs/base.yaml
model:
    object:
        sdf:
            mlp:
                inside_out: True  # True for Meetingroom.
            encoding:
                coarse2fine:
                    init_active_level: 8
    appear_embed:
        enabled: True
        dim: 8
    render:
        rand_rays: 2048
data:
    type: projects.neuralangelo.data
    root: .../data/TNT/Meetingroom
    num_images: 371  # The number of training images.
    train:
        image_size: [835,1500]
        batch_size: 1
        subset:
    val:
        image_size: [300,540]
        batch_size: 1
        subset: 1
        max_viz_samples: 16
```
The result is very different from yours. Could you check whether this config matches yours? If so, the problem could lie in the data preprocessing; could you then share the complete transforms.json file? The global parameters alone are not enough, because as you change the bbox, the transform of each frame also changes.
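To illustrate why the per-frame transforms depend on the bounding region: if, as is common in such pipelines, camera centers are recentered by sphere_center and rescaled by sphere_radius, then changing the sphere changes every pose. A hypothetical sketch of that normalization (an assumption for illustration, not the project's exact code):

```python
def normalize_center(cam_center, sphere_center, sphere_radius):
    """Map a world-space camera center into the unit sphere used for
    training: subtract the scene center, divide by the scene radius."""
    return [(c - s) / sphere_radius
            for c, s in zip(cam_center, sphere_center)]
```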
Another question regarding the Blender plugin: there are limits when changing the size of the bbox. The image shows the smallest possible bbox the plugin allows. I wonder if this is the same in your case, and whether you have edited the script to allow smaller bboxes?
Dear authors,
Thanks for releasing the code. I ran the code with the given TNT config on Barn (processed following the guide), but the result is bad and far from what is shown in the paper. Can you please give me some advice or share the config that you used?
Thanks a lot. Best, Mulin