Closed: sonnefred closed this issue 1 year ago.
Sorry for the confusing layout above; my cicai_configs.py looks like this, thanks.
```python
KITTI_Recon_NSG_Car_Depth = MethodSpecification(
    config=TrainerConfig(
        method_name="nsg-kitti-car-depth-recon",
        steps_per_eval_image=STEPS_PER_EVAL_IMAGE,
        steps_per_eval_all_images=STEPS_PER_EVAL_ALL_IMAGES,
        steps_per_save=STEPS_PER_SAVE,
        max_num_iterations=MAX_NUM_ITERATIONS,
        save_only_latest_checkpoint=False,
        mixed_precision=False,
        use_grad_scaler=True,
        log_gradients=True,
        pipeline=NSGPipelineConfig(
            datamanager=NSGkittiDataManagerConfig(
                dataparser=NSGkittiDataParserConfig(
                    use_car_latents=False,
                    use_depth=True,
                    split_setting="reconstruction",
                ),
                train_num_rays_per_batch=4096,
                eval_num_rays_per_batch=4096,
                camera_optimizer=CameraOptimizerConfig(mode="off"),
            ),
            model=SceneGraphModelConfig(
                background_model=NerfactoModelConfig(),
                object_model_template=NerfactoModelConfig(),
                object_representation="class-wise",
                object_ray_sample_strategy="remove-bg",
            ),
        ),
        optimizers={
            "background_model": {
                "optimizer": RAdamOptimizerConfig(lr=1e-3, eps=1e-15),
                "scheduler": ExponentialDecaySchedulerConfig(lr_final=1e-5, max_steps=200000),
            },
            "learnable_global": {
                "optimizer": RAdamOptimizerConfig(lr=1e-3, eps=1e-15),
                "scheduler": ExponentialDecaySchedulerConfig(lr_final=1e-5, max_steps=200000),
            },
            "object_model": {
                "optimizer": RAdamOptimizerConfig(lr=5e-3, eps=1e-15),
                "scheduler": ExponentialDecaySchedulerConfig(lr_final=1e-5, max_steps=200000),
            },
        },
        # viewer=ViewerConfig(num_rays_per_chunk=1 << 15),
        vis="wandb",
    ),
    description="Neural Scene Graph implementation with vanilla-NeRF model for background and object models.",
)
```
If you want to use monocular depth estimation for KITTI, please add `mono_depth_loss_mult` in the `SceneGraphModelConfig`. You can also modify the parameters yourself.
```python
KITTI_Recon_NSG_Car_Depth = MethodSpecification(
    config=TrainerConfig(
        method_name="nsg-kitti-car-depth-recon",
        steps_per_eval_image=STEPS_PER_EVAL_IMAGE,
        steps_per_eval_all_images=STEPS_PER_EVAL_ALL_IMAGES,
        steps_per_save=STEPS_PER_SAVE,
        max_num_iterations=MAX_NUM_ITERATIONS,
        save_only_latest_checkpoint=False,
        mixed_precision=False,
        use_grad_scaler=True,
        log_gradients=True,
        pipeline=NSGPipelineConfig(
            datamanager=NSGkittiDataManagerConfig(
                dataparser=NSGkittiDataParserConfig(
                    scale_factor=0.01,
                    use_car_latents=False,
                    use_depth=True,
                    split_setting="reconstruction",
                ),
                train_num_rays_per_batch=4096,
                eval_num_rays_per_batch=4096,
                camera_optimizer=CameraOptimizerConfig(mode="off"),
            ),
            model=SceneGraphModelConfig(
                mono_depth_loss_mult=0.05,
                depth_loss_mult=0,
                background_model=NerfactoModelConfig(),
                object_model_template=NerfactoModelConfig(),
                object_representation="class-wise",
                object_ray_sample_strategy="remove-bg",
            ),
        ),
        optimizers={
            "background_model": {
                "optimizer": RAdamOptimizerConfig(lr=1e-3, eps=1e-15),
                "scheduler": ExponentialDecaySchedulerConfig(lr_final=1e-5, max_steps=200000),
            },
            "learnable_global": {
                "optimizer": RAdamOptimizerConfig(lr=1e-3, eps=1e-15),
                "scheduler": ExponentialDecaySchedulerConfig(lr_final=1e-5, max_steps=200000),
            },
            "object_model": {
                "optimizer": RAdamOptimizerConfig(lr=5e-3, eps=1e-15),
                "scheduler": ExponentialDecaySchedulerConfig(lr_final=1e-5, max_steps=200000),
            },
        },
        # viewer=ViewerConfig(num_rays_per_chunk=1 << 15),
        vis="wandb",
    ),
    description="Neural Scene Graph implementation with vanilla-NeRF model for background and object models.",
)
```
Thanks for your quick reply, and I also have a question: when I use cicai_render.py to render images or videos, what should I modify if I only want to render the background (remove the objects)? Thanks!
Hi! You can refer to #33; it may be helpful to you.
Ok. By the way, I generated depth maps using a monocular depth estimation model and put the visualization images, which are 3-channel, into the completion_02 folder. Is any processing required before putting them in the folder? Thanks.
Below is an example image that we generated with a monocular depth estimation model.
So your depth map is one-channel? Did you convert the 3-channel depth map generated by the model to one channel?
But the depth map I generated is a color map, not black and white. Should I convert it to grayscale?
Hi! You can refer to the code below, which reads our depth from the image. And you need to convert your image to grayscale.
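The snippet referenced here was shared as a screenshot and is not reproduced in the thread. As a minimal, hedged sketch of the grayscale conversion being suggested (the function name and the 0-255 value range are assumptions, not the repo's API):

```python
import numpy as np
from PIL import Image

def depth_vis_to_single_channel(path: str) -> np.ndarray:
    """Collapse a 3-channel depth visualization into a single-channel
    float32 array. Values stay in the image's original [0, 255] range;
    rescaling to metric depth is a separate, dataset-specific step."""
    img = Image.open(path).convert("L")  # RGB -> grayscale, one channel
    return np.asarray(img, dtype=np.float32)
```

If the three channels of your visualization are identical (black-and-white image stored as RGB), this conversion is lossless; for a false-color colormap you would first need to invert the colormap, which this sketch does not do.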
I trained KITTI 0006 without depth, and the objects in the scene are just shadows. Does it mean that without depth the result will be bad?
ok, thanks a lot, i will have a look.
You can try our proposed category-level car model. That will help to decouple the object and the background.
Does it mean that without depth the result will be bad?
Sure.
Thank you for your reply. Is the model below the category-level car model?
```python
model=SceneGraphModelConfig(
    background_model=NerfactoModelConfig(),
    object_model_template=CarNeRFModelConfig(_target=CarNeRF),
    object_representation="class-wise",
    object_ray_sample_strategy="remove-bg",
),
```
> Below is an example image that we generated with a monocular depth estimation model.
I followed omnidata to generate the depth, and I noticed the channel is set to 1, but the result I got is still 3-channel. What's the problem?
> Thank you for your reply. Is the below model the category-level car model?
Sure.
@zwlvd Hi, thanks for your reply. You can make the change shown in the image below and try again.
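The image referenced in this reply is not reproduced here. As a hedged sketch of the usual fix for 3-channel depth output (saving the prediction as a true single-channel file rather than a colormapped visualization; the function name and 16-bit scaling are assumptions, not omnidata's or this repo's API):

```python
import numpy as np
from PIL import Image

def save_depth_png(depth: np.ndarray, path: str) -> None:
    """Save a depth prediction as a single-channel 16-bit grayscale PNG.
    16 bits preserve fine depth differences that 8-bit quantization loses."""
    d = depth.astype(np.float32)
    # normalize to [0, 65535]; guard against a constant-depth map
    d = (d - d.min()) / max(float(d.max() - d.min()), 1e-8) * 65535.0
    Image.fromarray(d.astype(np.uint16), mode="I;16").save(path)
```

Note that this normalization discards absolute scale; if your pipeline needs metric depth, store the scale factor separately.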
For more information about KITTI depth maps, you all can refer to #18.
Thank you for your valuable suggestions; they're very useful.
Hi, I trained the model using monocular depth, but the depth loss is like this, which didn't decrease stably, and the eval depth image is like the following. Could you please point out what the problem may be? Thanks.
Hi! What multipliers do you apply to the mono_depth_loss and the general depth_loss?
I trained the model using this config.
```python
KITTI_Recon_NSG_Car_Depth = MethodSpecification(
    config=TrainerConfig(
        method_name="nsg-kitti-car-depth-recon",
        steps_per_eval_image=STEPS_PER_EVAL_IMAGE,
        steps_per_eval_all_images=STEPS_PER_EVAL_ALL_IMAGES,
        steps_per_save=STEPS_PER_SAVE,
        max_num_iterations=MAX_NUM_ITERATIONS,
        save_only_latest_checkpoint=False,
        mixed_precision=False,
        use_grad_scaler=True,
        log_gradients=True,
        pipeline=NSGPipelineConfig(
            datamanager=NSGkittiDataManagerConfig(
                dataparser=NSGkittiDataParserConfig(
                    scale_factor=0.01,
                    use_car_latents=False,
                    use_depth=True,
                    split_setting="reconstruction",
                ),
                train_num_rays_per_batch=4096,
                eval_num_rays_per_batch=4096,
                camera_optimizer=CameraOptimizerConfig(mode="off"),
            ),
            model=SceneGraphModelConfig(
                mono_depth_loss_mult=0.05,
                depth_loss_mult=0,
                background_model=NerfactoModelConfig(),
                object_model_template=NerfactoModelConfig(),
                object_representation="class-wise",
                object_ray_sample_strategy="remove-bg",
            ),
        ),
        optimizers={
            "background_model": {
                "optimizer": RAdamOptimizerConfig(lr=1e-3, eps=1e-15),
                "scheduler": ExponentialDecaySchedulerConfig(lr_final=1e-5, max_steps=200000),
            },
            "learnable_global": {
                "optimizer": RAdamOptimizerConfig(lr=1e-3, eps=1e-15),
                "scheduler": ExponentialDecaySchedulerConfig(lr_final=1e-5, max_steps=200000),
            },
            "object_model": {
                "optimizer": RAdamOptimizerConfig(lr=5e-3, eps=1e-15),
                "scheduler": ExponentialDecaySchedulerConfig(lr_final=1e-5, max_steps=200000),
            },
        },
        # viewer=ViewerConfig(num_rays_per_chunk=1 << 15),
        vis="wandb",
    ),
    description="Neural Scene Graph implementation with vanilla-NeRF model for background and object models.",
)
```
Hi, do you have any suggestions about this problem? I'm a bit confused. When I used monocular depth loss, the training result was even worse than without depth supervision ... Thanks in advance.
Hi, we think there's a visualization problem with the depth colormap. Could you please check the values of the predicted depths?
Hi, I checked the values of the predicted depths. I generated the depth map following this code, and the generated map is a 3-channel black-and-white picture with pixel values between 0 and 255. Is that right?
Hi, I saw this part in the code, does this mean the depth image should be a one-channel image? And what range of the pixel values should be? Thanks.
Whatever format the depth maps you load are in, they will be transformed into a single-channel float tensor.
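As a rough illustration of that coercion (NumPy stands in for the tensor type here; this is not the repo's actual loader, and the function name is hypothetical):

```python
import numpy as np

def to_single_channel_float(depth) -> np.ndarray:
    """Coerce a loaded depth map, grayscale or 3-channel, to (H, W, 1) float32.
    For a grayscale map stored as RGB the three channels are identical,
    so keeping only the first channel loses nothing."""
    arr = np.asarray(depth, dtype=np.float32)
    if arr.ndim == 3:
        arr = arr[..., 0]  # drop the redundant channels
    return arr[..., None]  # append the trailing channel axis
```

The repo itself returns a torch tensor; `torch.from_numpy` on the result gives the equivalent single-channel float tensor.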
Ok, thanks a lot
@sonnefred Hi, I loaded the depth map as a single-channel float tensor, but I still have the same problem, the mono depth loss won't go down, do you have any solution for this?
> Below is an example image that we generated with a monocular depth estimation model.

> I followed omnidata to generate the depth, and I noticed the channel is set to 1, but the result I got is still 3-channel. What's the problem?
Same case here. My depth maps are 3-channel as well. Do I need to change the code anywhere, or can I start training with 3-channel depth maps?
Hi, I'd like to train a model from scratch using depth supervision generated from a monocular depth estimation model, and my cicai_config.py looks like this. Is it right? Thanks!