Closed szhang963 closed 9 months ago
To my understanding, the module is disabled by default in their work.
Thank you. I have another question. Have you run into a blurry background on a private dataset, where the dynamic objects and the background cannot be separated? I am not using depth or semantic masks.
I ran into a similar problem, caused by wrong bbox positions (wrong coordinates). Maybe you can check whether your bbox positions are correct.
@Nplace-su Thank you for your reply. Can you tell me where to check the bbox pose, and what the process for checking it is? Should I visualize visible_objects_ls?
From the training results, the bbox pose seems right, but the objects_rgb output looks very strange. Could you give me some help? I would be very grateful.
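One way to check box poses independently of training is to project the box corners into the image and overlay them. The sketch below assumes the standard KITTI label convention (location is the center of the bottom face, camera y-axis points down, `rotation_y` is a rotation around the camera y-axis). The function names are hypothetical and are not part of MARS.

```python
import numpy as np

def kitti_box_corners(h, w, l, x, y, z, rotation_y):
    """Return the 8 box corners as a (3, 8) array in rectified camera coordinates.

    Assumes the KITTI label convention: (x, y, z) is the center of the
    bottom face, y points down, and rotation_y is a rotation around y.
    """
    x_c = np.array([l, l, -l, -l, l, l, -l, -l]) / 2.0
    y_c = np.array([0.0, 0.0, 0.0, 0.0, -h, -h, -h, -h])  # top face is at -h (y is down)
    z_c = np.array([w, -w, -w, w, w, -w, -w, w]) / 2.0
    c, s = np.cos(rotation_y), np.sin(rotation_y)
    rot_y = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    corners = rot_y @ np.vstack([x_c, y_c, z_c])
    return corners + np.array([[x], [y], [z]])

def project(P, pts_3d):
    """Project (3, N) camera-frame points to (2, N) pixels with a 3x4 matrix P."""
    pts_h = np.vstack([pts_3d, np.ones((1, pts_3d.shape[1]))])
    uvw = P @ pts_h
    return uvw[:2] / uvw[2]
```

Drawing the returned pixel coordinates on the corresponding frame shows whether a box sits on its object. A correct 2D overlay only validates the labels in the camera frame, so the world-frame camera and object poses still need a separate check.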
@szhang963 It does not look 100% right? There are many potential issues when you use your own data, so I suggest you make sure your coordinate and other data conventions align 100% with their original dataparsers. BTW, could you share your training config?
Thank you for your patient reply. This is my config.yaml:
!!python/object:nerfstudio.engine.trainer.TrainerConfig
_target: !!python/name:nerfstudio.engine.trainer.Trainer ''
data: &id003 !!python/object/apply:pathlib.PosixPath
- data
- my_kitti
- training
- image_02
- '0001'
experiment_name: KITTI_my_Recon_Mars_focal_cxy_boxscal_camdebug_noscale-001
gradient_accumulation_steps: 1
load_checkpoint: null
load_config: null
load_dir: null
load_scheduler: true
load_step: null
log_gradients: true
logging: !!python/object:nerfstudio.configs.base_config.LoggingConfig
local_writer: !!python/object:nerfstudio.configs.base_config.LocalWriterConfig
_target: !!python/name:nerfstudio.utils.writer.LocalWriter ''
enable: true
max_log_size: 10
stats_to_track: !!python/tuple
- !!python/object/apply:nerfstudio.utils.writer.EventName
- Train Iter (time)
- !!python/object/apply:nerfstudio.utils.writer.EventName
- Train Rays / Sec
- !!python/object/apply:nerfstudio.utils.writer.EventName
- Test PSNR
- !!python/object/apply:nerfstudio.utils.writer.EventName
- Vis Rays / Sec
- !!python/object/apply:nerfstudio.utils.writer.EventName
- Test Rays / Sec
- !!python/object/apply:nerfstudio.utils.writer.EventName
- ETA (time)
max_buffer_size: 20
profiler: basic
relative_log_dir: !!python/object/apply:pathlib.PosixPath []
steps_per_log: 10
machine: !!python/object:nerfstudio.configs.base_config.MachineConfig
device_type: cuda
dist_url: auto
machine_rank: 0
num_devices: 1
num_machines: 1
seed: 42
max_num_iterations: 100000
method_name: KITTI_my_Recon_Mars_focal_cxy_boxscal_camdebug_noscale
mixed_precision: false
optimizers:
background_model:
optimizer: !!python/object:nerfstudio.engine.optimizers.RAdamOptimizerConfig
_target: &id001 !!python/name:torch.optim.radam.RAdam ''
eps: 1.0e-15
lr: 0.001
max_norm: null
weight_decay: 0
scheduler: !!python/object:nerfstudio.engine.schedulers.ExponentialDecaySchedulerConfig
_target: &id002 !!python/name:nerfstudio.engine.schedulers.ExponentialDecayScheduler ''
lr_final: 1.0e-05
lr_pre_warmup: 1.0e-08
max_steps: 200000
ramp: cosine
warmup_steps: 0
learnable_global:
optimizer: !!python/object:nerfstudio.engine.optimizers.RAdamOptimizerConfig
_target: *id001
eps: 1.0e-15
lr: 0.001
max_norm: null
weight_decay: 0
scheduler: !!python/object:nerfstudio.engine.schedulers.ExponentialDecaySchedulerConfig
_target: *id002
lr_final: 1.0e-05
lr_pre_warmup: 1.0e-08
max_steps: 200000
ramp: cosine
warmup_steps: 0
object_model:
optimizer: !!python/object:nerfstudio.engine.optimizers.RAdamOptimizerConfig
_target: *id001
eps: 1.0e-15
lr: 0.005
max_norm: null
weight_decay: 0
scheduler: !!python/object:nerfstudio.engine.schedulers.ExponentialDecaySchedulerConfig
_target: *id002
lr_final: 1.0e-05
lr_pre_warmup: 1.0e-08
max_steps: 200000
ramp: cosine
warmup_steps: 0
output_dir: !!python/object/apply:pathlib.PosixPath
- work_dirs
pipeline: !!python/object:mars.mars_pipeline.MarsPipelineConfig
_target: !!python/name:mars.mars_pipeline.MarsPipeline ''
datamanager: !!python/object:mars.data.mars_datamanager.MarsDataManagerConfig
_target: !!python/name:mars.data.mars_datamanager.MarsDataManager ''
camera_optimizer: null
camera_res_scale_factor: 1.0
collate_fn: !!python/name:nerfstudio.data.utils.nerfstudio_collate.nerfstudio_collate ''
data: *id003
dataparser: !!python/object:mars.data.mars_kitti_dataparser_phi.MarsKittiDataParserConfig
_target: !!python/name:mars.data.mars_kitti_dataparser_phi.MarsKittiParser ''
add_input_rows: -1
alpha_color: white
bckg_only: false
box_scale: 1.0
car_nerf_state_dict_path: !!python/object/apply:pathlib.PosixPath
- pretrain
- car_nerf
- car_nerf.ckpt
car_object_latents_path: !!python/object/apply:pathlib.PosixPath
- pretrain
- car_nerf
- latent_codes.pt
chunk: 32768
data: !!python/object/apply:pathlib.PosixPath
- data
- kitti
- training
- image_02
- '0006'
dataset_type: kitti
far_plane: 150.0
first_frame: 0
last_frame: 50
max_input_objects: -1
near_plane: 0.5
netchunk: 65536
novel_view: left
obj_only: false
obj_opaque: true
object_setting: 0
render_only: false
scale_factor: 1
scene_scale: 1.0
semantic_mask_classes: []
semantic_path: !!python/object/apply:pathlib.PosixPath []
split_setting: reconstruction
use_car_latents: false
use_depth: false
use_obj: true
use_object_properties: true
use_semantic: false
eval_image_indices: !!python/tuple
- 0
eval_num_images_to_sample_from: -1
eval_num_rays_per_batch: 8192
eval_num_times_to_repeat_images: -1
images_on_gpu: false
masks_on_gpu: false
patch_size: 1
pixel_sampler: !!python/object:nerfstudio.data.pixel_samplers.PixelSamplerConfig
_target: !!python/name:nerfstudio.data.pixel_samplers.PixelSampler ''
is_equirectangular: false
keep_full_image: false
num_rays_per_batch: 4096
train_num_images_to_sample_from: -1
train_num_rays_per_batch: 8192
train_num_times_to_repeat_images: -1
model: !!python/object:mars.models.scene_graph.SceneGraphModelConfig
_target: !!python/name:mars.models.scene_graph.SceneGraphModel ''
background_color: black
background_model: !!python/object:mars.models.nerfacto.NerfactoModelConfig
_target: &id004 !!python/name:mars.models.nerfacto.NerfactoModel ''
appearance_embed_dim: 32
background_color: black
base_res: 16
collider_params:
far_plane: 6.0
near_plane: 2.0
disable_scene_contraction: false
distortion_loss_mult: 0.002
enable_collider: true
eval_num_rays_per_chunk: 4096
far_plane: 150.0
features_per_level: 2
hidden_dim: 64
hidden_dim_color: 64
hidden_dim_transient: 64
implementation: tcnn
interlevel_loss_mult: 1.0
log2_hashmap_size: 19
loss_coefficients:
rgb_loss_coarse: 1.0
rgb_loss_fine: 1.0
max_res: 2048
near_plane: 0.05
num_levels: 16
num_nerf_samples_per_ray: 97
num_proposal_iterations: 2
num_proposal_samples_per_ray: &id005 !!python/tuple
- 256
- 128
obj_feat_dim: 0
orientation_loss_mult: 0.0001
pred_normal_loss_mult: 0.001
predict_normals: false
prompt: null
proposal_initial_sampler: piecewise
proposal_net_args_list:
- hidden_dim: 16
log2_hashmap_size: 17
max_res: 128
num_levels: 5
use_linear: false
- hidden_dim: 16
log2_hashmap_size: 17
max_res: 256
num_levels: 5
use_linear: false
proposal_update_every: 5
proposal_warmup: 5000
proposal_weights_anneal_max_num_iters: 1000
proposal_weights_anneal_slope: 10.0
use_average_appearance_embedding: true
use_gradient_scaling: false
use_proposal_weight_anneal: true
use_same_proposal_network: false
use_single_jitter: true
collider_params:
far_plane: 6.0
near_plane: 2.0
debug_object_pose: false
depth_loss_mult: 0.01
depth_loss_type: !!python/object/apply:nerfstudio.model_components.losses.DepthLossType
- 1
depth_sigma: 0.05
enable_collider: true
eval_num_rays_per_chunk: 4096
far_plane: 1000.0
interlevel_loss_mult: 1.0
is_euclidean_depth: false
latent_size: 256
loss_coefficients:
rgb_loss_coarse: 1.0
rgb_loss_fine: 1.0
max_num_obj: -1
mono_depth_loss_mult: 0.0
near_plane: 0.05
object_model_template: !!python/object:mars.models.nerfacto.NerfactoModelConfig
_target: *id004
appearance_embed_dim: 32
background_color: black
base_res: 16
collider_params:
far_plane: 6.0
near_plane: 2.0
disable_scene_contraction: false
distortion_loss_mult: 0.002
enable_collider: true
eval_num_rays_per_chunk: 4096
far_plane: 150.0
features_per_level: 2
hidden_dim: 64
hidden_dim_color: 64
hidden_dim_transient: 64
implementation: tcnn
interlevel_loss_mult: 1.0
log2_hashmap_size: 19
loss_coefficients:
rgb_loss_coarse: 1.0
rgb_loss_fine: 1.0
max_res: 2048
near_plane: 0.05
num_levels: 16
num_nerf_samples_per_ray: 97
num_proposal_iterations: 2
num_proposal_samples_per_ray: *id005
obj_feat_dim: 0
orientation_loss_mult: 0.0001
pred_normal_loss_mult: 0.001
predict_normals: false
prompt: null
proposal_initial_sampler: piecewise
proposal_net_args_list:
- hidden_dim: 16
log2_hashmap_size: 17
max_res: 128
num_levels: 5
use_linear: false
- hidden_dim: 16
log2_hashmap_size: 17
max_res: 256
num_levels: 5
use_linear: false
proposal_update_every: 5
proposal_warmup: 5000
proposal_weights_anneal_max_num_iters: 1000
proposal_weights_anneal_slope: 10.0
use_average_appearance_embedding: true
use_gradient_scaling: false
use_proposal_weight_anneal: true
use_same_proposal_network: false
use_single_jitter: true
object_ray_sample_strategy: remove-bg
object_representation: class-wise
object_warmup_steps: 1000
orientation_loss_mult: 0.0001
pred_normal_loss_mult: 0.001
predict_normals: false
prompt: null
ray_add_input_rows: -1
semantic_loss_mult: 1.0
should_decay_sigma: false
sigma_decay_rate: 0.9998
sky_model: !!python/object:mars.models.sky_model.SkyModelConfig
_target: !!python/name:mars.models.sky_model.SkyModel ''
collider_params:
far_plane: 6.0
near_plane: 2.0
enable_collider: true
eval_num_rays_per_chunk: 4096
hidden_dim: 128
loss_coefficients:
rgb_loss_coarse: 1.0
rgb_loss_fine: 1.0
num_layers: 5
prompt: null
starting_depth_sigma: 4.0
use_interlevel_loss: true
use_sky_model: false
project_name: nerfstudio-project
prompt: null
relative_model_dir: !!python/object/apply:pathlib.PosixPath
- nerfstudio_models
save_only_latest_checkpoint: false
steps_per_eval_all_images: 5000
steps_per_eval_batch: 500
steps_per_eval_image: 500
steps_per_save: 10000
timestamp: 2023-12-24_200822
use_grad_scaler: true
viewer: !!python/object:nerfstudio.configs.base_config.ViewerConfig
camera_frustum_scale: 0.1
default_composite_depth: true
image_format: jpeg
jpeg_quality: 90
make_share_url: false
max_num_display_images: 512
num_rays_per_chunk: 32768
quit_on_train_completion: false
relative_log_filename: viewer_log_filename.txt
websocket_host: 0.0.0.0
websocket_port: null
websocket_port_default: 7007
vis: wandb
I converted my data to KITTI format, and the 3D bboxes project onto the image correctly, with no shift. So I am confused about the misalignment in the training result. Thank you for your help.
Here's a tool for visualizing your camera and object poses that may be helpful: https://github.com/wuzirui/mars_pose_visualizer/ You want to make sure your camera and object coordinate axes are the same as KITTI/VKITTI's.
p.s. if you have multiple issues (next time), please raise them separately, so that others can refer :)
@wuzirui Thanks a lot. I am checking the camera and object coordinate axes with the visualizer. Sorry for asking another question in the same issue; I will be more careful next time.
Hi all. I visualized the camera and object poses in the same coordinate system. There seems to be a small error along the Y axis?
I found that camrect2cam_i is obtained by multiplying the inverse intrinsics with the translation column of camrect2img (i.e. P), with the rotation set to the identity matrix. In my dataset, however, P is computed as `np.matmul(np.matmul(intrinsic, ego2cam[:3, :]), np.linalg.inv(mycoord2kitticam))`, and I then use the same operation as below to get camrect2cam_i. Is that correct for my data format?
# Get camera poses, camera id: 02, 03
for cam_i in range(2):
    transformation = np.eye(4)
    projection = tracking_calibration["P" + str(cam_i + 2)]  # rectified camera coordinate system -> image
    K_inv = np.linalg.inv(projection[:3, :3])
    R_t = projection[:3, 3]
    t_crect2c = np.matmul(K_inv, R_t)
    # t_crect2c = 1./projection[[0, 1, 2],[0, 1, 2]] * projection[:, 3]
    transformation[:3, 3] = R_t
    tracking_calibration["Tr_camrect2cam0" + str(cam_i + 2)] = transformation
Thank you for your reply.
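The identity-rotation assumption in the snippet above holds only when P has the form K [I | t]. If P is built as K [R | t] with a non-identity R, then `inv(P[:3, :3]) @ P[:3, 3]` yields R^T t rather than t. A minimal sketch for checking this, assuming the intrinsics K are known separately (function names are hypothetical):

```python
import numpy as np

def rect_to_cam_transform(P, K):
    """Split P = K @ [R | t] into a 4x4 rigid transform, given known intrinsics K."""
    Rt = np.linalg.inv(K) @ P  # (3, 4) block [R | t]
    T = np.eye(4)
    T[:3, :4] = Rt
    return T

def has_identity_rotation(P, K, atol=1e-6):
    """True if the left 3x3 block of P is K alone, i.e. P = K @ [I | t]."""
    R = (np.linalg.inv(K) @ P)[:, :3]
    return np.allclose(R, np.eye(3), atol=atol)
```

If `has_identity_rotation` returns False for a custom dataset, a translation-only transform drops the rotation contained in P.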
@szhang963 Does the bbox pos in your data refer to the center of the 3D bbox, or to the center of its bottom face?
@Nplace-su It is the center of the 3D bbox, but I have converted it to the KITTI convention with:
pos = np.matmul(mycoord2kitticam[:3,:3], pos)
pos[1] += h/2
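The two lines above rotate the center and then shift it by half the height along +y, which matches KITTI's bottom-face convention when the camera y-axis points down and the box stands upright. A sketch of the same conversion that also applies the translation of the transform, in case the two frames do not share an origin (`src_to_kitticam` is a hypothetical 4x4 rigid transform, not a name from MARS):

```python
import numpy as np

def center_to_kitti_location(center, h, src_to_kitticam):
    """Map a box center from a source frame to a KITTI-style label location.

    Applies the full 4x4 transform (rotation and translation), then moves
    the point from the box center to the bottom face along +y (y is down).
    Assumes the box height is aligned with the camera y-axis.
    """
    center_h = np.append(np.asarray(center, dtype=float), 1.0)
    loc = (src_to_kitticam @ center_h)[:3]
    loc[1] += h / 2.0
    return loc
```

When both frames share the same origin, this reduces to the rotation-only version shown above.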
Hi @Nplace-su. I fixed a bug in Tr_camrect2cam_i, and the objects can now be learned, but the background and the objects still cannot be separated. What causes this?
Camera 0 in the first image has its x-axis pointing down(?), while it should point to its right. Maybe you can check your axis conventions?
These beam-like artifacts in the background image usually indicate that your scale factor is not well chosen. Scaling it down may help.
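As background for the scale-factor suggestion: scaling all translations in a scene (camera positions, object positions, point coordinates) by the same factor leaves every projected pixel unchanged, so the factor only changes the numeric range of coordinates the model works in. The sketch below illustrates that geometric fact. How MARS applies `scale_factor` internally is not shown in this thread, and the helper names here are hypothetical.

```python
import numpy as np

def project_world_point(K, cam_to_world, point_world):
    """Project a 3D world point to pixels with intrinsics K and a 4x4 camera-to-world pose."""
    world_to_cam = np.linalg.inv(cam_to_world)
    p_cam = world_to_cam[:3, :3] @ point_world + world_to_cam[:3, 3]
    uvw = K @ p_cam
    return uvw[:2] / uvw[2]

def scale_pose(cam_to_world, scale):
    """Return a copy of the pose with its translation scaled; rotation is untouched."""
    scaled = cam_to_world.copy()
    scaled[:3, 3] *= scale
    return scaled
```

Projecting `scale * point` through `scale_pose(pose, scale)` gives the same pixel as projecting `point` through `pose`, so a smaller scale factor does not change the training images, only the coordinates fed to the fields.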
@wuzirui Thank you for your reply. The original KITTI data also has the same coordinate system (x-axis pointing down).
I mean that an x-axis pointing right (relative to the image) is preferred. If your camera and world systems both align with this, it should be correct.
Thank you, I will check it. But it seems the Y-axis and Z-axis are reversed?
From this figure, your camera coordinate system does not align with your world (object) coordinate system. Maybe you need to flip the axes.
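A common source of reversed Y and Z axes is mixing the y-down/z-forward camera convention (used by KITTI and OpenCV) with the y-up/z-back convention (used by OpenGL-style tools). A minimal sketch of switching a camera-to-world pose between the two is shown below. Whether this particular flip is what the data in this thread needs is an assumption, and the names are hypothetical.

```python
import numpy as np

# Negates the camera's local y and z axes; the x axis and the position stay the same.
FLIP_YZ = np.diag([1.0, -1.0, -1.0, 1.0])

def flip_camera_yz(cam_to_world):
    """Switch a 4x4 camera-to-world pose between y-down/z-forward and y-up/z-back."""
    return cam_to_world @ FLIP_YZ
```

The flip is its own inverse, so applying it twice returns the original pose. Re-plotting the poses in the visualizer after the flip shows whether the cameras and objects now share one convention.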
OK I will check it. Thank you a lot.
Hi @wuzirui, I changed scale_factor from 1.0 to 0.1, and the background and objects are now separated. What is the reason? Could you explain the principle behind it? Thanks a lot.
The result still shows some ghosting, as in the figure. Could you suggest ways to resolve it?
Hello, I ran into a problem when training MARS on the KITTI dataset. When I remove the camera_optimizer from the config, the result looks normal. So is the module necessary? Thanks a lot.