OPEN-AIR-SUN / mars

MARS: An Instance-aware, Modular and Realistic Simulator for Autonomous Driving

just train background with no object #69

Closed Alexanderisgod closed 1 year ago

Alexanderisgod commented 1 year ago

hi, author, i just want to train the background without objects.

```python
VKITTI_NVS_NSG_Car_Depth = MethodSpecification(
    config=TrainerConfig(
        method_name="nsg-vkitti-car-depth-nvs",
        steps_per_eval_image=STEPS_PER_EVAL_IMAGE,
        steps_per_eval_all_images=STEPS_PER_EVAL_ALL_IMAGES,
        steps_per_save=STEPS_PER_SAVE,
        save_only_latest_checkpoint=False,
        max_num_iterations=MAX_NUM_ITERATIONS,
        mixed_precision=False,
        use_grad_scaler=True,
        log_gradients=True,
        pipeline=NSGPipelineConfig(
            datamanager=NSGkittiDataManagerConfig(
                dataparser=NSGvkittiDataParserConfig(
                    use_car_latents=True,
                    use_depth=True,
                    car_object_latents_path=Path("/data/datasets/VKITTI2/car_nerfs/latents/latent_codes06.pt"),
                    split_setting="nvs-75",
                    car_nerf_state_dict_path=Path("/data/datasets/VKITTI2/car_nerfs/state_dict/epoch_805.ckpt"),
                ),
                train_num_rays_per_batch=RAYS_PER_BATCH,
                eval_num_rays_per_batch=RAYS_PER_BATCH,
                camera_optimizer=CameraOptimizerConfig(mode="off"),
            ),
            model=SceneGraphModelConfig(
                background_model=NerfactoModelConfig(),
                object_model_template=CarNeRFModelConfig(_target=CarNeRF),
                object_representation="class-wise",
                object_ray_sample_strategy="remove-bg",
            ),
        ),
        optimizers={
            "background_model": {
                "optimizer": RAdamOptimizerConfig(lr=1e-3, eps=1e-15),
                "scheduler": ExponentialDecaySchedulerConfig(lr_final=1e-5, max_steps=200000),
            },
            "object_model": {
                "optimizer": RAdamOptimizerConfig(lr=5e-3, eps=1e-15),
                "scheduler": ExponentialDecaySchedulerConfig(lr_final=1e-5, max_steps=200000),
            },
        },
        # viewer=ViewerConfig(num_rays_per_chunk=1 << 15),
        vis="wandb",
    ),
    description="Neural Scene Graph implementation with vanilla-NeRF model for background and object models.",
)
```

so what should i change in the above config? thanks a lot.

wuzirui commented 1 year ago

do you mean just visualizing the background nodes, or do you want to optimize the background model only?

Alexanderisgod commented 1 year ago

> do you mean just visualizing the background nodes, or do you want to optimize the background model only?

1) optimizing the background model solely is my first choice; 2) does rendering without the object nodes equal 'visualizing the background nodes'?

wuzirui commented 1 year ago
  1. this is not supported in our current repo, but you can use a panoptic mask to only sample pixels in the background region and turn off the optimizers of the foreground nodes (see the sketch after this list).
  2. yes, this is supported in our codebase; you can check the logs and there will be a background rgb channel.
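
A minimal sketch of the masking idea (the tensor names here are hypothetical, not this repo's API): keep only the rays whose panoptic label marks them as background, so the foreground nodes receive no gradients and their optimizers can simply be disabled.

```python
import torch

def sample_background_rays(ray_indices: torch.Tensor, panoptic_mask: torch.Tensor) -> torch.Tensor:
    """Filter sampled pixel indices down to background-only pixels.

    ray_indices: (N, 3) long tensor of (image_idx, row, col) picked by the datamanager.
    panoptic_mask: (num_images, H, W) bool tensor, True where the pixel is background.
    """
    keep = panoptic_mask[ray_indices[:, 0], ray_indices[:, 1], ray_indices[:, 2]]
    return ray_indices[keep]
```
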
Alexanderisgod commented 1 year ago

thank you for your careful reply. however, when i trained on my own dataset, it seems that 1) the background rgb image is mixed up with cars (foreground things), and 2) the sky edges are blurry and not aligned with your results. i see 2 possible causes: first, maybe the extrinsics are not good enough; second, maybe it's the model's limitations?

i would appreciate it if you could give me some advice. below is my config:

```python
    config=TrainerConfig(
        method_name="nsg-wm4-car-nvs",
        steps_per_eval_image=STEPS_PER_EVAL_IMAGE,
        steps_per_eval_all_images=STEPS_PER_EVAL_ALL_IMAGES,
        steps_per_save=STEPS_PER_SAVE,
        save_only_latest_checkpoint=False,
        max_num_iterations=MAX_NUM_ITERATIONS,
        mixed_precision=False,
        use_grad_scaler=True,
        log_gradients=True,
        pipeline=NSGPipelineConfig(
            datamanager=NSGWMDataManagerConfig(
                dataparser=NSGwm4DataParserConfig(
                    # use_car_latents=True, # with no car nerf
                    use_car_latents=False,
                    use_depth=False,
                    car_object_latents_path=Path("/data/datasets/VKITTI2/car_nerfs/latents/latent_codes06.pt"),
                    split_setting="nvs-75",
                    car_nerf_state_dict_path=Path("/data/datasets/VKITTI2/car_nerfs/state_dict/epoch_805.ckpt"),
                ),
                train_num_rays_per_batch=RAYS_PER_BATCH,
                eval_num_rays_per_batch=RAYS_PER_BATCH,
                camera_optimizer=CameraOptimizerConfig(mode="off"),
            ),
            model=SceneGraphModelConfig(
                background_model=NerfactoModelConfig(),
                # object_model_template=CarNeRFModelConfig(_target=CarNeRF), # with no car nerf
                object_model_template=NerfactoModelConfig(),
                object_representation="class-wise",
                object_ray_sample_strategy="remove-bg",
            ),
        ),
        optimizers={
            "background_model": {
                "optimizer": RAdamOptimizerConfig(lr=1e-3, eps=1e-15),
                "scheduler": ExponentialDecaySchedulerConfig(lr_final=1e-5, max_steps=200000),
            },
            "object_model": {
                "optimizer": RAdamOptimizerConfig(lr=5e-3, eps=1e-15),
                "scheduler": ExponentialDecaySchedulerConfig(lr_final=1e-5, max_steps=200000),
            },
        },
        # viewer=ViewerConfig(num_rays_per_chunk=1 << 15),
        vis="wandb",
    ),
    description="Neural Scene Graph implementation with vanilla-NeRF model for background and object models.",
)
```

Alexanderisgod commented 1 year ago
> 1. this is not supported in our current repo, but you can use a panoptic mask to only sample pixels in the background region and turn off the optimizers of the foreground nodes.
> 2. yes, this is supported in our codebase; you can check the logs and there will be a background rgb channel.

unfortunately, the background is mixed up with the foreground if trained without car latents.

wuzirui commented 1 year ago

have you tried tuning the scale_factor?
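
For reference, scale_factor is a field on the dataparser config (the full vkitti parser config is quoted further down this thread), so tuning it just means overriding that field; a minimal sketch:

```python
# override the default scale_factor (0.1) on the dataparser config;
# smaller values pull the camera origins further toward the [-1, 1] box
dataparser = NSGvkittiDataParserConfig(scale_factor=0.05)
```
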

Alexanderisgod commented 1 year ago

is scale_factor useful for the background or for the blur? ok, i will try it. thank you. but i still don't know what causes the background and foreground to mix, and are the car latents strictly needed?

Alexanderisgod commented 1 year ago

> have you tried tuning the scale_factor?

hi, dear author, following your previous advice i changed the scale_factor: 1) i now get a correct background rgb image; 2) btw, the edges of trees and buildings are blurry, do you have any advice to alleviate this? thank you.

AmazingRoad commented 1 year ago

@Alexanderisgod Hi, does 'NSGwm4DataParserConfig' mean the waymo dataset? Could you share this code?

Alexanderisgod commented 1 year ago

> @Alexanderisgod Hi, does 'NSGwm4DataParserConfig' mean the waymo dataset? Could you share this code?

sorry, it's not the waymo dataset, it's our own, and i wrote a new dataparser to suit it.

Alexanderisgod commented 1 year ago

> @Alexanderisgod Hi, does 'NSGwm4DataParserConfig' mean the waymo dataset? Could you share this code?

however, the config is the same as vkitti.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Optional, Type

from nerfstudio.data.dataparsers.base_dataparser import DataParserConfig


@dataclass
class NSGvkittiDataParserConfig(DataParserConfig):
    """neural scene graph dataset parser config"""

    _target: Type = field(default_factory=lambda: NSGvkitti)
    """target class to instantiate"""
    data: Path = Path("/data1/vkitti/Scene06/clone")
    """Directory specifying location of data."""
    scale_factor: float = 0.1
    """How much to scale the camera origins by."""
    scene_scale: float = 2.0
    """How much to scale the region of interest by."""
    alpha_color: str = "white"
    """alpha color of background"""
    first_frame: int = 0
    """specifies the beginning of a sequence if not the complete scene is taken as Input"""
    last_frame: int = 237
    """specifies the end of a sequence"""
    use_object_properties: bool = True
    """ use pose and properties of visible objects as an input """
    object_setting: int = 0
    """specify wich properties are used"""
    obj_opaque: bool = True
    """Ray does stop after intersecting with the first object bbox if true"""
    box_scale: float = 1.5
    """Maximum scale for bboxes to include shadows"""
    novel_view: str = "left"
    use_obj: bool = True
    render_only: bool = False
    bckg_only: bool = False
    near_plane: float = 0.5
    """specifies the distance from the last pose to the near plane"""
    far_plane: float = 150.0
    """specifies the distance from the last pose to the far plane"""
    dataset_type: str = "vkitti"
    obj_only: bool = False
    """Train object models on rays close to the objects only"""
    netchunk: int = 1024 * 64
    """number of pts sent through network in parallel, decrease if running out of memory"""
    chunk: int = 1024 * 32
    """number of rays processed in parallel, decrease if running out of memory"""
    max_input_objects: int = -1
    """Max number of object poses considered by the network, will be set automatically"""
    add_input_rows: int = -1
    """reshape tensor, dont change... will be refactor in the future"""
    use_depth: bool = True
    """whether the training loop contains depth"""
    split_setting: str = "reconstruction"
    use_car_latents: bool = False
    car_object_latents_path: Optional[Path] = Path("pretrain/car_nerf/latent_codes.pt")
    """path of car object latent codes"""
    car_nerf_state_dict_path: Optional[Path] = Path("pretrain/car_nerf/car_nerf.ckpt")
    """path of car nerf state dicts"""
    use_semantic: bool = False
    """whether to use semantic information"""
    semantic_path: Optional[Path] = Path("")
    """path of semantic inputs"""
    semantic_mask_classes: List[str] = field(default_factory=lambda: [])
    """semantic classes that do not generate gradient to the background model"""
wuzirui commented 1 year ago

> hi, dear author, following your previous advice i changed the scale_factor: 1) i now get a correct background rgb image; 2) btw, the edges of trees and buildings are blurry, do you have any advice to alleviate this? thank you.

hmmm, it looks like the scale factor is still not correct, maybe try a smaller one?

sonnefred commented 1 year ago

> hmmm, it looks like the scale factor is still not correct, maybe try a smaller one?

Hi, why do you think the scale factor should be smaller? My experimental result is also not very good, and I think it's related to the scale factor, but I have no idea how to adjust it... Could you give any hint about this? Thanks.

Alexanderisgod commented 1 year ago

> Hi, why do you think the scale factor should be smaller? My experimental result is also not very good, and I think it's related to the scale factor, but I have no idea how to adjust it... Could you give any hint about this? Thanks.

i tried several scale factors, like 0.05, 0.02, 0.1, 0.2, 0.5, and 1, but it changed nothing. i think it may be related to the frames per meter, because the kitti scenes are low speed, but my scenes were captured at about 30 km/h.

sonnefred commented 1 year ago

> i tried several scale factors, like 0.05, 0.02, 0.1, 0.2, 0.5, and 1, but it changed nothing. i think it may be related to the frames per meter, because the kitti scenes are low speed, but my scenes were captured at about 30 km/h.

Thanks for your reply. I also tried many values of the scale factor, but it didn't affect the result much. Now I am also training the model on my own dataset, but there is some ghost shadowing when I remove the objects from the scene. Do you have any suggestions about this? Thanks.

wuzirui commented 1 year ago

> Thanks for your reply. I also tried many values of the scale factor, but it didn't affect the result much. Now I am also training the model on my own dataset, but there is some ghost shadowing when I remove the objects from the scene. Do you have any suggestions about this? Thanks.

This "stretching" artifact is usually caused by the scene contraction of NeRFacto. Our solutions to this kind of artifact are:

  1. tune scale_factor so that the stretched scene parts fall inside the [-1, 1] bbox (see the contraction sketch after this list).
  2. try networks with larger capacity, e.g. NeRFacto-big/huge, etc.

We suggest tuning the parameters on nerfacto before you try mars, since it's much faster to run.
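
For intuition, the contraction NeRFacto applies (following MipNeRF-360) leaves points inside the unit ball untouched and squashes everything outside into a radius-2 shell; content that lands in that shell is modeled with far less capacity, which is what produces the stretching. A sketch with the L2 norm (Nerfacto's default actually uses the infinity norm):

```python
import torch

def contract(x: torch.Tensor) -> torch.Tensor:
    """MipNeRF-360-style scene contraction: identity inside the unit ball,
    points outside are mapped into the shell between radius 1 and 2."""
    norm = x.norm(dim=-1, keepdim=True).clamp_min(1e-9)  # avoid division by zero
    return torch.where(norm <= 1.0, x, (2.0 - 1.0 / norm) * x / norm)
```
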

sonnefred commented 1 year ago

> This "stretching" artifact is usually caused by the scene contraction of NeRFacto. Our solutions to this kind of artifact are: 1. tune scale_factor so that the stretched scene parts fall inside the [-1, 1] bbox. 2. try networks with larger capacity, e.g. NeRFacto-big/huge, etc.

Hi, thanks for your reply, it's very helpful. Could you explain a bit more about the first point: how do I verify that the scene parts are in the [-1, 1] bbox? Right now I just tune the scale factor based on the experimental results, which costs too much time and labor. Thanks.

wuzirui commented 1 year ago

You can find more details on that in the MipNeRF-360 / Nerfstudio papers. Basically, your scene content is contracted (and may thus show degraded renderings) if the sampled points are outside the [-1, 1]^3 bounding box. A tip is to examine the XYZ ranges of all the sampled points.
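
One way to do that (a sketch assuming nerfstudio's RaySamples API, dropped into the model's forward pass right after sampling):

```python
# ray_samples: a nerfstudio RaySamples object produced by the sampler
positions = ray_samples.frustums.get_positions()  # (num_rays, num_samples, 3)
print("xyz min:", positions.amin(dim=(0, 1)).tolist())
print("xyz max:", positions.amax(dim=(0, 1)).tolist())
# if these exceed [-1, 1] by a large margin, lower scale_factor
```
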

sonnefred commented 1 year ago

> You can find more details on that in the MipNeRF-360 / Nerfstudio papers. Basically, your scene content is contracted (and may thus show degraded renderings) if the sampled points are outside the [-1, 1]^3 bounding box. A tip is to examine the XYZ ranges of all the sampled points.

Ok, I will refer to these papers. Do you have any empirical values for this parameter? Does it mean the scene is not contracted if I set it to 1? Thanks.

wuzirui commented 1 year ago

Not really. By "contraction", we mean that the points outside the box are modeled with limited model capacity. The point coordinates are determined by your data itself, and scale_factor is the parameter we use to scale the input data into the [-1, 1] range.
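
A sketch of what that scaling typically looks like (an assumption about this repo's internals, but the pattern is standard in nerfstudio-style dataparsers: camera origins, and metric quantities like depth, are multiplied by scale_factor):

```python
import torch

scale_factor = 0.1
camera_to_world = torch.eye(4).repeat(10, 1, 1)  # dummy 4x4 poses for 10 cameras
camera_to_world[:, 2, 3] = 50.0                  # cameras 50 m from the origin
camera_to_world[:, :3, 3] *= scale_factor        # origins now at 5.0, closer to [-1, 1]
```
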

sonnefred commented 1 year ago

> Not really. By "contraction", we mean that the points outside the box are modeled with limited model capacity. The point coordinates are determined by your data itself, and scale_factor is the parameter we use to scale the input data into the [-1, 1] range.

OK, do you mean I should check the XYZ coordinates of the sampled points? Where can I check this in the code? Thanks.

wuzirui commented 1 year ago

I personally recommend using the VSCode Debug Console and setting breakpoints in the forward process.

sonnefred commented 1 year ago

> I personally recommend using the VSCode Debug Console and setting breakpoints in the forward process.

Hi, I added breakpoints to check the range of the XYZ coordinates, but I ran into a problem. In the picture below, I set the scale factor to 0.1, and the range of self.origins is from -1.5 to 1.5, but self.start and self.end are around 0.05 and 1000, as set in scene_graph.py. In this case, how can the range of pos be kept within [-1, 1]? Should I also scale self.start and self.end, or change the value of far_plane in scene_graph.py? Thanks.

[screenshot]

JiantengChen commented 1 year ago

> Hi, I added breakpoints to check the range of the XYZ coordinates, but I ran into a problem. I set the scale factor to 0.1, and the range of self.origins is from -1.5 to 1.5, but self.start and self.end are around 0.05 and 1000, as set in scene_graph.py. In this case, how can the range of pos be kept within [-1, 1]? Should I also scale self.start and self.end, or change the value of far_plane in scene_graph.py? Thanks.

Hi! You can refer to the proposal sampler used in NeRFacto. The proposal sampler consolidates the sample locations into the regions of the scene that contribute most to the final render, using a density field to guide the sampling. You should ensure that the range of pos is within [-1, 1]; when you have a well-trained, accurate density field, minor deviations beyond this range will not have a significant impact.
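
A quick numeric check of the situation described above (origins around ±1.5, near/far of 0.05/1000): uniform samples along a ray would indeed land far outside [-1, 1]^3, but the contraction bounds them inside a radius-2 ball, and a trained proposal sampler places most samples near surfaces anyway. A sketch:

```python
import torch

origin = torch.tensor([1.5, 0.0, 0.0])
direction = torch.tensor([1.0, 0.0, 0.0])
for t in (0.05, 1.0, 1000.0):          # near, mid, and far sample distances
    pos = origin + t * direction
    norm = pos.norm()
    # MipNeRF-360-style contraction (L2-norm variant)
    contracted = pos if norm <= 1 else (2 - 1 / norm) * pos / norm
    print(f"t={t}: pos={pos.tolist()}, contracted={contracted.tolist()}")
```
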