Overfitting occurred when I trained with a small number of images.

Yuhuoo commented 6 months ago

When I used msplat to train on the Horse scene from the Tanks & Temples dataset, I selected 24 images: 12 for training and 12 for testing. My goal is to use this project to optimize camera poses. After more than 3000 iterations, the loss started to increase. This did not happen with the original Gaussian Splatting project at https://github.com/graphdeco-inria/gaussian-splatting.

Log of msplat:

(gaussian_splatting) aogao@test-X640-G40:~/code/dust_gs/gaussian-splatting$ python train.py -s /home/aogao/code/dust_gs/gaussian-splatting/data/Horse_24 -m /home/aogao/code/dust_gs/gaussian-splatting/output/InstantSplat/Horse_24_2 --iteration=7000 --eval
Optimizing /home/aogao/code/dust_gs/gaussian-splatting/output/InstantSplat/Horse_24_2
Output folder: /home/aogao/code/dust_gs/gaussian-splatting/output/InstantSplat/Horse_24_2 [19/05 20:52:09]
Reading camera 24/24 [19/05 20:52:09]
Generating ellipse path from 24 camera infos ... [19/05 20:52:09]
theta[0] 0.0 [19/05 20:52:09]
Train Cameras loaded 12 [19/05 20:52:33]
Test Cameras loaded 12 [19/05 20:52:33]
Render Cameras loaded 120 [19/05 20:52:37]
Number of points at initialisation :  1038289 [19/05 20:52:37]
Training progress:  14%|██████                                    | 1000/7000 [02:03<11:54,  8.39it/s, Loss=0.0707449]
[ITER 1000] Evaluating test: L1 0.06763647186259428 PSNR 17.882627328236897 [19/05 20:54:42]

[ITER 1000] Evaluating train: L1 0.04614747241139412 PSNR 19.72754669189453 [19/05 20:54:44]

[ITER 1000] Saving Gaussians [19/05 20:54:44]
Training progress: 100%|██████████████████████████████████████████| 7000/7000 [15:38<00:00,  7.46it/s, Loss=0.2689072]

[ITER 7000] Evaluating test: L1 0.25050079201658565 PSNR 8.50779887040456 [19/05 21:08:17]

[ITER 7000] Evaluating train: L1 0.2306472510099411 PSNR 8.335065460205078 [19/05 21:08:17]

[ITER 7000] Saving Gaussians [19/05 21:08:17]

Training complete. [19/05 21:08:32]

Log of the original Gaussian Splatting project:

(gaussian_splatting) aogao@test-X640-G40:~/code/dust_gs/gaussian-splatting$ python train.py -s /home/aogao/code/dust_gs/gaussian-splatting/data/Horse_24 -m /home/aogao/code/dust_gs/gaussian-splatting/output/dust_gs/Horse_24_2 --iteration=7000 --eval
Optimizing /home/aogao/code/dust_gs/gaussian-splatting/output/dust_gs/Horse_24_2
Output folder: /home/aogao/code/dust_gs/gaussian-splatting/output/dust_gs/Horse_24_2 [19/05 20:00:53]
Reading camera 24/24 [19/05 20:00:53]
Generating ellipse path from 24 camera infos ... [19/05 20:00:53]
theta[0] 0.0 [19/05 20:00:53]
Loading Training Cameras [19/05 20:01:13]
Loading Test Cameras [19/05 20:01:23]
Loading Render Cameras [19/05 20:01:23]
Number of points at initialisation :  1038289 [19/05 20:01:27]
Training progress: 100%|██████████████████████████████████████████| 7000/7000 [09:02<00:00, 12.91it/s, Loss=0.0366106]

[ITER 7000] Evaluating test: L1 0.04568970017135143 PSNR 20.60738754272461 [19/05 20:10:31]

[ITER 7000] Evaluating train: L1 0.02718411646783352 PSNR 23.515397262573245 [19/05 20:10:32]

[ITER 7000] Saving Gaussians [19/05 20:10:32]

Training complete. [19/05 20:10:42]
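
To compare the runs without reading the logs by eye, the evaluation lines can be parsed with a short script. This is a minimal sketch that assumes the `[ITER N] Evaluating <split>: L1 ... PSNR ...` format shown above; `parse_eval_lines` is a hypothetical helper name, not part of either project. On the msplat log it shows that the train PSNR falls together with the test PSNR between iteration 1000 and 7000, which looks more like the optimization diverging than overfitting in the usual sense.

```python
import re

# Matches lines such as:
# [ITER 1000] Evaluating test: L1 0.0676 PSNR 17.88 [19/05 20:54:42]
EVAL_LINE = re.compile(
    r"\[ITER (?P<iter>\d+)\] Evaluating (?P<split>\w+): "
    r"L1 (?P<l1>[\d.eE+-]+) PSNR (?P<psnr>[\d.eE+-]+)"
)


def parse_eval_lines(log_text):
    """Return {(split, iteration): (l1, psnr)} for every evaluation line found."""
    results = {}
    for m in EVAL_LINE.finditer(log_text):
        results[(m["split"], int(m["iter"]))] = (float(m["l1"]), float(m["psnr"]))
    return results


# Evaluation lines copied from the msplat log above.
msplat_log = """
[ITER 1000] Evaluating test: L1 0.06763647186259428 PSNR 17.882627328236897 [19/05 20:54:42]
[ITER 1000] Evaluating train: L1 0.04614747241139412 PSNR 19.72754669189453 [19/05 20:54:44]
[ITER 7000] Evaluating test: L1 0.25050079201658565 PSNR 8.50779887040456 [19/05 21:08:17]
[ITER 7000] Evaluating train: L1 0.2306472510099411 PSNR 8.335065460205078 [19/05 21:08:17]
"""

metrics = parse_eval_lines(msplat_log)
for split in ("train", "test"):
    drop = metrics[(split, 1000)][1] - metrics[(split, 7000)][1]
    print(f"{split}: PSNR drop from iter 1000 to 7000 = {drop:.2f} dB")
```

Evaluating at more iterations would give more points on this curve and make it easier to see where the drop begins.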

Yuhuoo commented 6 months ago

The Camera class I used is:

import numpy as np
import roma
import torch
from torch import nn


class Camera(nn.Module):
    def __init__(self, colmap_id, R, Q, T, FoVx, FoVy, image, gt_alpha_mask,
                 image_name, uid,
                 trans=np.array([0.0, 0.0, 0.0]), scale=1.0, data_device = "cuda"
                 ):
        super(Camera, self).__init__()

        self.uid = uid
        self.colmap_id = colmap_id
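        # Keep an independent copy of the initial quaternion: nn.Parameter
        # shares storage with the tensor it wraps, so without clone() the
        # optimizer would also overwrite init_Q in place.
        self.init_Q = torch.tensor(Q, dtype=torch.float32, device="cuda")
        self.Q = nn.Parameter(self.init_Q.clone())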
        self.T = nn.Parameter(torch.tensor(T, dtype=torch.float32, device="cuda").requires_grad_(True))
        # self.R = R
        # self.T = T
        self.FoVx = FoVx
        self.FoVy = FoVy
        self.image_name = image_name

        try:
            self.data_device = torch.device(data_device)
        except Exception as e:
            print(e)
            print(f"[Warning] Custom device {data_device} failed, fallback to default cuda device" )
            self.data_device = torch.device("cuda")

        self.original_image = image.clamp(0.0, 1.0).to(self.data_device)
        self.image_width = self.original_image.shape[2]
        self.image_height = self.original_image.shape[1]

        if gt_alpha_mask is not None:
            self.original_image *= gt_alpha_mask.to(self.data_device)
        else:
            self.original_image *= torch.ones((1, self.image_height, self.image_width), device=self.data_device)

        self.zfar = 100.0
        self.znear = 0.01

        self.trans = trans
        self.scale = scale

        self.optimizer = torch.optim.Adam(self.parameters(), lr=0.0001)

    def get_extrinsic_camcenter(self):
        R = roma.unitquat_to_rotmat(self.Q)
        Rt = torch.zeros((4, 4), dtype=torch.float32).to(self.Q.device)
        Rt[:3, :3] = R
        Rt[:3, 3] = self.T
        Rt[3, 3] = 1.0

        extrinsic_matrix = Rt[:3, :]
        world_view_transform = Rt.transpose(0, 1)
        camera_center = world_view_transform.inverse()[3, :3]
        return extrinsic_matrix, camera_center
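
One thing that may be worth checking with this class, offered as an assumption rather than a confirmed cause: `self.Q` is updated by Adam without any constraint, so it can drift away from unit norm, while `roma.unitquat_to_rotmat` (judging by its name) expects a unit quaternion. The sketch below uses a hand-written stand-in for that conversion (`unitquat_to_rotmat_xyzw`, a hypothetical helper, not the roma function) to show that a non-unit quaternion gives a matrix that is no longer a rotation, and that normalizing first avoids this.

```python
import torch
import torch.nn.functional as F


def unitquat_to_rotmat_xyzw(q):
    """Rotation matrix from a quaternion (x, y, z, w).

    Hypothetical stand-in for a unit-quaternion conversion: the formula is
    only a valid rotation when q has unit norm.
    """
    x, y, z, w = q.unbind(-1)
    return torch.stack([
        1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w),
        2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w),
        2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y),
    ], dim=-1).reshape(3, 3)


def orthogonality_error(R):
    """Largest absolute entry of R @ R.T - I; zero for a true rotation."""
    return (R @ R.T - torch.eye(3)).abs().max().item()


# A quaternion that has drifted away from unit norm (made-up values).
q_drifted = torch.tensor([0.1, 0.2, 0.3, 1.1])

err_raw = orthogonality_error(unitquat_to_rotmat_xyzw(q_drifted))
err_normalized = orthogonality_error(
    unitquat_to_rotmat_xyzw(F.normalize(q_drifted, dim=-1))
)
print(f"without normalization: {err_raw:.4f}")
print(f"with normalization:    {err_normalized:.2e}")
```

Printing `self.Q.norm()` for a few cameras during training would show whether this drift actually happens in the run above.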

In train.py, I made only a few modifications to the original code:

    ...
        render_pkg = ms_render(viewpoint_cam, gaussians, pipe, bg)
        image, viewspace_point_tensor, visibility_filter, radii = render_pkg["render"], render_pkg["viewspace_points"], render_pkg["visibility_filter"], render_pkg["radii"]

        ...

            # Optimizer step
            if iteration < opt.iterations:
                gaussians.optimizer.step()
                gaussians.optimizer.zero_grad(set_to_none = True)
                viewpoint_cam.optimizer.step()
                viewpoint_cam.optimizer.zero_grad(set_to_none = True)

            ...
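
If the pose quaternion does turn out to drift during training (an assumption, not something the logs confirm), one common option is to project it back onto the unit sphere after each optimizer step. A minimal sketch on a toy parameter follows; `Q` and `optimizer` stand in for `viewpoint_cam.Q` and `viewpoint_cam.optimizer`, and the loss is a made-up placeholder (not the rendering loss) chosen only to push the norm away from 1.

```python
import torch
import torch.nn.functional as F

# Toy stand-ins for viewpoint_cam.Q and viewpoint_cam.optimizer.
Q = torch.nn.Parameter(torch.tensor([0.0, 0.0, 0.0, 1.0]))
optimizer = torch.optim.Adam([Q], lr=1e-4)

max_drift = 0.0
for _ in range(200):
    loss = -Q.sum()  # placeholder loss that grows every component
    loss.backward()
    optimizer.step()
    optimizer.zero_grad(set_to_none=True)

    with torch.no_grad():
        # Record how far the step moved Q from unit norm, then project back.
        max_drift = max(max_drift, abs(Q.norm().item() - 1.0))
        Q.copy_(F.normalize(Q, dim=-1))

final_norm = Q.norm().item()
print(f"largest drift before projection: {max_drift:.2e}")
print(f"norm after projection: {final_norm:.6f}")
```

In the loop above, the projection would go right after `viewpoint_cam.optimizer.step()`. Adam's moment estimates are left untouched by the projection, which is usually acceptable for small steps.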