ethnhe / PVN3D

Code for "PVN3D: A Deep Point-wise 3D Keypoints Hough Voting Network for 6DoF Pose Estimation", CVPR 2020
MIT License

Adaptation to custom dataset #64

Closed luigifaticoso closed 3 years ago

luigifaticoso commented 3 years ago

Hello, I'm trying to train the model on my custom dataset. I have created a synthetic dataset that mirrors Linemod_preprocessed as closely as possible.

My steps:

- I have added the intrinsic matrix of my camera to `common.py`.
- I have generated the images and labels using BlenderProc; I have `gt.yml`, `info.yml`, and everything except the masks.

**Second question:** Do I need to generate the masks myself, or are they generated somewhere inside the pipeline? Could that be the problem?
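
In case you do need to produce them yourself, here is a minimal sketch, assuming BlenderProc renders a per-pixel instance-id map for each frame (the file names and `obj_id` below are hypothetical):

```python
import numpy as np
import cv2

# Hypothetical inputs: an instance-id map rendered alongside the RGB frame,
# and the id BlenderProc assigned to the object of interest.
seg = cv2.imread("0000_seg.png", cv2.IMREAD_UNCHANGED)  # (H, W) instance ids
obj_id = 1

# Binary mask in the Linemod_preprocessed style: 255 on the object, 0 elsewhere.
mask = np.where(seg == obj_id, 255, 0).astype(np.uint8)
cv2.imwrite("mask_0000.png", mask)
```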

I have modified the necessary files documented here, but I still need to change some parts.
This is the current output:
```shell
python3 -m train.train_blm_pvn3d --cls blm
/home/pytorch/PVN3D/pvn3d/common.py:173: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  self.blm_r_lst = yaml.load(blm_r_file)
/home/pytorch/PVN3D/pvn3d/common.py:134: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  self.lm_r_lst = yaml.load(lm_r_file)
/home/pytorch/PVN3D/pvn3d/common.py:173: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  self.blm_r_lst = yaml.load(blm_r_file)
cls_type:  blm
cls_id in blm_dataset.py 1
/home/pytorch/PVN3D/pvn3d/datasets/blm/blm_dataset.py:41: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  self.meta_lst = yaml.load(meta_file)
Train without rendered data from https://github.com/ethnhe/raster_triangle
Train without fuse data from https://github.com/ethnhe/raster_triangle
train_dataset_size:  10
cls_id in blm_dataset.py 1
val_dataset_size:  15
loading pretrained mdl.
/usr/local/lib/python3.7/dist-packages/torch/nn/_reduction.py:43: UserWarning: size_average and reduce args will be deprecated, please use reduction='mean' instead.
  warnings.warn(warning.format(ret))
{'bn_decay': 0.5,
 'bn_momentum': 0.9,
 'cal_metrics': False,
 'checkpoint': None,
 'cls': 'blm',
 'decay_step': 200000.0,
 'epochs': 1000,
 'eval_net': False,
 'lr': 0.01,
 'lr_decay': 0.5,
 'run_name': 'sem_seg_run_1',
 'test': False,
 'test_occ': False,
 'weight_decay': 0}
epochs:   0%|                                                                                                                                      | 0/25 [00:00<?, ?it/s]
train:   0%|                                                                                                                                     | 0/5000 [00:00<?, ?it/s]
```

How can I get more output to understand where it's going wrong?

ethnhe commented 3 years ago

Hi,

luigifaticoso commented 3 years ago

OK, perfect, thank you a lot for your message. I have managed to generate the 8 keypoints; I may try different settings later. I have also managed to generate the binary masks in BlenderProc, but I'm planning to expand the dataset using your raster_triangle tool.

I have managed to move forward: there was an issue in mydataset_dataset.py in the part that reads the meta. After printing it out, I simply commented out these lines and it proceeds:

```python
# else:
#     meta = meta[0]
#     print("meta: ", meta)
```

The next issue I'm trying to understand is in train_dataset_pvn3d.py, around line 182, where the code expects 11 elements from cu_dt but gets 13 instead. Below is my cu_dt; I'm stepping backward through the code to find the cause. If you happen to see the issue on the spot, that would be great.

```shell
ValueError: too many values to unpack (expected 11)
```

```shell
[tensor([[[[123., 119., 114.,  ...,  26.,  20.,  14.],
          [112., 120., 117.,  ...,  38.,  43.,  34.],
          [111., 115., 118.,  ...,  42.,  43.,  40.],
          ...,
       .........
        [[711.1113,   0.0000, 255.5000],
         [  0.0000, 711.1113, 255.5000],
         [  0.0000,   0.0000,   1.0000]]], device='cuda:0'), tensor([1000., 1000.], device='cuda:0')]
```

**EDIT:** Setting `DEBUG = False` gives 11 values instead of 13, and it works.

ethnhe commented 3 years ago

The DEBUG setting returns more information for visualizing the preprocessed data in the dataset script, linemod_dataset.py for example. It's highly recommended to check that the data are processed properly by visualizing them with `python3 -m datasets.linemod.linemod_dataset`, which gives a picture like the following: (image: linemod_vis_data). There you can check that the center point (blue point) and the keypoints (red points) are labeled correctly.
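
For a custom dataset where that script is not wired up yet, here is a minimal sketch of the same check, assuming numpy-loadable keypoint and center files; the file names and the placeholder pose are hypothetical, while the intrinsics are the ones printed in the log above:

```python
import numpy as np
import cv2

K = np.array([[711.1113, 0.0, 255.5],
              [0.0, 711.1113, 255.5],
              [0.0, 0.0, 1.0]])               # intrinsics from the log above
R = np.eye(3)                                  # replace with the GT rotation
t = np.array([0.0, 0.0, 0.6])                  # replace with the GT translation (m)

def draw_points(img, pts_obj, color):
    """Project object-frame 3D points with the GT pose and draw them."""
    p_cam = pts_obj @ R.T + t                  # object frame -> camera frame
    uv = p_cam @ K.T
    uv = uv[:, :2] / uv[:, 2:3]                # perspective division
    for u, v in uv.round().astype(int):
        cv2.circle(img, (int(u), int(v)), 3, color, -1)
    return img

img = cv2.imread("0000.png")                   # hypothetical frame
kps = np.loadtxt("obj_kps.txt")                # (8, 3) keypoints, object frame, m
ctr = np.loadtxt("obj_ctr.txt").reshape(1, 3)  # center point
img = draw_points(img, kps, (0, 0, 255))       # keypoints in red (BGR)
img = draw_points(img, ctr, (255, 0, 0))       # center in blue
cv2.imwrite("vis_check.png", img)
```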

luigifaticoso commented 3 years ago

Thank you for helping. I'm trying to debug the labels because they are not correct, and to verify whether my model data is consistent with the LineMOD models. I ran `python3 gen_obj_info.py` on the ape model provided by LineMOD, but it doesn't produce the same numbers as the original. I scaled the ply model down to m (to get consistent results) and ran gen_obj_info; this is the output (on the right) compared with the one provided by the LineMOD dataset: (image: Screenshot from 2021-02-16 15-18-07). My labels are wrong. Below is the output of `python3 -m datasets.linemod.linemod_dataset` with the model scaled to m:

(image: train_0_rgb)

When the model is scaled to mm instead, the projected points fall completely outside the image.

ethnhe commented 3 years ago

The keypoints are in m and are generated by the FPS algorithm with random initialization, so the selected points may differ, but that won't affect performance much. For your case, if the generated keypoints of your model are in m, check that your GT poses are in the 'cv' (OpenCV) camera coordinate system rather than Blender's default convention.
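
A minimal sketch of that conversion, assuming `R_b`, `t_b` are a world-to-camera pose in Blender's camera convention (camera looks along -Z with +Y up), while OpenCV's camera looks along +Z with +Y down; the pose values are placeholders:

```python
import numpy as np

# Flipping the camera's y and z axes maps Blender's camera frame
# (looks along -Z, +Y up) onto OpenCV's camera frame (looks along +Z, +Y down).
D = np.diag([1.0, -1.0, -1.0])

R_b = np.eye(3)          # placeholder: world-to-camera rotation from Blender
t_b = np.zeros(3)        # placeholder: world-to-camera translation from Blender

R_cv = D @ R_b           # GT rotation in the 'cv' convention
t_cv = D @ t_b           # GT translation in the 'cv' convention
```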

luigifaticoso commented 3 years ago

Thank you for your message. I started debugging piece by piece and arrived at this point, where the center point is OK but the keypoints are not. Where can I investigate further in this case? I have drawn the points from white (farthest) to black (nearest), and this is what happens:

(images: 4_rgb, 2_rgb)

I have checked my model.ply and it doesn't seem to have any problem. Any pointers to things I could check would be really helpful, thank you!

ethnhe commented 3 years ago

This looks like a left-hand vs. right-hand coordinate system issue. We use a right-hand coordinate system, and it seems your model is in a left-hand one. Try flipping the y-axis of each vertex and keypoint of the object in the object coordinate system.
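
A minimal sketch of that flip, assuming the `plyfile` package and a keypoint file loadable with numpy (file names are hypothetical):

```python
import numpy as np
from plyfile import PlyData

# Flip the y-axis of every mesh vertex in the object coordinate system.
ply = PlyData.read("model.ply")
verts = ply["vertex"].data
verts["y"] = -verts["y"]
ply.write("model_yflip.ply")

# Do the same for the selected keypoints (and center point).
kps = np.loadtxt("obj_kps.txt")   # (K, 3), object frame
kps[:, 1] *= -1
np.savetxt("obj_kps_yflip.txt", kps)
```

Note that flipping a single axis mirrors the mesh, so if you re-render from the flipped model you may also need to fix the face winding or normals.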

Ultraopxt commented 3 years ago

> The keypoints are in m and are generated by the FPS algorithm with random initialization, so the selected points may differ, but that won't affect performance much. For your case, if the generated keypoints of your model are in m, check that your GT poses are in the 'cv' (OpenCV) camera coordinate system rather than Blender's default convention.

May I ask whether the original LineMOD model.ply and the keypoints generated from FPS are in mm or in m? My keypoints generated from FPS are inaccurate.

ethnhe commented 3 years ago

The vertices in model.ply are all converted to be in m, and the keypoints selected from them with FPS are also in m.
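
For reference, a minimal numpy sketch of FPS with random initialization as described above; `model_pts` is assumed to be the (N, 3) vertex array in meters, and this is a sketch, not the repo's actual implementation:

```python
import numpy as np

def farthest_point_sampling(pts, k, seed=None):
    """Select k points from pts (N, 3) that are maximally spread out."""
    rng = np.random.default_rng(seed)
    first = int(rng.integers(pts.shape[0]))   # random initialization
    selected = [first]
    dists = np.linalg.norm(pts - pts[first], axis=1)
    for _ in range(k - 1):
        idx = int(np.argmax(dists))           # farthest from the current set
        selected.append(idx)
        dists = np.minimum(dists, np.linalg.norm(pts - pts[idx], axis=1))
    return pts[selected]

# e.g. 8 keypoints from model vertices already converted to meters:
# kps = farthest_point_sampling(model_pts, 8)
```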

Ultraopxt commented 3 years ago

> The vertices in model.ply are all converted to be in m, and the keypoints selected from them with FPS are also in m.

Thanks very much! I visualized my dataset; the keypoints and center point are as follows: (images: test_rgb (6), test_rgb (5), test_rgb (4))

May I ask how I can fix the position deviation of the keypoints and the center point?

ethnhe commented 3 years ago

Not very sure, but it seems the ground-truth pose transforming the object from the object coordinate system to the camera coordinate system may not be calculated properly. Also check that the object coordinate system and the camera coordinate system are both right-hand coordinate systems. You can also transform the mesh vertices to the camera coordinate system and project them onto the image to check the pose parameters.
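
A compact sketch of that check, using the same projection math as the keypoint sketch earlier but applied to all mesh vertices; the inputs are placeholders:

```python
import numpy as np
import cv2

verts = np.loadtxt("model_pts.txt")            # (N, 3) mesh vertices, m
R, t = np.eye(3), np.array([0.0, 0.0, 0.6])    # replace with the GT pose
K = np.array([[711.1113, 0.0, 255.5],
              [0.0, 711.1113, 255.5],
              [0.0, 0.0, 1.0]])                # replace with your intrinsics
img = cv2.imread("frame.png")

p_cam = verts @ R.T + t                        # object frame -> camera frame
uv = p_cam @ K.T
uv = (uv[:, :2] / uv[:, 2:3]).round().astype(int)

# Paint in-bounds projections green; they should land exactly on the object.
h, w = img.shape[:2]
ok = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
img[uv[ok, 1], uv[ok, 0]] = (0, 255, 0)
cv2.imwrite("pose_check.png", img)
```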

Ultraopxt commented 3 years ago

> Not very sure, but it seems the ground-truth pose transforming the object from the object coordinate system to the camera coordinate system may not be calculated properly. Also check that the object coordinate system and the camera coordinate system are both right-hand coordinate systems. You can also transform the mesh vertices to the camera coordinate system and project them onto the image to check the pose parameters.

Thanks very much!

andreazuna89 commented 1 year ago

@luigifaticoso can you describe here how you generated gt.yml and info.yml for your custom dataset?

Thanks