Training On custom instance segmentation dataset

abhiagwl4262 commented 1 year ago

I am trying to train MaskDINO with an R50 backbone on my custom instance segmentation dataset, which has 3 classes.

My data is in COCO format:

-datasets
    -coco
        -images
            aaaa.jpg
            bbb.jpg
            ...
            zzz.jpg
        -annotations
            -instances_train2017.json
            -instances_val2017.json

I am using the following command:

python3 train_net.py --num-gpus 1 --config-file configs/coco/instance-segmentation/maskdino_R50_bs16_50ep_3s_dowsample1_2048.yaml MODEL.WEIGHTS pretrained_models/maskdino_r50_50ep_300q_hid2048_3sd1_instance_maskenhanced_mask46.3ap_box51.7ap.pth

I made the following changes.

I updated NUM_CLASSES in configs/coco/instance-segmentation/maskdino_R50_bs16_50ep_3s_dowsample1_2048.yaml to 3.

After this change I ran training but encountered the following error:

AssertionError: Attribute 'thing_classes' in the metadata of 'coco_2017_train' cannot be set to a different value!
['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush'] != ['Facet', 'Wall', 'Extension']

I then updated COCO_CATEGORIES in maskdino/data/datasets/register_coco_stuff_10k.py:

COCO_CATEGORIES = [
    {"color": [220, 20, 60], "isthing": 1, "id": 1, "name": "class1"},
    {"color": [119, 11, 32], "isthing": 1, "id": 2, "name": "class2"},
    {"color": [0, 0, 142], "isthing": 1, "id": 3, "name": "class3"},
]

After both changes, I got the following error:

File "/home/ubuntu/abhishek/maskdino/maskdino/data/datasets/register_coco_stuff_10k.py", line 192, in _get_coco_stuff_meta
    assert len(stuff_ids) == 171, len(stuff_ids)
AssertionError: 3

Can anyone guide me on training with my custom data? What changes do I need to make for this to work?

FengLi-ust commented 1 year ago

Did you add your own registration script (e.g. register_your_data.py) for your dataset instead of editing register_coco_stuff_10k.py? It seems the stuff registration still expects 171 classes.
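Before registering a custom dataset, it can help to confirm which categories the annotation file actually declares. The following is a minimal sketch using only the standard library; it assumes the file follows the usual COCO layout with a top-level "categories" list, and the helper name list_categories is made up for illustration. The path comes from the directory layout in the question.

```python
import json
from pathlib import Path


def list_categories(annotation_file):
    """Return sorted (id, name) pairs from the "categories" list of a COCO-format JSON file."""
    with Path(annotation_file).open() as f:
        coco = json.load(f)
    return sorted((cat["id"], cat["name"]) for cat in coco["categories"])


if __name__ == "__main__":
    # Path taken from the layout described in the question; adjust as needed.
    path = Path("datasets/coco/annotations/instances_train2017.json")
    if path.exists():
        for cat_id, name in list_categories(path):
            print(cat_id, name)
```

If the number of printed categories differs from the NUM_CLASSES value in the config, that mismatch is worth fixing before training.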

abhiagwl4262 commented 1 year ago

@FengLi-ust Yes, I was able to use my custom data after registering it with register_coco_instances. Thanks for your answer. Now I am getting the following error during an einsum (Einstein summation) call:

  File "/home/ubuntu/abhishek/maskdino/maskdino/modeling/transformer_decoder/maskdino_decoder.py", line 502, in forward_prediction_heads
    decoder_output = self.decoder_norm(output)
  File "/home/ubuntu/anaconda3/envs/maskdino/lib/python3.8/site-packages/torch/functional.py", line 378, in einsum
    return _VF.einsum(equation, operands)  # type: ignore[attr-defined]
RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling `cublasGemmStridedBatchedExFix( handle, opa, opb, m, n, k, (void*)(&falpha), a, CUDA_R_16F, lda, stridea, b, CUDA_R_16F, ldb, strideb, (void*)(&fbeta), c, CUDA_R_16F, ldc, stridec, num_batches, CUDA_R_32F, CUBLAS_GEMM_DEFAULT_TENSOR_OP)`

philiphaddad97 commented 1 year ago

I did the following and it worked for me.

Disable the repository's built-in dataset registrations.

Add this function:

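import os

from detectron2.data.datasets import register_coco_instances


def register_custom_coco_dataset(dataset_path: str) -> None:
    # Register the train and validation splits under new names so they
    # do not clash with the built-in COCO datasets.
    annotations_path = os.path.join(dataset_path, "annotations")
    register_coco_instances(
        "coco_train",
        {},
        os.path.join(annotations_path, "instances_train2017.json"),
        os.path.join(dataset_path, "train2017"),
    )
    register_coco_instances(
        "coco_valid",
        {},
        os.path.join(annotations_path, "instances_val2017.json"),
        os.path.join(dataset_path, "val2017"),
    )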

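In the setup function, add these lines:

def setup(args):
    # ... existing config creation and merging ...
    # Point training and evaluation at the newly registered datasets
    # (set these before the config is frozen).
    cfg.DATASETS.TRAIN = ("coco_train",)
    cfg.DATASETS.TEST = ("coco_valid",)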

abhiagwl4262 commented 1 year ago

Yeah @philiphaddad97, I also registered my custom dataset with the MaskDINO framework as shown above.

abhiagwl4262 commented 1 year ago

I was able to resolve the CUBLAS_STATUS_INVALID_VALUE error. The issue was with my setup: the PyTorch build I installed was compiled with CUDA 11.7, while my system has CUDA 11.3. I reinstalled the dependencies more carefully and the error went away.
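For anyone hitting something similar, a quick way to spot this kind of mismatch is to compare the CUDA version PyTorch was built with (torch.version.cuda) against the release reported by nvcc --version. The sketch below is only an illustration: the helper names parse_nvcc_release and same_major_minor are hypothetical, and it assumes nvcc is on the PATH and prints its usual "release X.Y" line.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Extract the 'major.minor' release from `nvcc --version` text, or None if absent."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def same_major_minor(a, b):
    """Return True when both version strings share the same major.minor prefix."""
    return a.split(".")[:2] == b.split(".")[:2]


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None
    nvcc = shutil.which("nvcc")
    # Only run the comparison when both a CUDA build of PyTorch and nvcc are available.
    if torch is not None and torch.version.cuda and nvcc:
        out = subprocess.run([nvcc, "--version"], capture_output=True, text=True).stdout
        system_cuda = parse_nvcc_release(out)
        print("PyTorch built with CUDA:", torch.version.cuda)
        print("System nvcc release:", system_cuda)
        if system_cuda and not same_major_minor(torch.version.cuda, system_cuda):
            print("Versions differ; consider installing a matching PyTorch build.")
```

In the case described above, this would report 11.7 for PyTorch and 11.3 for the system toolkit.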

FengLi-ust commented 1 year ago

Cool!

maichm commented 1 year ago

It seems hard to start training.

IDEA-Research / MaskDINO

Training On custom instance segmentation dataset #44