UX-Decoder / Segment-Everything-Everywhere-All-At-Once

[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"
Apache License 2.0
4.41k stars 408 forks

How to train your own dataset #140

Open biyuefeng opened 7 months ago

biyuefeng commented 7 months ago

Hello, I currently have my own dataset labeled with labelme, which has been converted to COCO format. The directory structure is as follows:

${DATASET_ROOT}          # dataset root, e.g. /home/username/data/NWPU
├── annotations
│   ├── train.json
│   ├── val.json
│   └── test.json
└── images
    ├── train
    ├── val
    └── test

Excuse me, can this be trained directly?

EricZavier commented 6 months ago

I also want to train on my own dataset, but the situation does not look very promising. If you find a solution, please share it. Thank you very much.

jwyang commented 6 months ago

Hi, there are roughly two steps:

  1. register your dataset following the sample code in: https://github.com/UX-Decoder/Segment-Everything-Everywhere-All-At-Once/tree/v1.0/datasets/registration
  2. create a dataset mapper following the sample code in: https://github.com/UX-Decoder/Segment-Everything-Everywhere-All-At-Once/tree/v1.0/datasets/dataset_mappers

You can refer to the code for COCO or the other datasets used in our training.
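The two steps above can be sketched in plain Python. This is a minimal, framework-free illustration of the register-then-map pattern only; the real repo uses detectron2-style dataset registration and its own mapper classes, and the names `DATASET_CATALOG`, `register_coco_json`, and `simple_mapper` below are hypothetical:

```python
import json
import tempfile

# Hypothetical stand-in for a dataset catalog: maps a dataset name to a loader.
DATASET_CATALOG = {}

def register_coco_json(name, json_file, image_root):
    """Step 1: register a loader that turns a COCO-style JSON into per-image records."""
    def load():
        with open(json_file) as f:
            coco = json.load(f)
        anns_by_image = {}
        for ann in coco["annotations"]:
            anns_by_image.setdefault(ann["image_id"], []).append(ann)
        return [
            {
                "file_name": f"{image_root}/{img['file_name']}",
                "height": img["height"],
                "width": img["width"],
                "image_id": img["id"],
                "annotations": anns_by_image.get(img["id"], []),
            }
            for img in coco["images"]
        ]
    DATASET_CATALOG[name] = load

def simple_mapper(record):
    """Step 2: a mapper turns one registered record into a training sample
    (the real mappers also read the image and build tensors/augmentations)."""
    return {
        "image_id": record["image_id"],
        "file_name": record["file_name"],
        "category_ids": [a["category_id"] for a in record["annotations"]],
    }

# Demo with a tiny in-memory COCO file.
coco = {
    "images": [{"id": 1, "file_name": "a.jpg", "height": 480, "width": 640}],
    "annotations": [{"id": 10, "image_id": 1, "category_id": 3}],
    "categories": [{"id": 3, "name": "building"}],
}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(coco, f)
    path = f.name

register_coco_json("my_train", path, "images/train")
samples = [simple_mapper(r) for r in DATASET_CATALOG["my_train"]()]
```

In the repo itself, the registration script makes the dataset name resolvable from the training config, and the mapper does the per-sample work (image loading, augmentation, tensor conversion) at iteration time.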

EricZavier commented 6 months ago

Okay, thank you very much for your patient explanation. I'll give it a try now.

EricZavier commented 6 months ago

[Image 1: screenshot of JSON annotation files]

I have another question: how can I construct JSON files like those in the image from my annotated dataset?


MaureenZOU commented 6 months ago

To construct the JSON file, please ask GPT-4. Hhh
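Since the dataset in this thread was annotated with labelme, the conversion can also be sketched directly. This is a hedged, illustrative converter, not code from the repo: `labelme_to_coco`, the `label_to_id` mapping, and the sample record are all made up for the example, and it assumes the standard labelme per-image JSON layout (`imagePath`, `imageHeight`, `imageWidth`, and polygon `shapes`):

```python
def labelme_to_coco(labelme_records, label_to_id):
    """Convert a list of labelme-style per-image dicts into one COCO-style dict."""
    images, annotations = [], []
    ann_id = 1
    for img_id, rec in enumerate(labelme_records, start=1):
        images.append({
            "id": img_id,
            "file_name": rec["imagePath"],
            "height": rec["imageHeight"],
            "width": rec["imageWidth"],
        })
        for shape in rec["shapes"]:
            xs = [p[0] for p in shape["points"]]
            ys = [p[1] for p in shape["points"]]
            x0, y0 = min(xs), min(ys)
            w, h = max(xs) - x0, max(ys) - y0
            annotations.append({
                "id": ann_id,
                "image_id": img_id,
                "category_id": label_to_id[shape["label"]],
                # COCO polygon segmentation: flat [x1, y1, x2, y2, ...] lists.
                "segmentation": [[c for p in shape["points"] for c in p]],
                "bbox": [x0, y0, w, h],  # COCO bbox is [x, y, width, height]
                "area": w * h,           # bbox area as a rough stand-in
                "iscrowd": 0,
            })
            ann_id += 1
    categories = [{"id": i, "name": n}
                  for n, i in sorted(label_to_id.items(), key=lambda kv: kv[1])]
    return {"images": images, "annotations": annotations, "categories": categories}

# Demo on one illustrative labelme record (a rectangular polygon).
rec = {
    "imagePath": "a.jpg", "imageHeight": 480, "imageWidth": 640,
    "shapes": [{"label": "building",
                "points": [[10, 10], [110, 10], [110, 60], [10, 60]]}],
}
coco = labelme_to_coco([rec], {"building": 1})
```

The resulting dict can be dumped with `json.dump` into `annotations/train.json` etc. to match the directory layout described at the top of the thread.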

simranbajaj06 commented 1 month ago

@MaureenZOU @jwyang I have a COCO-format JSON for my custom dataset, and I used coco_panoptic_new_baseline_dataset_mapper.py as the dataset mapper.

After running the training command, I get this error:

File "/home/hp-atc/labellerr/Segment-Everything-Everywhere-All-At-Once/modeling/interface/prototype/attention_data_struct_seemv1.py", line 290, in update_spatial_results
    v_emb = results['pred_smaskembs']
KeyError: 'pred_smaskembs'

Although I do see this key when I print the result keys:

Input results keys: dict_keys(['aux_outputs', 'pred_logits', 'pred_masks', 'pred_gmasks', 'pred_smasks', 'pred_captions', 'pred_gtexts', 'pred_stexts', 'pred_smaskembs', 'pred_pspatials', 'pred_nspatials'])

Input results keys: dict_keys(['prev_mask'])

I log this key once, but it is printed four times, even though I am using a single GPU.
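The logs above suggest one plausible explanation: `update_spatial_results` is apparently called more than once, and at least one call receives a partial dict containing only `prev_mask`, which would raise the `KeyError` even though the full prediction dict does contain `pred_smaskembs`. The sketch below is a hypothetical, heavily simplified stand-in for that function (not the repo's actual code) showing a defensive guard for such partial inputs:

```python
def update_spatial_results(results):
    """Hypothetical simplification of the repo's function, with a guard added."""
    # Some callers apparently pass a dict containing only 'prev_mask';
    # skip the spatial update for those instead of raising KeyError.
    if "pred_smaskembs" not in results:
        return results
    v_emb = results["pred_smaskembs"]
    # ... the real code derives spatial-query outputs from v_emb here ...
    return dict(results, updated=True)

out1 = update_spatial_results({"prev_mask": None})        # passes through unchanged
out2 = update_spatial_results({"pred_smaskembs": [1.0]})  # proceeds with the update
```

Whether skipping is the right behavior (versus fixing the caller so the full dict is passed) depends on where in `attention_data_struct_seemv1.py` the partial call originates, so this is a debugging aid rather than a fix.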