ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Segmentation training error #9718

Closed NSBCypher closed 1 year ago

NSBCypher commented 2 years ago

Search before asking

YOLOv5 Component

No response

Bug

Hi guys!

When I follow this guide to train a segmented data set:

https://blog.roboflow.com/train-yolov5-instance-segmentation-custom-dataset/

I get the following error:

IndexError: boolean index did not match indexed array along dimension 0; dimension is 1 but corresponding boolean dimension is 16

I tried training locally and on Colab and got the same error.

Please find the full output below:

segment/train: weights=weights/yolov5s-seg.pt, cfg=, data=testsegment-1/data.yaml, hyp=data/hyps/hyp.scratch-low.yaml, epochs=100, batch_size=1, imgsz=640, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, noplots=False, evolve=None, bucket=, cache=None, image_weights=False, device=, multi_scale=False, single_cls=False, optimizer=SGD, sync_bn=False, workers=8, project=runs/train-seg, name=exp, exist_ok=False, quad=False, cos_lr=False, label_smoothing=0.0, patience=100, freeze=[0], save_period=-1, seed=0, local_rank=-1, mask_ratio=4, no_overlap=False
github: up to date with https://github.com/ultralytics/yolov5 ✅
YOLOv5 🚀 v6.2-185-ge4398cf Python-3.7.14 torch-1.12.1+cu113 CUDA:0 (Tesla T4, 15110MiB)

hyperparameters: lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0
TensorBoard: Start with 'tensorboard --logdir runs/train-seg', view at http://localhost:6006/
Overriding model.yaml nc=80 with nc=12

                 from  n    params  module                                  arguments                     
  0                -1  1      3520  models.common.Conv                      [3, 32, 6, 2, 2]              
  1                -1  1     18560  models.common.Conv                      [32, 64, 3, 2]                
  2                -1  1     18816  models.common.C3                        [64, 64, 1]                   
  3                -1  1     73984  models.common.Conv                      [64, 128, 3, 2]               
  4                -1  2    115712  models.common.C3                        [128, 128, 2]                 
  5                -1  1    295424  models.common.Conv                      [128, 256, 3, 2]              
  6                -1  3    625152  models.common.C3                        [256, 256, 3]                 
  7                -1  1   1180672  models.common.Conv                      [256, 512, 3, 2]              
  8                -1  1   1182720  models.common.C3                        [512, 512, 1]                 
  9                -1  1    656896  models.common.SPPF                      [512, 512, 5]                 
 10                -1  1    131584  models.common.Conv                      [512, 256, 1, 1]              
 11                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']          
 12           [-1, 6]  1         0  models.common.Concat                    [1]                           
 13                -1  1    361984  models.common.C3                        [512, 256, 1, False]          
 14                -1  1     33024  models.common.Conv                      [256, 128, 1, 1]              
 15                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']          
 16           [-1, 4]  1         0  models.common.Concat                    [1]                           
 17                -1  1     90880  models.common.C3                        [256, 128, 1, False]          
 18                -1  1    147712  models.common.Conv                      [128, 128, 3, 2]              
 19          [-1, 14]  1         0  models.common.Concat                    [1]                           
 20                -1  1    296448  models.common.C3                        [256, 256, 1, False]          
 21                -1  1    590336  models.common.Conv                      [256, 256, 3, 2]              
 22          [-1, 10]  1         0  models.common.Concat                    [1]                           
 23                -1  1   1182720  models.common.C3                        [512, 512, 1, False]          
 24      [17, 20, 23]  1    431737  models.yolo.Segment                     [12, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], 32, 128, [128, 256, 512]]
Model summary: 225 layers, 7437881 parameters, 7437881 gradients, 26.0 GFLOPs

Transferred 361/367 items from weights/yolov5s-seg.pt
AMP: checks passed ✅
optimizer: SGD(lr=0.01) with parameter groups 60 weight(decay=0.0), 63 weight(decay=0.0005), 63 bias
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8))
train: Scanning '/content/yolov5/testsegment-1/train/labels.cache' images and labels... 134 found, 0 missing, 0 empty, 0 corrupt: 100% 134/134 [00:00<?, ?it/s]
val: Scanning '/content/yolov5/testsegment-1/valid/labels.cache' images and labels... 48 found, 0 missing, 0 empty, 0 corrupt: 100% 48/48 [00:00<?, ?it/s]

AutoAnchor: 4.30 anchors/target, 0.986 Best Possible Recall (BPR). Current anchors are a good fit to dataset ✅
Plotting labels to runs/train-seg/exp5/labels.jpg... 
Image sizes 640 train, 640 val
Using 0 dataloader workers
Logging results to runs/train-seg/exp5
Starting training for 100 epochs...

      Epoch    GPU_mem   box_loss   seg_loss   obj_loss   cls_loss  Instances       Size
  0% 0/134 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "segment/train.py", line 674, in <module>
    main(opt)
  File "segment/train.py", line 570, in main
    train(opt.hyp, opt, device, callbacks)
  File "segment/train.py", line 293, in train
    for i, (imgs, targets, paths, _, masks) in pbar:  # batch ------------------------------------------------------
  File "/usr/local/lib/python3.7/dist-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "/content/yolov5/utils/dataloaders.py", line 171, in __iter__
    yield next(self.iterator)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 681, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 721, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/content/yolov5/utils/segment/dataloaders.py", line 113, in __getitem__
    img, labels, segments = self.load_mosaic(index)
  File "/content/yolov5/utils/segment/dataloaders.py", line 261, in load_mosaic
    border=self.mosaic_border)  # border to remove
  File "/content/yolov5/utils/segment/augmentations.py", line 102, in random_perspective
    new_segments = np.array(new_segments)[i]
IndexError: boolean index did not match indexed array along dimension 0; dimension is 1 but corresponding boolean dimension is 16

My dataset has polygons in this format, exported from roboflow: 8 0.9987424484374999 0.3347222234375 0.75713305625 0.29444444375 0.05729895625 0.89305555625 0.3336223390625 0.9847222234374999 0.9987424484374999 0.3347222234375

glenn-jocher commented 2 years ago

@NSBCypher if you can train correctly with --data coco128-seg.yaml then there is a problem with your custom dataset. The roboflow team should be able to help you with that.

NSBCypher commented 2 years ago

@glenn-jocher it works with coco128-seg.yaml. The difference is that coco128-seg.yaml is entirely in polygon format, while my dataset has some labels in polygon format and some in bounding box format. Should the entire dataset be polygons to work with segmentation on YOLOv5? Should even bounding boxes be converted to polygon format?

glenn-jocher commented 2 years ago

@Jacobsolawetz can you take a look at this? Thanks!

glenn-jocher commented 2 years ago

@NSBCypher training a segmentation model requires segmentation labels. It's not possible to mix labels from different tasks.
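
For readers wondering where the message itself comes from: NumPy raises this IndexError whenever a boolean mask does not have the same length as the array it indexes. The sketch below reproduces that situation with made-up shapes. The variable names mirror the failing line in utils/segment/augmentations.py, but the idea that one polygon was loaded against 16 label rows is an inference from the error text, not something confirmed in this thread. The exact wording of the message can also differ between NumPy versions.

```python
import numpy as np

# Hypothetical shapes: 1 polygon was loaded, but there are 16 label rows.
new_segments = [np.zeros((4, 2))]   # one polygon with 4 (x, y) vertices
i = np.ones(16, dtype=bool)         # keep-mask with one flag per label row

try:
    np.array(new_segments)[i]       # same indexing pattern as the failing line
    message = ""
except IndexError as err:
    message = str(err)

print(message)
```

If every image contributes one polygon per label row, the mask and the polygon array have matching lengths and this indexing succeeds.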

yeldarby commented 2 years ago

@NSBCypher training a segmentation model requires segmentation labels. It's not possible to mix labels from different tasks.

@glenn-jocher we actually have a fork that adds support for this into the dataloader that we've been using internally; is that something you'd be interested in having us contribute back?

glenn-jocher commented 2 years ago

@yeldarby hmm interesting, sure send it over and I'll take a look. If we have mixed boxes and segments though, i.e. if we have a dataset with 100 person boxes-only and 100 person segments+boxes, is the idea that the box-only labels do not incur segment losses?

yeldarby commented 2 years ago

Our fork just trusts that the user decided a bbox was an appropriate approximation of the object's shape.

Actually, now that I look more closely at this issue, we had added it so people got the benefits of polygons for object detection and could take advantage of, e.g., copy/paste augmentation even if they had only labeled some of their objects as polygons. I'm not sure whether the recent segmentation additions will break it again, since we forked from an older version.

We'll investigate & report back. Alternative would be converting boxes to a 4-vertex polygon.

NSBCypher commented 2 years ago

@yeldarby yes, what I did now is converted the polygons into boxes, training the model to see how it performs. Next I will convert the boxes to 4 vertex polygons and train as segmentation and see how it compares. Thank you guys ♥️

NSBCypher commented 2 years ago

@yeldarby any tips on the best method or a small snippet to convert yolov5 format boxes to 4-vertex polygons?

NSBCypher commented 2 years ago

            // Convert one YOLO box label ("category xmid ymid w h", normalized to 0..1)
            // into a 4-vertex polygon label line.
            function boxToPolygon(line) {
                const [category, xmid, ymid, w, h] = line.trim().split(/\s+/);
                const segmentx = Number(w) / 2;
                const segmenty = Number(h) / 2;
                // Clamp the corners to the normalized image range
                const minx = Math.max(0, Number(xmid) - segmentx);
                const maxx = Math.min(1, Number(xmid) + segmentx);
                const miny = Math.max(0, Number(ymid) - segmenty);
                const maxy = Math.min(1, Number(ymid) + segmenty);
                // Corner order: top-left, top-right, bottom-right, bottom-left
                return [category, minx, miny, maxx, miny, maxx, maxy, minx, maxy].join(' ');
            }

Got it!
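
A Python sketch of the same idea applied to a whole label file, in case it is useful: rows with exactly five values are treated as boxes and rewritten as 4-vertex polygons, and every other row is passed through untouched. The function name `boxes_to_polygons` and the "five values means box" rule are assumptions made for illustration; nothing in this thread shows such a helper shipping with YOLOv5.

```python
def boxes_to_polygons(label_text: str) -> str:
    """Rewrite 'cls xc yc w h' rows as 'cls x1 y1 x2 y2 x3 y3 x4 y4'; keep other rows as-is."""
    out = []
    for line in label_text.splitlines():
        parts = line.split()
        if len(parts) != 5:          # assumption: only 5-value rows are boxes
            out.append(line)
            continue
        cls = parts[0]
        xc, yc, w, h = map(float, parts[1:])
        # Clamp the corners to the normalized [0, 1] range
        x_min, x_max = max(0.0, xc - w / 2), min(1.0, xc + w / 2)
        y_min, y_max = max(0.0, yc - h / 2), min(1.0, yc + h / 2)
        # Corner order: top-left, top-right, bottom-right, bottom-left
        corners = [x_min, y_min, x_max, y_min, x_max, y_max, x_min, y_max]
        out.append(" ".join([cls] + [f"{v:.6g}" for v in corners]))
    return "\n".join(out)


mixed = "1 0.5 0.5 0.5 0.25\n3 0.1 0.1 0.9 0.1 0.5 0.9"
converted = boxes_to_polygons(mixed)
print(converted)
```

Whether a rectangle is a good enough stand-in for the object's shape is a judgment call for the person labeling the data.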

satpalsr commented 1 year ago

@glenn-jocher In case a single object is "separated" into two or more parts, we can label it with two or more polygons in the annotations as a "list of lists", as in the COCO-Stuff dataset JSON files.

How should I do it with YOLO polygon-format txt files? Note: I don't want to treat the two polygons as separate objects, because they are actually part of the same object.

glenn-jocher commented 1 year ago

@Laughing-q what's the right labelling approach here for split objects (i.e. car behind a tree) if users are labelling directly in YOLO segmentation format?

Laughing-q commented 1 year ago

@satpalsr @glenn-jocher [image: pic-selected-221012-2004-43] Let's say there are two parts that belong to one object. We use a thin line to connect the parts, which makes them one object. Just follow the order of the numbers (1 --> 15) to label the segments. Actually, 4 and 12 should be at the same position (and so should 5 and 11); the two lines in the picture are drawn apart only to make it easier to read. By the way, 4(12) and 5(11) should be the closest coordinates between the two parts.
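
One way to read the numbering described above, as a sketch: find the closest pair of vertices between the two parts, walk the first polygon up to its bridge vertex, go once around the second polygon starting and ending at its bridge vertex, then finish the first polygon from the bridge vertex. The helper name `join_two_polygons` is hypothetical and the code is not taken from the YOLOv5 code base.

```python
import numpy as np


def join_two_polygons(a, b):
    """Join polygons a (n, 2) and b (m, 2) into one vertex sequence via their closest vertex pair."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Squared distance between every vertex of a and every vertex of b
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    i, j = np.unravel_index(d2.argmin(), d2.shape)
    # a up to vertex i, all of b starting and ending at vertex j, then the rest of a from vertex i
    return np.concatenate([a[: i + 1], b[j:], b[: j + 1], a[i:]])


part_a = [(0.0, 0.0), (0.2, 0.0), (0.2, 0.2), (0.0, 0.2)]
part_b = [(0.3, 0.2), (0.5, 0.2), (0.5, 0.4), (0.3, 0.4)]
joined = join_two_polygons(part_a, part_b)
print(joined.shape)
```

The two bridge vertices each appear twice in the result, which corresponds to the "4 and 12" and "5 and 11" pairs sharing a position in the description above.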

glenn-jocher commented 1 year ago

@Laughing-q ah got it! Are you sure there's no unique() calls in the YOLOv5 segmentation dataloaders? I don't recall any but haven't checked the code recently.

Laughing-q commented 1 year ago

@glenn-jocher you mean this? https://github.com/ultralytics/yolov5/blob/7a69035eb8a15f44a1dc8f1e07ee71b674e98271/utils/dataloaders.py#L988-L992 then I think we have it.

glenn-jocher commented 1 year ago

@Laughing-q ah, this is actually ok. This is a duplicate row check in a labels.txt file. This only cares about two labels being exactly the same in a single image (duplicate labels), it doesn't care about what's in each label.

Laughing-q commented 1 year ago

@glenn-jocher yeah, we don't call unique() on the coordinates within each label, so everything's fine here.

glenn-jocher commented 1 year ago

Yes correct.

github-actions[bot] commented 1 year ago

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.

Access additional YOLOv5 🚀 resources:

Access additional Ultralytics ⚡ resources:

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!

liuhhx commented 1 year ago

@NSBCypher if you can train correctly with --data coco128-seg.yaml then there is a problem with your custom dataset. The roboflow team should be able to help you with that.

I can't train correctly with --data coco128-seg.yaml and get same error but can train correctly with --data coco128.yaml using detect model.

IndexError: boolean index did not match indexed array along dimension 0; dimension is 0 but corresponding boolean dimension is 28

WongVi commented 1 year ago

@glenn-jocher @NSBCypher @Laughing-q
Could you please help me solve this issue with training YOLOv5 segmentation? How can I convert the ICDAR data format to the segmentation training format? The text coordinates are given as 8 values (4 points). I want to convert them into the YOLOv5 segmentation training data format. Please help me. The text files have the following format: cls_id x1 y1 x2 y2 x3 y3 x4 y4

[image]

glenn-jocher commented 1 year ago

@liuhhx segmentation training works correctly. Follow tutorial notebook for usage examples: https://colab.research.google.com/github/ultralytics/yolov5/blob/master/segment/tutorial.ipynb

@WongVi the format you show is already in YOLOv5 segmentation format. See Colab notebook above for COCO128-seg training, you can view the label text files there for more details.

WongVi commented 1 year ago

@glenn-jocher when I start training I get an error. Could you please let me know how I can solve it? I also want to get rid of the warnings by normalizing the data. Could you please share some ideas about that too?

I think I am facing this issue because of a data normalization problem, but I have no idea how to fix it.

[image]

glenn-jocher commented 1 year ago

@WongVi 👋 Hello! Thanks for asking about YOLOv5 🚀 dataset formatting. To train correctly your data must be in YOLOv5 format. Please see our Train Custom Data tutorial for full documentation on dataset setup and all steps required to start training your first model. A few excerpts from the tutorial:

1.1 Create dataset.yaml

COCO128 is an example small tutorial dataset composed of the first 128 images in COCO train2017. These same 128 images are used for both training and validation to verify our training pipeline is capable of overfitting. data/coco128.yaml, shown below, is the dataset config file that defines 1) the dataset root directory path and relative paths to train / val / test image directories (or *.txt files with image paths) and 2) a class names dictionary:

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco128  # dataset root dir
train: images/train2017  # train images (relative to 'path') 128 images
val: images/train2017  # val images (relative to 'path') 128 images
test:  # test images (optional)

# Classes (80 COCO classes)
names:
  0: person
  1: bicycle
  2: car
  ...
  77: teddy bear
  78: hair drier
  79: toothbrush

1.2 Create Labels

After using a tool like Roboflow Annotate to label your images, export your labels to YOLO format, with one *.txt file per image (if no objects in image, no *.txt file is required). The *.txt file specifications are:

The label file corresponding to the above image contains 2 persons (class 0) and a tie (class 27):

1.3 Organize Directories

Organize your train and val images and labels according to the example below. YOLOv5 assumes /coco128 is inside a /datasets directory next to the /yolov5 directory. YOLOv5 locates labels automatically for each image by replacing the last instance of /images/ in each image path with /labels/. For example:

../datasets/coco128/images/im0.jpg  # image
../datasets/coco128/labels/im0.txt  # label
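
The path rule described above can be sketched in a few lines of Python. This is only an illustration that assumes forward-slash paths; the helper name `img_to_label_path` is hypothetical and YOLOv5's own implementation may differ in detail.

```python
def img_to_label_path(img_path: str) -> str:
    """Swap the last '/images/' for '/labels/' and the file extension for '.txt'."""
    head, sep, tail = img_path.rpartition("/images/")
    if not sep:
        raise ValueError(f"no /images/ directory in {img_path!r}")
    stem = tail.rsplit(".", 1)[0]    # drop the image extension
    return f"{head}/labels/{stem}.txt"


label_path = img_to_label_path("../datasets/coco128/images/im0.jpg")
print(label_path)
```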

Good luck 🍀 and let us know if you have any other questions!

WongVi commented 1 year ago

@glenn-jocher

@liuhhx segmentation training works correctly. Follow tutorial notebook for usage examples: https://colab.research.google.com/github/ultralytics/yolov5/blob/master/segment/tutorial.ipynb

@WongVi the format you show is already in YOLOv5 segmentation format. See Colab notebook above for COCO128-seg training, you can view the label text files there for more details.

Please just let me know how I can normalize the x1,y1,x2,y2,x3,y3,x4,y4 labelme coordinates to train the segmentation model. I tried to find a way, but I could not find any solution for training on data in this format, or any method for converting it.

I trained the detection model with this dataset, but for segmentation I am facing the errors shown below. You always point to the detection training dataset format. [image]
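
For the normalization question, here is a sketch under the assumption that the x1..y4 values are pixel coordinates and that the image width and height are known. YOLO-format coordinates are fractions of the image size, so each x is divided by the image width and each y by the image height. The function name `normalize_polygon` and the clamping to [0, 1] are illustrative choices, not part of YOLOv5.

```python
def normalize_polygon(cls_id, pixel_xy, img_w, img_h):
    """Return a label row 'cls x1 y1 ... xn yn' with coordinates scaled to [0, 1]."""
    if len(pixel_xy) % 2 or len(pixel_xy) < 6:
        raise ValueError("expected at least 3 (x, y) pairs")
    values = []
    for k, v in enumerate(pixel_xy):
        scale = img_w if k % 2 == 0 else img_h   # even positions are x, odd positions are y
        values.append(min(1.0, max(0.0, v / scale)))
    return " ".join([str(cls_id)] + [f"{v:.6g}" for v in values])


row = normalize_polygon(0, [64, 48, 320, 48, 320, 240, 64, 240], img_w=640, img_h=480)
print(row)
```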

f-izzat commented 1 year ago

Hi @glenn-jocher , I got the same error here (using the same example as OP but with my own dataset) . IndexError: boolean index did not match indexed array along dimension 0; dimension is 0 but corresponding boolean dimension is 4

My labels are of the bounding-box format: 0 0.4825 0.2541 0.6678 0.3209

glenn-jocher commented 1 year ago

@f-izzat segmentation training requires segment labels, you have a box label shown. segment labels are cls, xy1, xy2, xy3, xy4, etc...
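
A quick way to spot this kind of mismatch before training is to count the values on each label row: a box row has a class id plus 4 numbers, while a segment row has a class id plus at least 3 (x, y) pairs. The sketch below does just that. The function name `label_row_kind` is hypothetical, and the thresholds are a reading of the formats described in this thread rather than an official validator.

```python
def label_row_kind(row: str) -> str:
    """Classify a label row as 'box', 'polygon' or 'invalid' by its value count."""
    n = len(row.split()) - 1     # number of coordinates after the class id
    if n == 4:
        return "box"
    if n >= 6 and n % 2 == 0:
        return "polygon"
    return "invalid"


rows = [
    "0 0.4825 0.2541 0.6678 0.3209",
    "3 0.509375 0.5890625 1 0.5890625 1 0 0.509375 0 0.509375 0.5890625",
]
kinds = [label_row_kind(r) for r in rows]
print(kinds)
```

Running this over every label file and checking for any "box" result would flag the files that need converting before segmentation training.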

hdnh2006 commented 1 year ago

I am having the same error. Surely there's something wrong with my dataset, as @glenn-jocher says, but the log shown is not specific enough.

Could somebody check what the error with the dataset was?

glenn-jocher commented 1 year ago

@hdnh2006 thanks for the bug report! Are you using a Roboflow segmentation dataset also? What is the error message?

hdnh2006 commented 1 year ago

@hdnh2006 thanks for the bug report! Are you using a Roboflow segmentation dataset also? What is the error message?

Hi @glenn-jocher, it is a personal dataset. It wasn't downloaded from Roboflow or any other source. It was labeled using COCO annotations and then transformed into YOLO format using your json2yolo code.

I am trying to debug and I'll let you know any information.

Thanks in advance.

f-izzat commented 1 year ago

Hi @glenn-jocher , sorry yes. I just followed the roboflow steps and managed to get the correct format for instance segmentation i.e exported as YoloV5 format

Though the same error pops up

Traceback (most recent call last):
  File "/content/drive/MyDrive/GIT/yolov5/segment/train.py", line 658, in <module>
    main(opt)
  File "/content/drive/MyDrive/GIT/yolov5/segment/train.py", line 554, in main
    train(opt.hyp, opt, device, callbacks)
  File "/content/drive/MyDrive/GIT/yolov5/segment/train.py", line 283, in train
    for i, (imgs, targets, paths, _, masks) in pbar:  # batch ------------------------------------------------------
  File "/usr/local/lib/python3.7/dist-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "/content/drive/MyDrive/GIT/yolov5/utils/dataloaders.py", line 171, in __iter__
    yield next(self.iterator)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 681, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1376, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1402, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.7/dist-packages/torch/_utils.py", line 461, in reraise
    raise exception
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/content/drive/MyDrive/GIT/yolov5/utils/segment/dataloaders.py", line 114, in __getitem__
    img, labels, segments = self.load_mosaic(index)
  File "/content/drive/MyDrive/GIT/yolov5/utils/segment/dataloaders.py", line 262, in load_mosaic
    border=self.mosaic_border)  # border to remove
  File "/content/drive/MyDrive/GIT/yolov5/utils/segment/augmentations.py", line 102, in random_perspective
    new_segments = np.array(new_segments)[i]

IndexError: boolean index did not match indexed array along dimension 0; dimension is 8 but corresponding boolean dimension is 14

Basically, I labelled my images using labelme and imported them into Roboflow to convert them into YOLOv5 format. My dataset contains a mixture of polygons and rectangles.

Example of label file 1:

1 0.24140625 0.1296875 0.4546875 0.259375
2 0.734375 0.1296875 0.53125 0.259375
2 0.234375 0.6296875 0.46875 0.740625
1 0.734375 0.6296875 0.53125 0.740625

Example of label file 2:

1 0.2546875 0.29453125 0.509375 0.5890625
3 0.509375 0.5890625 1 0.5890625 1 0 0.509375 0 0.509375 0.5890625
1 0.2546875 0.79609375 0.509375 0.4078125
2 0.7546875 0.79453125 0.490625 0.4109375

Note: I ran the segment tutorial notebook and it works just fine.

glenn-jocher commented 1 year ago

@f-izzat yes this is because your format is incorrect. Training a segmentation model requires segmentation labels. It's not possible to mix labels from different tasks (you've got box and segment labels intermingled).

WongVi commented 1 year ago

I successfully trained the segmentation network using the labelme data format and compared the results with YOLO detection, but the results of the segmentation method are lower than those of the detection method.

glenn-jocher commented 1 year ago

@WongVi can you clarify?

timothylimyl commented 1 year ago

@satpalsr @glenn-jocher [image: pic-selected-221012-2004-43] Let's say there are two parts that belong to one object. We use a thin line to connect the parts, which makes them one object. Just follow the order of the numbers (1 --> 15) to label the segments. Actually, 4 and 12 should be at the same position (and so should 5 and 11); the two lines in the picture are drawn apart only to make it easier to read. By the way, 4(12) and 5(11) should be the closest coordinates between the two parts.

I'm not very clear on this. Since most data formats keep separate polygons under the same grouping (object), how does this translate to the yolov5-seg format? My concern is that wrong polygon information is given to the model during training for objects that have more than one polygon.

Maybe we can add a flag id at the end to indicate grouping?

If we follow the current yolo-seg format, you will not be able to properly visualise the separate polygons, because that information is already lost once all the points are joined together.

glenn-jocher commented 1 year ago

@minazamani7 the thin lines displayed in the image are used to connect disjoint parts of an object and help the annotation process. In practice, you should not create a new shape by joining multiple disjoint polygons. Instead, each object should be represented as a separate contour. The order in which you number the segments does not matter for the training. If an object has multiple contours they should be labeled as separate segments (e.g., two slices of the same cucumber).

For your use case, you may want to consider using a dictionary approach where each object ID has a list of polygons that belong to it. Then, during training, you can convert the polygons of the same object to a mask for that object so that you do not lose this information during training.

Please note that the primary purpose of joining segments in YOLOv5 format is to represent deformations for a single object (e.g., the open-lid of a box) and should not be used to represent multiple objects.
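
The dictionary approach mentioned above could look roughly like the sketch below: each object id maps to a list of normalized polygons, and the polygons of one object are rasterized and combined into a single mask. Everything here is an assumption made for illustration, including the `objects` layout, the `polygon_mask` helper and the NumPy-only even-odd rasterizer; YOLOv5 does not read labels in this form, so this would have to live in custom data-loading code.

```python
import numpy as np


def polygon_mask(poly, h, w):
    """Rasterize one normalized polygon (n, 2) into an (h, w) boolean mask (even-odd rule)."""
    poly = np.asarray(poly, dtype=float)
    ys, xs = np.mgrid[0:h, 0:w]
    px, py = xs + 0.5, ys + 0.5                 # pixel centers
    ax, ay = poly[:, 0] * w, poly[:, 1] * h     # edge start points in pixels
    bx, by = np.roll(ax, -1), np.roll(ay, -1)   # edge end points
    inside = np.zeros((h, w), dtype=bool)
    for x0, y0, x1, y1 in zip(ax, ay, bx, by):
        # Edges that straddle the horizontal line through each pixel center
        crosses = (y0 > py) != (y1 > py)
        with np.errstate(divide="ignore", invalid="ignore"):
            x_at = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
        # Toggle pixels that lie to the left of the crossing point
        inside ^= crosses & (px < x_at)
    return inside


# Hypothetical grouping: object id -> list of normalized polygons for that object
objects = {
    0: [
        [(0.0, 0.0), (0.25, 0.0), (0.25, 0.5), (0.0, 0.5)],
        [(0.5, 0.5), (1.0, 0.5), (1.0, 1.0), (0.5, 1.0)],
    ],
}
h, w = 8, 8
masks = {
    oid: np.any([polygon_mask(p, h, w) for p in polys], axis=0)
    for oid, polys in objects.items()
}
print(masks[0].astype(int))
```

Keeping the polygons grouped per object id until the mask is built means the two parts stay associated with one object without needing a connecting line between them.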