Closed NSBCypher closed 1 year ago
@NSBCypher if you can train correctly with --data coco128-seg.yaml then there is a problem with your custom dataset. The roboflow team should be able to help you with that.
@glenn-jocher it works with coco128-seg.yaml; the difference is that coco128-seg.yaml is entirely in polygon format while my dataset has some labels in polygon format and some in bounding box format. Should the entire dataset be polygons to work with segmentation on yolov5? Should even the bounding boxes be converted to polygon format?
@Jacobsolawetz can you take a look at this? Thanks!
@NSBCypher training a segmentation model requires segmentation labels. It's not possible to mix labels from different tasks.
@glenn-jocher we actually have a fork that adds support for this into the dataloader that we've been using internally; is that something you'd be interested in having us contribute back?
@yeldarby hmm interesting, sure send it over and I'll take a look. If we have mixed boxes and segments though, i.e. if we have a dataset with 100 person boxes-only and 100 person segments+boxes, is the idea that the box-only labels do not incur segment losses?
Our fork just trusts that the user decided a bbox was an appropriate approximation of the object's shape.
Actually now that I look more closely at this issue, we had added it so people got the benefits of polygons for object detection (& could take advantage of e.g. copy/paste augmentation even if they had only labeled some of their objects as polygons) -- not sure if the recent segmentation additions will break it again or not since we forked from an older version.
We'll investigate & report back. Alternative would be converting boxes to a 4-vertex polygon.
@yeldarby yes, what I did now is converted the polygons into boxes, training the model to see how it performs. Next I will convert the boxes to 4 vertex polygons and train as segmentation and see how it compares. Thank you guys ♥️
@yeldarby any tips on the best method or a small snippet to convert yolov5 format boxes to 4-vertex polygons?
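// Convert one YOLO box row "class x_center y_center width height" (normalized 0-1)
// into a 4-vertex polygon row. The function name is only illustrative.
function boxToPolygonLine(line) {
  const [category, xmid, ymid, w, h] = line.trim().split(/\s+/);
  const segmentx = Number(w) / 2; // half of the box width
  const segmenty = Number(h) / 2; // half of the box height
  // Clamp the corners to the image bounds
  const minx = Math.max(0, Number(xmid) - segmentx);
  const maxx = Math.min(1, Number(xmid) + segmentx);
  const miny = Math.max(0, Number(ymid) - segmenty);
  const maxy = Math.min(1, Number(ymid) + segmenty);
  // Vertices in order: top-left, top-right, bottom-right, bottom-left
  const x1 = minx;
  const y1 = miny;
  const x2 = maxx;
  const y2 = miny;
  const x3 = maxx;
  const y3 = maxy;
  const x4 = minx;
  const y4 = maxy;
  return category + ' ' + x1 + ' ' + y1 + ' ' + x2 + ' ' + y2 + ' ' + x3 + ' ' + y3 + ' ' + x4 + ' ' + y4;
}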
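For anyone doing this conversion in Python rather than JavaScript, here is a rough equivalent of the snippet above. It is only a sketch: the function name box_to_polygon is made up for illustration, and it assumes each input row is a normalized "class x_center y_center width height" box label.

```python
def box_to_polygon(line: str) -> str:
    """Convert 'class x_center y_center width height' to a 4-vertex polygon row."""
    category, xmid, ymid, w, h = line.split()
    xmid, ymid, w, h = map(float, (xmid, ymid, w, h))
    # Clamp the corners to the normalized [0, 1] image range.
    minx, maxx = max(0.0, xmid - w / 2), min(1.0, xmid + w / 2)
    miny, maxy = max(0.0, ymid - h / 2), min(1.0, ymid + h / 2)
    # Vertex order: top-left, top-right, bottom-right, bottom-left.
    corners = [minx, miny, maxx, miny, maxx, maxy, minx, maxy]
    return " ".join([category] + [f"{v:.6g}" for v in corners])


print(box_to_polygon("0 0.5 0.5 0.5 0.25"))
```

Rows that are already polygons should be left untouched, so this would only be applied to rows with exactly four values after the class id.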
Got it!
@glenn-jocher In case a single object is "separated" into two or more parts, we can label it with two or more polygons in annotations as a "list of lists", as in the coco stuff dataset json files.
How should I do it with yolo polygon format txt files? Note: I don't want to consider the two polygons as separate objects because they are actually part of the same object.
@Laughing-q what's the right labelling approach here for split objects (i.e. car behind a tree) if users are labelling directly in YOLO segmentation format?
@satpalsr @glenn-jocher
Let's say there are two parts that belong to one object. We use a thin, tiny line to connect the parts, which makes them one object.
Just follow the order of the numbers (1 --> 15) to label segments. Actually, 4 and 12 should be at the same position (so should 5 and 11); the two lines in the picture are just for a more intuitive look.
Btw, 4 (12) and 5 (11) should be the closest coordinates between the two parts.
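The connecting-line idea described above can be sketched in a few lines of Python. This is a minimal illustration under my own assumptions, not code from YOLOv5: merge_two_parts is a hypothetical helper, each part is a list of (x, y) tuples, and the bridge is placed between the closest pair of vertices.

```python
def merge_two_parts(part_a, part_b):
    """Join two polygons (lists of (x, y) tuples) into one point sequence."""
    # Find the closest pair of vertices, one from each part.
    i, j = min(
        ((a, b) for a in range(len(part_a)) for b in range(len(part_b))),
        key=lambda ab: (part_a[ab[0]][0] - part_b[ab[1]][0]) ** 2
        + (part_a[ab[0]][1] - part_b[ab[1]][1]) ** 2,
    )
    # Walk part_b as a closed loop that starts and ends at its bridge vertex.
    loop_b = part_b[j:] + part_b[:j] + [part_b[j]]
    # part_a up to its bridge vertex, the detour around part_b, then the
    # bridge vertex again followed by the rest of part_a.
    return part_a[: i + 1] + loop_b + part_a[i:]


left = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
right = [(2.0, 0.0), (3.0, 0.0), (3.0, 1.0), (2.0, 1.0)]
merged = merge_two_parts(left, right)
```

The bridge vertices each appear twice in the result, which matches the "4 and 12 at the same position" description: the outline goes out along the thin line and comes back along it.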
@Laughing-q ah got it! Are you sure there's no unique() calls in the YOLOv5 segmentation dataloaders? I don't recall any but haven't checked the code recently.
@glenn-jocher you mean this? https://github.com/ultralytics/yolov5/blob/7a69035eb8a15f44a1dc8f1e07ee71b674e98271/utils/dataloaders.py#L988-L992 then I think we have it.
@Laughing-q ah, this is actually ok. This is a duplicate row check in a labels.txt file. This only cares about two labels being exactly the same in a single image (duplicate labels), it doesn't care about what's in each label.
@glenn-jocher yeah we don't unique coordinates in each label, so everything's fine here.
Yes correct.
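To illustrate the distinction discussed above, here is a small sketch of a row-level duplicate check. It is not the code at the linked lines, just an assumption-level illustration: drop_duplicate_rows is a hypothetical helper that removes label rows that are exactly identical while leaving repeated coordinates inside a single row alone.

```python
def drop_duplicate_rows(rows):
    """Remove exact duplicate label rows, keeping the first occurrence."""
    seen, kept = set(), []
    for row in rows:
        key = tuple(row.split())  # compare whole rows, not single coordinates
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


rows = [
    "0 0.1 0.1 0.5 0.1 0.5 0.5 0.1 0.1",  # first point repeated at the end
    "1 0.2 0.2 0.4 0.2 0.4 0.4",
    "0 0.1 0.1 0.5 0.1 0.5 0.5 0.1 0.1",  # exact duplicate of the first row
]
unique_rows = drop_duplicate_rows(rows)
```

The repeated (0.1, 0.1) point inside the first row survives, which is why bridged polygons with coincident vertices are not affected by this kind of check.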
👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.
Access additional YOLOv5 🚀 resources:
Access additional Ultralytics ⚡ resources:
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!
@NSBCypher if you can train correctly with --data coco128-seg.yaml then there is a problem with your custom dataset. The roboflow team should be able to help you with that.
I can't train correctly with --data coco128-seg.yaml and get the same error, but I can train correctly with --data coco128.yaml using the detect model.
IndexError: boolean index did not match indexed array along dimension 0; dimension is 0 but corresponding boolean dimension is 28
@glenn-jocher @NSBCypher @Laughing-q
Could you please help me solve this issue with training YOLO segmentation?
How can I convert the ICDAR data format to the segmentation training format?
The text coordinates are given as 8 values (4 points). I want to convert them into the YOLOv5 segmentation training data format.
Please help me.
The format of the text file is as follows:
cls_id x1 y1 x2 y2 x3 y3 x4 y4
@liuhhx segmentation training works correctly. Follow tutorial notebook for usage examples: https://colab.research.google.com/github/ultralytics/yolov5/blob/master/segment/tutorial.ipynb
@WongVi the format you show is already in YOLOv5 segmentation format. See Colab notebook above for COCO128-seg training, you can view the label text files there for more details.
@glenn-jocher when I start training I get an error. Could you please let me know how I can solve it? I would also like to get rid of the warnings by normalizing the data; could you please share some ideas about that too?
I think I am facing this issue because of a data normalization problem, but I don't know how to fix it.
@WongVi 👋 Hello! Thanks for asking about YOLOv5 🚀 dataset formatting. To train correctly your data must be in YOLOv5 format. Please see our Train Custom Data tutorial for full documentation on dataset setup and all steps required to start training your first model. A few excerpts from the tutorial:
COCO128 is an example small tutorial dataset composed of the first 128 images in COCO train2017. These same 128 images are used for both training and validation to verify our training pipeline is capable of overfitting. data/coco128.yaml, shown below, is the dataset config file that defines 1) the dataset root directory path and relative paths to train / val / test image directories (or *.txt files with image paths) and 2) a class names dictionary:
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco128 # dataset root dir
train: images/train2017 # train images (relative to 'path') 128 images
val: images/train2017 # val images (relative to 'path') 128 images
test: # test images (optional)
# Classes (80 COCO classes)
names:
  0: person
  1: bicycle
  2: car
  ...
  77: teddy bear
  78: hair drier
  79: toothbrush
After using a tool like Roboflow Annotate to label your images, export your labels to YOLO format, with one *.txt file per image (if no objects in image, no *.txt file is required). The *.txt file specifications are:
Each row is in class x_center y_center width height format.
Box coordinates must be normalized (from 0 to 1). If your boxes are in pixels, divide x_center and width by image width, and y_center and height by image height.
The label file corresponding to the above image contains 2 persons (class 0) and a tie (class 27).
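For boxes stored in pixels, the division by image width and height described above can be sketched as follows. This is a minimal example under stated assumptions: pixel_box_to_yolo is a hypothetical helper, and the input box is given as pixel corners (x_min, y_min, x_max, y_max) rather than as a center and size.

```python
def pixel_box_to_yolo(cls_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Return a normalized 'class x_center y_center width height' row."""
    # x values are divided by image width, y values by image height.
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{cls_id} {x_center:.6g} {y_center:.6g} {width:.6g} {height:.6g}"


row = pixel_box_to_yolo(0, 100, 50, 300, 150, img_w=400, img_h=200)
```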
Organize your train and val images and labels according to the example below. YOLOv5 assumes /coco128 is inside a /datasets directory next to the /yolov5 directory. YOLOv5 locates labels automatically for each image by replacing the last instance of /images/ in each image path with /labels/. For example:
../datasets/coco128/images/im0.jpg # image
../datasets/coco128/labels/im0.txt # label
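The path rule above (swap the last /images/ for /labels/ and use a .txt extension) can be sketched like this. The helper name image_to_label_path is made up for illustration, and the sketch assumes forward-slash paths; it is not the function YOLOv5 itself uses.

```python
import os


def image_to_label_path(img_path: str) -> str:
    """Map an image path to its label path by replacing the last '/images/'."""
    head, sep, tail = img_path.rpartition("/images/")
    if not sep:
        raise ValueError(f"no /images/ directory in path: {img_path}")
    # Drop the image extension and use .txt for the label file.
    return head + "/labels/" + os.path.splitext(tail)[0] + ".txt"


label_path = image_to_label_path("../datasets/coco128/images/im0.jpg")
```

Only the last occurrence is replaced, so a dataset root that itself contains a directory named images is left intact.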
Good luck 🍀 and let us know if you have any other questions!
@glenn-jocher
Please just let me know how I can normalize labelme x1,y1,x2,y2,x3,y3,x4,y4 coordinates to train the segmentation model. I tried to find a way, but I could not find any solution for training on data in this format, or any method for converting it.
I trained the detection model with this dataset, but for segmentation I am facing the errors shown below. You always just point to the detection training dataset format.
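One possible way to do the normalization asked about here, as a sketch only: it assumes each row is "cls_id x1 y1 x2 y2 x3 y3 x4 y4" with coordinates in pixels and that the image width and height are known. The name normalize_polygon_row is hypothetical.

```python
def normalize_polygon_row(line: str, img_w: float, img_h: float) -> str:
    """Scale pixel polygon coordinates into the normalized [0, 1] range."""
    cls_id, *coords = line.split()
    values = [float(v) for v in coords]
    if len(values) < 6 or len(values) % 2:
        raise ValueError("expected at least 3 (x, y) pairs after the class id")
    out = []
    for x, y in zip(values[0::2], values[1::2]):
        # x is divided by image width, y by image height, then clamped.
        out.append(min(max(x / img_w, 0.0), 1.0))
        out.append(min(max(y / img_h, 0.0), 1.0))
    return " ".join([cls_id] + [f"{v:.6g}" for v in out])


row = normalize_polygon_row("0 100 50 300 50 300 150 100 150", 400, 200)
```

The image size has to come from the image itself or from the annotation file; it is not stored in the label row.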
Hi @glenn-jocher , I got the same error here (using the same example as OP but with my own dataset) .
IndexError: boolean index did not match indexed array along dimension 0; dimension is 0 but corresponding boolean dimension is 4
My labels are of the bounding-box format:
0 0.4825 0.2541 0.6678 0.3209
@f-izzat segmentation training requires segment labels, you have a box label shown. segment labels are cls, xy1, xy2, xy3, xy4, etc...
I am having the same error. Surely there's something wrong with my dataset, as @glenn-jocher says, but the log shown is not specific enough.
Could somebody check what the error with the dataset was?
@hdnh2006 thanks for the bug report! Are you using a Roboflow segmentation dataset also? What is the error message?
Hi @glenn-jocher it is a personal dataset. It wasn't downloaded from roboflow or any other source. It was labeled using COCO annotations and then transformed into YOLO format using your json2yolo code.
I am trying to debug and I'll let you know any information.
Thanks in advance.
Hi @glenn-jocher , sorry yes. I just followed the roboflow steps and managed to get the correct format for instance segmentation i.e exported as YoloV5 format
Though the same error pops up
Traceback (most recent call last):
  File "/content/drive/MyDrive/GIT/yolov5/segment/train.py", line 658, in <module>
    main(opt)
  File "/content/drive/MyDrive/GIT/yolov5/segment/train.py", line 554, in main
    train(opt.hyp, opt, device, callbacks)
  File "/content/drive/MyDrive/GIT/yolov5/segment/train.py", line 283, in train
    for i, (imgs, targets, paths, _, masks) in pbar: # batch ------------------------------------------------------
  File "/usr/local/lib/python3.7/dist-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "/content/drive/MyDrive/GIT/yolov5/utils/dataloaders.py", line 171, in __iter__
    yield next(self.iterator)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 681, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1376, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1402, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.7/dist-packages/torch/_utils.py", line 461, in reraise
    raise exception
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/content/drive/MyDrive/GIT/yolov5/utils/segment/dataloaders.py", line 114, in __getitem__
    img, labels, segments = self.load_mosaic(index)
  File "/content/drive/MyDrive/GIT/yolov5/utils/segment/dataloaders.py", line 262, in load_mosaic
    border=self.mosaic_border) # border to remove
  File "/content/drive/MyDrive/GIT/yolov5/utils/segment/augmentations.py", line 102, in random_perspective
    new_segments = np.array(new_segments)[i]
IndexError: boolean index did not match indexed array along dimension 0; dimension is 8 but corresponding boolean dimension is 14
Basically I labelled my images using labelme and imported them into roboflow to convert into YoloV5 format. My dataset contains a mixture of polygons and rectangles.
Example of label file 1:
1 0.24140625 0.1296875 0.4546875 0.259375
2 0.734375 0.1296875 0.53125 0.259375
2 0.234375 0.6296875 0.46875 0.740625
1 0.734375 0.6296875 0.53125 0.740625
Example of label file 2:
1 0.2546875 0.29453125 0.509375 0.5890625
3 0.509375 0.5890625 1 0.5890625 1 0 0.509375 0 0.509375 0.5890625
1 0.2546875 0.79609375 0.509375 0.4078125
2 0.7546875 0.79453125 0.490625 0.4109375
Note: I ran the segment tutorial notebook and works just fine
@f-izzat yes this is because your format is incorrect. Training a segmentation model requires segmentation labels. It's not possible to mix labels from different tasks (you've got box and segment labels intermingled).
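A quick way to spot this kind of mixed file before training is to count the values on each row. The sketch below is only illustrative: row_kind and label_kinds are hypothetical helpers, and it assumes that a row with 4 values after the class id is a box and a row with at least 3 (x, y) pairs is a polygon.

```python
def row_kind(line: str) -> str:
    """Classify one label row as 'box', 'segment', or 'invalid'."""
    n_values = len(line.split()) - 1  # values after the class id
    if n_values == 4:
        return "box"
    if n_values >= 6 and n_values % 2 == 0:
        return "segment"
    return "invalid"


def label_kinds(lines):
    """Return the sorted set of row kinds found in one label file."""
    return sorted({row_kind(line) for line in lines if line.strip()})


# Label file 2 from the comment above: three boxes and one polygon.
label_file_2 = [
    "1 0.2546875 0.29453125 0.509375 0.5890625",
    "3 0.509375 0.5890625 1 0.5890625 1 0 0.509375 0 0.509375 0.5890625",
    "1 0.2546875 0.79609375 0.509375 0.4078125",
    "2 0.7546875 0.79453125 0.490625 0.4109375",
]
kinds = label_kinds(label_file_2)
```

Any file where the result contains both "box" and "segment" would need its box rows converted to polygons (or its polygons dropped) before segmentation training.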
I successfully trained the segmentation network using the labelme data format and compared the results with YOLO detection, but the results of the segmentation method are lower than those of the detection method.
@WongVi can you clarify?
@satpalsr @glenn-jocher Let's say there are two parts that belong to one object. We use a thin, tiny line to connect the parts, which makes them one object. Just follow the order of the numbers (1 --> 15) to label segments. Actually, 4 and 12 should be at the same position (so should 5 and 11); the two lines in the picture are just for a more intuitive look. Btw, 4 (12) and 5 (11) should be the closest coordinates between the two parts.
Not very clear on this. Since most data formats actually keep separate polygons under the same grouping (object), how does this translate to yolov5-seg format? My concern is that wrong polygon information is given to the model during training for objects that have more than 1 polygon.
Maybe we can add a flag id at the end to indicate grouping?
If we follow the yolo-seg format as it is now, you will not be able to properly visualise the separated polygons, as that information has already been lost upon joining all the points together.
@minazamani7 the thin lines displayed in the image are used to connect disjoint parts of an object and help the annotation process. In practice, you should not create a new shape by joining multiple disjoint polygons. Instead, each object should be represented as a separate contour. The order in which you number the segments does not matter for the training. If an object has multiple contours they should be labeled as separate segments (e.g., two slices of the same cucumber).
For your use case, you may want to consider using a dictionary approach where each object ID has a list of polygons that belong to it. Then, during training, you can convert the polygons of the same object to a mask for that object so that you do not lose this information during training.
Please note that the primary purpose of joining segments in YOLOv5 format is to represent deformations for a single object (e.g., the open-lid of a box) and should not be used to represent multiple objects.
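The dictionary approach mentioned above could look roughly like this. It is a sketch under my own assumptions, not part of YOLOv5: the object ID, the helper names, and the pure-Python even-odd rasterization are all illustrative, and polygons are assumed to be lists of normalized (x, y) tuples.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd ray casting test for a point against one polygon."""
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        # Toggle when a horizontal ray from (x, y) crosses edge j -> i.
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside


def object_mask(polygons, width, height):
    """Rasterize all polygons of one object into a single 0/1 mask."""
    mask = [[0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            # Sample each pixel at its center, in normalized coordinates.
            x, y = (col + 0.5) / width, (row + 0.5) / height
            if any(point_in_polygon(x, y, p) for p in polygons):
                mask[row][col] = 1
    return mask


# Hypothetical object ID -> list of polygons (two disjoint parts of one object).
objects = {
    "car_1": [
        [(0.0, 0.0), (0.25, 0.0), (0.25, 1.0), (0.0, 1.0)],
        [(0.75, 0.0), (1.0, 0.0), (1.0, 1.0), (0.75, 1.0)],
    ]
}
masks = {obj_id: object_mask(polys, 4, 2) for obj_id, polys in objects.items()}
```

Because the parts are kept as separate polygons until the mask is built, no bridge line is needed and the grouping information is preserved.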
Bug
Hi guys!
When I follow this guide to train a segmented data set:
https://blog.roboflow.com/train-yolov5-instance-segmentation-custom-dataset/
I get the following error:
IndexError: boolean index did not match indexed array along dimension 0; dimension is 1 but corresponding boolean dimension is 16
I tried training locally and on Colab, and got the same error.
Please find the full output below:
My dataset has polygons in this format, exported from roboflow:
8 0.9987424484374999 0.3347222234375 0.75713305625 0.29444444375 0.05729895625 0.89305555625 0.3336223390625 0.9847222234374999 0.9987424484374999 0.3347222234375