topherbuckley opened this issue 1 month ago
Hi @topherbuckley,

Unfortunately, the YOLOv5 annotation format has no solid specification and may be implemented differently by different organisations. YOLO annotation formats are generally very inconsistent, contrary to other, more stable formats (COCO, CVAT).

I don't think it would make sense to add the `relative` parameter to the `AnnotationSet.from_yolo_v5` method, as this would basically make it almost a duplicate of the more generalist `AnnotationSet.from_txt` method.

Here are two options for your use case:

- Use the `AnnotationSet.from_txt` method to manually specify the format.

Thanks for your reply. (And maintaining this!)
> Unfortunately, the YoloV5 annotation format has no solid specification and may be implemented differently by different organisations.
May I ask where the decision to use the current default of `relative=True` came from? Roboflow datasets? I'm just curious what was used as the "standard" here, since it appears the yolov5 `Detections` source code was not.
> I don't think it would make sense to add the relative parameter to the AnnotationSet.from_yolo_v5 method as this would basically make it almost a duplicate of the more generalist AnnotationSet.from_txt method.
Here is what I had to set, other than `relative=True`:

```python
from globox import AnnotationSet, BoxFormat

predictions = AnnotationSet.from_txt(
    folder="./detections",
    image_folder="./dataset/test",
    box_format=BoxFormat.XYWH,  # otherwise defaults to LTRB
    relative=False,
    conf_last=True,  # otherwise defaults to False
)
```
In the end it's not a huge deal either way now that I have it implemented, but I'd argue from other users' perspective that having to trace through the source to find the defaults and override three of them in `from_txt` is more difficult than overriding one default in `from_yolo_v5` when they know that's what they need to change.
Is there any documentation that shows what all the exposed and non-exposed parameters resolve to by the time they reach `BoundingBox.from_txt`? That would at least save people from having to trace it through the source. It might be easy enough to write a test script to log them and parse them into a doc. Or maybe this is included in the verbose output?
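For what it's worth, a quick version of that "log the defaults" script might look like the following (a minimal sketch using only the method names mentioned in this thread; it is not an existing globox utility):

```python
# Dump every signature-level default of the parsers discussed in this thread.
import inspect

from globox import AnnotationSet

for name in ("from_txt", "from_yolo_v5"):
    method = getattr(AnnotationSet, name)
    print(f"AnnotationSet.{name}:")
    for param in inspect.signature(method).parameters.values():
        if param.default is not inspect.Parameter.empty:
            print(f"    {param.name} = {param.default!r}")
```

This only surfaces defaults visible in the signatures, not anything hard-coded deeper in the call chain, but it would cover the three parameters above.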
@topherbuckley As far as I remember, the first YOLO annotation format was proposed by the Darknet framework (YOLO v1 to YOLO v3 era), and the coordinates were relative to the image size. This was for ground-truth annotations, at least. For predicted annotations, the confidence was stored before the coordinates, just after the label id, but I remember that the coordinates were still relative to the image size.
Things changed with the Ultralytics era of YOLO models (v5 to v7 and above). The confidence score for predictions is stored in last position in the annotation file. For the coordinates, I remember seeing both relative and absolute forms, but I don't remember well, thus I stuck to relative ones.
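To illustrate the two conventions described above (the field order of the Darknet-era prediction line is as recalled in this comment, so treat it as approximate):

```
# Darknet-era prediction line: confidence right after the label id,
# coordinates relative to the image size.
<label_id> <confidence> <x_center> <y_center> <width> <height>

# Ultralytics-era (YOLOv5+) prediction line: confidence stored last.
<label_id> <x_center> <y_center> <width> <height> <confidence>
```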
There are two possible solutions:

1. Add a `relative` parameter to all the `from_yolo` methods.
2. Add a dedicated `from_yolo_result` method.

In addition, I never wrote complete documentation; I can improve it for sure!
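For illustration, the call sites for the two options might look like this (both are hypothetical sketches; neither the `relative` parameter on `from_yolo_v5` nor the `from_yolo_result` method exists in globox at this point):

```python
from globox import AnnotationSet

# Option 1: expose `relative` on the existing file-based parser.
preds = AnnotationSet.from_yolo_v5(
    folder="./detections",
    image_folder="./dataset/test",
    relative=False,  # proposed new parameter
)

# Option 2: a new method consuming an in-memory yolov5 result.
results = ...  # a `Detections` object returned by a yolov5 hub model call
preds = AnnotationSet.from_yolo_result(results)  # proposed new method
```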
Thanks for the background info. As I mentioned in the related discussion, I'm happy to submit a PR for the latter of the two options, but I can only promise knowledge of the Ultralytics v5 results rather than any other variant you have built in here.
I don't have enough experience with this to know whether it is an actual issue, but the first option may also be useful in its own right, so users can pull detections either from disk or from memory depending on the size of the detection set. It may be favorable one way or the other depending on the machine being used (i.e. memory limitations). Just a thought; for now I'm happy to implement `from_yolo_result`.
In general, if you want to implement a new parser you have to implement the whole stack (a rough skeleton follows this list):

- `BoundingBox.from_yolov5_result`, which can be implemented using `BoundingBox.from_txt`
- `Annotation.from_yolov5_result`, which can also be implemented using `Annotation.from_txt`
- `AnnotationSet.from_yolov5_result`, which can be implemented in the same way

Implementing the whole stack allows more flexibility.
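A self-contained sketch of that layering (plain dataclasses stand in for the real globox classes, since their internal `from_txt` signatures aren't shown in this thread; in an actual PR each layer would delegate to the corresponding globox `from_txt` helper as suggested above):

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    label: str
    x: float  # center x, absolute pixels
    y: float  # center y, absolute pixels
    w: float
    h: float
    confidence: float

    @classmethod
    def from_yolov5_row(cls, row: "list[float]", names: "dict[int, str]") -> "Box":
        # One row of Detections.xywh: x_center, y_center, w, h, conf, class_id.
        x, y, w, h, conf, cls_id = row
        return cls(names[int(cls_id)], x, y, w, h, conf)


@dataclass
class ImageAnnotation:
    image_id: str
    boxes: "list[Box]" = field(default_factory=list)

    @classmethod
    def from_yolov5_result(cls, image_id, rows, names) -> "ImageAnnotation":
        return cls(image_id, [Box.from_yolov5_row(r, names) for r in rows])


@dataclass
class AnnotationCollection:
    annotations: "list[ImageAnnotation]" = field(default_factory=list)

    @classmethod
    def from_yolov5_results(cls, results) -> "AnnotationCollection":
        # `results` is a yolov5 hub Detections object (.files, .xywh, .names).
        return cls([
            ImageAnnotation.from_yolov5_result(f, b.tolist(), results.names)
            for f, b in zip(results.files, results.xywh)
        ])
```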
> `from_yolov7_result`

Guess this is a typo, but otherwise understood. Thanks, I'll open a PR when ready. Do you have any tests specific to these that I should be aware of?
Yes, it's a typo, though I think that YOLOv5 and YOLOv7 use the same format.
If you want to add some tests, you can also submit a PR to globox_test_data to add some YOLOv5 result annotations that will be used in the tests. In this data repo, you can put the detection files in the `dets` subfolder. Please note that the tests check whether all the detections in that folder are equivalent (same images, same number of boxes, etc.); thus, you should add predictions that match the ones already present in that folder.
The simplest way to achieve that is to also create methods to export annotations to your new format (`BoundingBox.to_yolo_v5` and so on). You can then convert an existing dataset (for instance `dets/abs_ltrb`) to your new format and manually check that it is correct.
You can then implement parsing and conversion tests, respectively in `tests/test_parsing` and `tests/test_conversion`. This should be quite straightforward: just add the relevant dataset path in `constants.py` and then add the detections to the `dets_sets` variable in both tests (a sketch follows).
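A sketch of that wiring, using the file and variable names mentioned above (the folder name and the existing contents of `constants.py` are assumptions about the globox_test_data layout):

```python
from pathlib import Path

from globox import AnnotationSet, BoxFormat

# In constants.py: point at the new detection folder in globox_test_data.
data_path = Path("globox_test_data")  # assumed checkout location
yolov5_dets_path = data_path / "dets" / "yolov5"  # hypothetical subfolder name

# In tests/test_parsing and tests/test_conversion: add the parsed set.
dets_sets = [
    # ... the existing detection sets (e.g. the dets/abs_ltrb one) ...
    AnnotationSet.from_txt(
        folder=yolov5_dets_path,
        box_format=BoxFormat.XYWH,
        relative=False,
        conf_last=True,
    ),
]
```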
Thanks for the info on tests. I'll take a closer look at that sometime next week.
Just had a quick look through `BoundingBox` with regard to:

> `BoundingBox.from_yolov5_result`, which can be implemented using `BoundingBox.from_txt`

Do you want me to avoid using `BoundingBox.from_yolo` and instead create a `BoundingBox.from_yolo_result` for other potential future result parsers, following your current pattern? Or implement `BoundingBox.from_txt` directly in `BoundingBox.from_yolov5_result`?
The reason I ask is that `from_yolo` uses `relative=True`, which I can satisfy with the yolov5 `Detections.xywhn` object. Parsing the needed `image_size` is also possible, but manipulating the tuple from `Detections.ims[0].shape` would be less straightforward than just using `relative=False` and ignoring `image_size` (see the sketch below).
Sorry for the delay. You can implement whichever you think is more convenient. I personally think that implementing everything from `.from_txt` is the most straightforward way of doing it.
**Describe the bug**

Keeping this definition here, as I find the word "relative" ambiguous. As written in the source here, yolov5 models appear to return absolute values, whereas datasets exported in the yolov5-pytorch format from Roboflow are given in relative coords.

`AnnotationSet.from_yolo_v5` should specify, or at least allow specifying, the default `relative=False`; otherwise it will wrongly be set to `True` when run on yolov5 inference detections, though correctly when run on ground-truth datasets exported by Roboflow in the yolov5-pytorch format.

**To Reproduce**
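A minimal sketch of the failure mode, assembled from the discussion above (paths are placeholders; the detection files were written from yolov5 inference output, i.e. with absolute coordinates):

```python
from globox import AnnotationSet

dets = AnnotationSet.from_yolo_v5(
    folder="./detections",
    image_folder="./dataset/test",
)
# from_yolo_v5 assumes relative coordinates internally, so the absolute
# yolov5 outputs are misinterpreted and the resulting boxes are wrong.
```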
**Expected behavior**

I expected the default value for `relative` to be correct both for detections from yolo_v5 and for datasets downloaded from places like Roboflow, such that coords are converted to absolute ones only when necessary.

**Environment (please complete the following information):**

- Ubuntu 22.04
- python 3.10.12
- globox 2.4.6
- yolov5 (repo on master, currently at hash a3555241)
**Additional context**

As per this discussion, which goes into great detail on how I determined this to be the case.