Closed billalkuet07 closed 4 months ago
@billalkuet07 hello,
Great question! In YOLOv5, the ground truth values from the .txt files are transformed before they are used in training. These transformations include scaling and normalization to match the model's input size, which is why you see different values in targets.
Here's a breakdown:
This processing is essential for adapting the various image sizes and annotations to a standard format suitable for efficient training of the neural network. Hope this clears things up! If you need more detailed insights, feel free to look into the data preprocessing steps in the code or visit our documentation at https://docs.ultralytics.com/yolov5/.
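To make the scaling step concrete, here is a minimal sketch of how a normalized label can change when an image is resized and padded to a square input. It is not the YOLOv5 implementation: the function names below are hypothetical, the 1280x720 image and the padding values are assumed for illustration, and random augmentation (which shifts the values further) is left out.

```python
# Illustrative sketch only; function names are hypothetical, not the YOLOv5 API.

def xywhn_to_xyxy(box, w, h, padw=0.0, padh=0.0):
    """Normalized (xc, yc, bw, bh) -> pixel (x1, y1, x2, y2), shifted by padding."""
    xc, yc, bw, bh = box
    return (
        w * (xc - bw / 2) + padw,
        h * (yc - bh / 2) + padh,
        w * (xc + bw / 2) + padw,
        h * (yc + bh / 2) + padh,
    )


def xyxy_to_xywhn(box, w, h):
    """Pixel (x1, y1, x2, y2) -> normalized (xc, yc, bw, bh) for a w x h canvas."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2 / w, (y1 + y2) / 2 / h, (x2 - x1) / w, (y2 - y1) / h)


# Assumed example: a 1280x720 image scaled by 0.5 to 640x360, then padded
# with 140 px on top and bottom to reach a 640x640 model input.
label_from_txt = (0.5, 0.5, 0.2, 0.4)
pixel_box = xywhn_to_xyxy(label_from_txt, w=640, h=360, padw=0, padh=140)
label_after = xyxy_to_xywhn(pixel_box, w=640, h=640)

print(label_from_txt)  # values as stored in the .txt file
print(label_after)     # same box, re-normalized against the padded input
```

In this example the box height shrinks from 0.4 to about 0.225 because it is now measured against the padded 640-pixel canvas instead of the 360-pixel resized image. For an image that is already 640x640 this step leaves the label unchanged, so any remaining difference would come from augmentation.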
Best regards!
Thank you @glenn-jocher for your clarification. That answers my questions. However, could you please provide the following additional information:
My input image is 640x640 and the labels inside the .txt file are already normalized (with respect to the height and width of the image). That suggests dataset[i] should match the .txt file. Is this right?
Could you please name the Python methods (for example, method xy inside z.py) that are used to transform dataset[i] and targets?
Thanks again for your time and consideration.
Hello @billalkuet07,
I'm glad you found the initial explanations helpful! To address your additional queries:
Even though your labels from the .txt files are normalized, dataset[i] in YOLOv5 not only provides normalized labels but may also involve additional processing steps such as augmentation (e.g., flipping, color adjustment), depending on the training configuration.
Specific transformations occur through methods defined predominantly in datasets.py. The conversion of these normalized labels into a format suitable for training (like targets) typically happens in components like the collate_fn used in data loaders.
Feel free to dive deeper into datasets.py for more on how YOLOv5 handles and transforms data for training!
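As a rough illustration of the collate step mentioned above, the sketch below merges per-image labels into one batch-level list and writes the image's position in the batch into the first column. It uses plain Python lists rather than tensors, the name collate_sketch is hypothetical, and the row layout [image_index, class, x, y, w, h] is an assumption about how targets is organized, so treat it as a simplified model, not the actual collate_fn.

```python
# Simplified model of a detection collate step; not the YOLOv5 collate_fn.

def collate_sketch(batch):
    """Merge (image, labels) pairs into a batch.

    Each label row is assumed to be [placeholder, cls, x, y, w, h]; the
    placeholder in column 0 is overwritten with the image's batch index.
    """
    images, targets = [], []
    for i, (image, labels) in enumerate(batch):
        images.append(image)
        for row in labels:
            targets.append([i] + list(row[1:]))
    return images, targets


# Two hypothetical samples: the first has one box, the second has two.
batch = [
    ("img_a", [[0, 2, 0.50, 0.50, 0.20, 0.40]]),
    ("img_b", [[0, 0, 0.25, 0.30, 0.10, 0.10], [0, 1, 0.70, 0.60, 0.30, 0.20]]),
]
images, targets = collate_sketch(batch)

for row in targets:
    print(row)
```

Under this assumption, targets has one more column than a row of the .txt file and contains the boxes of every image in the batch, which is one reason it cannot be compared line by line with a single label file.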
Best wishes!
Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.
For additional resources and information, please see the links below:
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLO and Vision AI
Question
Hello,
I am trying to understand the YOLOv5 code, specifically how detection is done and how the loss is calculated at line 383 of train.py. I have noticed that the xywh values of the variable 'targets' at line 383 of train.py differ from the ground truth in the .txt file. The variable dataset.labels[i] from line 254 of train.py matches the values in the .txt file. However, the values of dataset[i], targets, and the .txt file are completely different. I have gone through 'create_dataloader', but it did not help. Maybe I am missing something. Is any transformation applied? Could you please briefly explain how the values of the .txt file, dataset[i], dataset.labels[i], and targets are related?
Thank you in advance.