ultralytics / ultralytics

NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0
28.77k stars 5.71k forks

YOLOv8 POSE TRAIN CODE IS USING VAL DATASET NOT THE TRAIN DATASET #9532

Closed benicio22 closed 4 months ago

benicio22 commented 5 months ago

Search before asking

Question

Hi, I am training a YOLOv8 pose model, but the training code appears to use the val dataset to train my model instead of the train dataset. The YAML is correct and points to the right location for the training dataset, but I still need help with this. I have 400 images for train and 100 for val. The code is running in PyCharm.

Using this code here:

from ultralytics import YOLO

# Load a pose model
model = YOLO('yolov8n-pose.pt')  # load a pre-trained model (recommended for training)

# Train the model
results = model.train(data='C:/Users/idigi/Box/UIUC_ACADEMIC/PHD_THESIS/PROJECTS/GAIT_CATTLE/CODES/POSE_MODELV8/dataset/dataset.yaml', epochs=20)

# Validate the model
metrics = model.val()  # no arguments needed, dataset and settings remembered

and my YAML file looks like this:

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: C:/Users/idigi/Box/UIUC_ACADEMIC/PHD_THESIS/PROJECTS/GAIT_CATTLE/CODES/POSE_MODELV8/dataset # dataset root dir
train: C:/Users/idigi/Box/UIUC_ACADEMIC/PHD_THESIS/PROJECTS/GAIT_CATTLE/CODES/POSE_MODELV8/dataset/train
val: C:/Users/idigi/Box/UIUC_ACADEMIC/PHD_THESIS/PROJECTS/GAIT_CATTLE/CODES/POSE_MODELV8/dataset/val

# Keypoints
kpt_shape: [16, 2] # number of keypoints, number of dims (2 for x,y or 3 for x,y,visible)
flip_idx: [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]

# Classes dictionary
names:
  0: cattle_keypoint
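The kpt_shape and flip_idx entries in the YAML above describe the same set of keypoints, so they can be cross-checked before training. The following is a minimal sketch, assuming flip_idx is meant to hold exactly one mirrored index per keypoint, each in the range 0 to kpt_shape[0] - 1. The function name check_flip_idx is hypothetical and is not part of Ultralytics.

```python
def check_flip_idx(kpt_shape, flip_idx):
    """Return a list of problems found; an empty list means the two entries agree."""
    num_kpts = kpt_shape[0]
    problems = []

    # One mirrored index is expected per keypoint (assumption of this sketch).
    if len(flip_idx) != num_kpts:
        problems.append(
            f"flip_idx has {len(flip_idx)} entries but kpt_shape[0] is {num_kpts}"
        )

    # Every index must point at an existing keypoint.
    out_of_range = [i for i in flip_idx if not 0 <= i < num_kpts]
    if out_of_range:
        problems.append(f"indices outside 0..{num_kpts - 1}: {out_of_range}")

    # A keypoint should not be listed twice.
    if len(set(flip_idx)) != len(flip_idx):
        problems.append("flip_idx contains duplicate indices")

    return problems
```

Calling check_flip_idx([16, 2], [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]) with the values as posted reports a length mismatch (17 entries for 16 keypoints) and one out-of-range index (16).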

thank you,

Additional

No response

github-actions[bot] commented 5 months ago

👋 Hello @benicio22, thank you for your interest in Ultralytics YOLOv8 🚀! We recommend a visit to the Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Join the vibrant Ultralytics Discord 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.

Install

Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.

pip install ultralytics

Environments

YOLOv8 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

Ultralytics CI

If this badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

glenn-jocher commented 5 months ago

@benicio22 hi there! 👋 It seems there might be some confusion about how the data is being loaded for training. Your code and YAML configurations look good at first glance. One possible reason for this issue could be related to how the paths are defined or accessed in the training process.

Could you double-check the data argument path in your script to ensure it matches exactly with the dataset.yaml path? Also, ensure that there's no caching or overriding of paths that might cause the training to pick the validation dataset instead.

A quick tip is to add some logging or print statements in your training script to confirm which datasets are being loaded for training and validation. This can help diagnose if the correct paths are being utilized.
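One way to act on this tip is to count the images in each split folder before training and compare the numbers with what the training log reports. This is a minimal sketch using only the standard library. The function names count_images and report_splits and the list of image extensions are assumptions made for illustration, not part of Ultralytics.

```python
from pathlib import Path

# Extensions treated as images here; an assumption for this sketch,
# not the full list the library accepts.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images(folder):
    """Count image files under `folder`, searching subfolders as well."""
    root = Path(folder)
    if not root.is_dir():
        raise FileNotFoundError(f"Not a directory: {root}")
    return sum(
        1
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def report_splits(splits):
    """Print and return the image count for each named split."""
    counts = {name: count_images(folder) for name, folder in splits.items()}
    for name, n in counts.items():
        print(f"{name}: {n} images in {splits[name]}")
    return counts
```

For example, report_splits({"train": "path/to/dataset/train", "val": "path/to/dataset/val"}) prints one line per split. The counts can then be compared with the "train: Scanning ..." and "val: Scanning ..." lines that Ultralytics prints at the start of training.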

If after checking these, the issue still persists, it might be helpful to share additional details like any console output or errors during the training process. This way, we can provide more targeted assistance.

Keep us posted!

benicio22 commented 5 months ago

Hi, I did what you said. I checked the dataset from test and train; the two folders contain different images. I added the folder that contains my dataset to PyCharm. This is the output I get when I train my model (I chose just one epoch to check whether it was using the 400 images and not the 100 images):

C:\Users\idigi\Documents\EMBRAPA_PROJECT_2023\venv\Scripts\python.exe C:\Users\idigi\Documents\EMBRAPA_PROJECT_2023\GAIT_CATTLE_2024\POSE_YOLOV8_TRAIN.py
Data Argument Path: C:/Users/idigi/Box/UIUC_ACADEMIC/PHD_THESIS/PROJECTS/GAIT_CATTLE/CODES/POSE_MODELV8/dataset/dataset.yaml
Training with dataset: C:/Users/idigi/Box/UIUC_ACADEMIC/PHD_THESIS/PROJECTS/GAIT_CATTLE/CODES/POSE_MODELV8/dataset/dataset.yaml
New https://pypi.org/project/ultralytics/8.1.43 available 😃 Update with 'pip install -U ultralytics'
Ultralytics YOLOv8.1.42 🚀 Python-3.11.3 torch-2.0.1+cpu CPU (12th Gen Intel Core(TM) i7-1255U)
engine\trainer: task=pose, mode=train, model=yolov8n-pose.pt, data=C:/Users/idigi/Box/UIUC_ACADEMIC/PHD_THESIS/PROJECTS/GAIT_CATTLE/CODES/POSE_MODELV8/dataset/dataset.yaml, epochs=1, time=None, patience=100, batch=16, imgsz=640, save=True, save_period=-1, cache=False, device=None, workers=8, project=None, name=train2, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, stream_buffer=False, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, embed=None, show=False, save_frames=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True, show_boxes=True, line_width=None, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=None, workspace=4, nms=False, lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=7.5, cls=0.5, dfl=1.5, pose=12.0, kobj=1.0, label_smoothing=0.0, nbs=64, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, bgr=0.0, mosaic=1.0, mixup=0.0, copy_paste=0.0, auto_augment=randaugment, erasing=0.4, crop_fraction=1.0, cfg=None, tracker=botsort.yaml, save_dir=runs\pose\train2
Overriding model.yaml kpt_shape=[17, 3] with kpt_shape=[16, 2]

               from  n    params  module                                       arguments                     

0 -1 1 464 ultralytics.nn.modules.conv.Conv [3, 16, 3, 2]
1 -1 1 4672 ultralytics.nn.modules.conv.Conv [16, 32, 3, 2]
2 -1 1 7360 ultralytics.nn.modules.block.C2f [32, 32, 1, True]
3 -1 1 18560 ultralytics.nn.modules.conv.Conv [32, 64, 3, 2]
4 -1 2 49664 ultralytics.nn.modules.block.C2f [64, 64, 2, True]
5 -1 1 73984 ultralytics.nn.modules.conv.Conv [64, 128, 3, 2]
6 -1 2 197632 ultralytics.nn.modules.block.C2f [128, 128, 2, True]
7 -1 1 295424 ultralytics.nn.modules.conv.Conv [128, 256, 3, 2]
8 -1 1 460288 ultralytics.nn.modules.block.C2f [256, 256, 1, True]
9 -1 1 164608 ultralytics.nn.modules.block.SPPF [256, 256, 5]
10 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
11 [-1, 6] 1 0 ultralytics.nn.modules.conv.Concat [1]
12 -1 1 148224 ultralytics.nn.modules.block.C2f [384, 128, 1]
13 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
14 [-1, 4] 1 0 ultralytics.nn.modules.conv.Concat [1]
15 -1 1 37248 ultralytics.nn.modules.block.C2f [192, 64, 1]
16 -1 1 36992 ultralytics.nn.modules.conv.Conv [64, 64, 3, 2]
17 [-1, 12] 1 0 ultralytics.nn.modules.conv.Concat [1]
18 -1 1 123648 ultralytics.nn.modules.block.C2f [192, 128, 1]
19 -1 1 147712 ultralytics.nn.modules.conv.Conv [128, 128, 3, 2]
20 [-1, 9] 1 0 ultralytics.nn.modules.conv.Concat [1]
21 -1 1 493056 ultralytics.nn.modules.block.C2f [384, 256, 1]
22 [15, 18, 21] 1 911731 ultralytics.nn.modules.head.Pose [1, [16, 2], [64, 128, 256]]
YOLOv8n-pose summary: 250 layers, 3171267 parameters, 3171251 gradients, 8.8 GFLOPs

Transferred 361/397 items from pretrained weights
TensorBoard: Start with 'tensorboard --logdir runs\pose\train2', view at http://localhost:6006/
Freezing layer 'model.22.dfl.conv.weight'
WARNING ⚠️ No 'flip_idx' array defined in data.yaml, setting augmentation 'fliplr=0.0'
train: Scanning C:\Users\idigi\Box\UIUC_ACADEMIC\PHD_THESIS\PROJECTS\GAIT_CATTLE\CODES\POSE_MODELV8\dataset\labels\train.cache... 391 images, 0 backgrounds, 0 corrupt: 100%|██████████| 391/391 [00:00<?, ?it/s]
val: Scanning C:\Users\idigi\Box\UIUC_ACADEMIC\PHD_THESIS\PROJECTS\GAIT_CATTLE\CODES\POSE_MODELV8\dataset\labels\val.cache... 100 images, 2 backgrounds, 0 corrupt: 100%|██████████| 100/100 [00:00<?, ?it/s]
Plotting labels to runs\pose\train2\labels.jpg...
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically...
optimizer: AdamW(lr=0.002, momentum=0.9) with parameter groups 63 weight(decay=0.0), 73 weight(decay=0.0005), 72 bias(decay=0.0)
TensorBoard: model graph visualization added ✅
0%| | 0/25 [00:00<?, ?it/s]
Image sizes 640 train, 640 val
Using 0 dataloader workers
Logging results to runs\pose\train2
Starting training for 1 epochs...

  Epoch    GPU_mem   box_loss  pose_loss  kobj_loss   cls_loss   dfl_loss  Instances       Size
    1/1         0G     0.7977      10.33          0     0.9925       1.04         12        640: 100%|██████████| 25/25 [02:15<00:00,  5.42s/it]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Pose(P          R      mAP50  mAP50-95): 100%|██████████| 4/4 [00:09<00:00,  2.37s/it]
               all        100         98      0.977          1      0.981      0.882          0          0          0          0

1 epochs completed in 0.042 hours.
Optimizer stripped from runs\pose\train2\weights\last.pt, 6.6MB
Optimizer stripped from runs\pose\train2\weights\best.pt, 6.6MB

Validating runs\pose\train2\weights\best.pt...
Ultralytics YOLOv8.1.42 🚀 Python-3.11.3 torch-2.0.1+cpu CPU (12th Gen Intel Core(TM) i7-1255U)
YOLOv8n-pose summary (fused): 187 layers, 3165875 parameters, 0 gradients, 8.7 GFLOPs
Class Images Instances Box(P R mAP50 mAP50-95) Pose(P R mAP50 mAP50-95): 100%|██████████| 4/4 [00:08<00:00, 2.14s/it]
all 100 98 0.977 1 0.981 0.882 0 0 0 0
Speed: 0.6ms preprocess, 56.9ms inference, 0.0ms loss, 0.8ms postprocess per image
Results saved to runs\pose\train2

Process finished with exit code 0

glenn-jocher commented 5 months ago

Hey @benicio22! 😄 From the logs you've shared, it looks like your model is indeed training on the correct datasets with 391 images for training and 100 for validation, as expected. The train: and val: scans in the log confirm the datasets are loaded properly, and the paths are correct.

For the specific concern about training on the validation dataset, the output indicates that your model is training on the training dataset and using the validation dataset as intended. The validation process, shown as "Validating runs\pose\train2\weights\best.pt...", runs after the training epoch, which is normal and expected behavior.

If you are observing unexpected results or behaviors not aligning with this, double-checking the dataset content and the model's output during training and validation could be helpful. Sometimes visualizing some predictions can give insights into what the model is learning. Keep experimenting, and feel free to reach out if you have more questions or updates!

benicio22 commented 5 months ago

Thanks for the answer. So why does the confusion matrix in my train folder show the total number of val images and not the train images? And why is it not saving a val folder with the outputs? What confused me is that, while the model is running, each epoch shows 100 images (the val images). I remember that my model used to save the train confusion matrix (but to be honest, I last used YOLOv8 last year). Thank you

glenn-jocher commented 5 months ago

@benicio22 hey there! 😊

The confusion matrix and your validation (val) image count appearing during training are entirely normal and part of YOLOv8's process. When training, YOLOv8 evaluates on the validation dataset after each epoch to measure model performance. Hence the mention of 100 images (your val set) during the epochs.

The confusion matrix is generated using the validation dataset for a more accurate representation of how the model might perform on unseen data. It's expected behavior for its totals not to match the train dataset size.
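If a confusion matrix over the training images is wanted as well, a separate validation run can be pointed at the train split. This is a minimal sketch, assuming the `split` argument of `model.val()` accepts 'train', 'val' or 'test', in line with the `split=val` entry in the training log above. The helper names split_overrides and evaluate_split are hypothetical.

```python
VALID_SPLITS = ("train", "val", "test")


def split_overrides(split="val"):
    """Build the keyword arguments that select which split model.val() evaluates."""
    if split not in VALID_SPLITS:
        raise ValueError(f"split must be one of {VALID_SPLITS}, got {split!r}")
    return {"split": split}


def evaluate_split(weights, data_yaml, split="val"):
    """Run validation on the chosen split and return the metrics object."""
    # Imported here so split_overrides() can be used without the package installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.val(data=data_yaml, **split_overrides(split))
```

A call such as evaluate_split('runs/pose/train2/weights/best.pt', 'path/to/dataset.yaml', split='train') would then evaluate on the training images. Metrics measured on data the model was trained on tend to look optimistic, so the val-split numbers remain the ones to report.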

Regarding the saving of outputs in the val folder, YOLOv8's default behavior has evolved to focus on a more streamlined training and validation process. If you need to save validation outputs specifically, you might want to check the available arguments/options or any custom scripts you used previously.

For saving outputs explicitly, here's a quick reference:

model.val(save_json=True, save_dir='your_desired_path_for_outputs')

Replace 'your_desired_path_for_outputs' with the actual path where you'd like to save those outputs.
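Another option is to set the output folder through `project` and `name`, the two settings that appear in the training log above as project=None, name=train2 and resolve to save_dir=runs\pose\train2. This is a minimal sketch under that assumption. The helper names output_overrides, expected_output_dir and validate_and_save are hypothetical.

```python
from pathlib import Path


def output_overrides(project, name):
    """Keyword arguments asking Ultralytics to write results under project/name."""
    return {"project": str(project), "name": name, "save_json": True, "plots": True}


def expected_output_dir(project, name):
    """Folder the run is expected to use, assuming the project/name layout."""
    return Path(project) / name


def validate_and_save(weights, data_yaml, project, name):
    """Validate a trained model and return the metrics object."""
    # Imported here so the helpers above work without the package installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.val(data=data_yaml, **output_overrides(project, name))
```

With project='runs/pose' and name='val_check', the plots and JSON file would be expected under runs/pose/val_check. If that folder already exists, Ultralytics may append a number to the name, as it did for train2 in the log above.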

I hope this clears things up! Let me know if you have more questions or if there's anything else you're curious about. Happy modeling! 🚀

benicio22 commented 5 months ago

Oh Gotcha! Thank you so much.

glenn-jocher commented 5 months ago

@benicio22 you're welcome! If you have any more questions or run into any issues, feel free to reach out. Happy coding! 😊

github-actions[bot] commented 4 months ago

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please see the links below:

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐