utterances-bot commented 1 year ago

Christian Mills - Training YOLOX Models for Real-Time Object Detection in Pytorch

Learn how to train YOLOX models for real-time object detection in PyTorch by creating a hand gesture detection model.

https://christianjmills.com/posts/pytorch-train-object-detector-yolox-tutorial/

aforadil commented 1 year ago

Hi. I am new to machine learning and am following your tutorials for a project. I just wanted to confirm that I can export the model to ONNX by following the next tutorial and then use it in Unity, either with the Barracuda packages for YOLOX that you have developed or by following the tutorial (Real-Time Object Detection in Unity With ONNX and DirectML Pt. 1). Am I right?

cj-mills commented 1 year ago

Hi @aforadil,

That's correct. You can export the model with the code in the follow-up tutorial. Use the specified settings for Barracuda in this section if you plan to use the model with either the ONNX-DirectML tutorial or the Barracuda YOLOX package.

You can swap the model and colormap file in the Barracuda YOLOX demo project without any additional changes.

The code in the ONNX-DirectML tutorial applies normalization to the model input, so you'll need to remove the normalization steps (i.e., the subtraction and division operations) from the PerformInference function in the dllmain.cpp file.

Also, use the TensorFlow.js plugin if you need to run the model in a browser. The model runs with Barracuda in WebGL builds, but the inference speed is slower and has weird glitches when changing input dimensions.

aforadil commented 1 year ago

Thanks for a quick response @cj-mills.

aforadil commented 1 year ago

Hi @cj-mills, I was following your tutorial. The code runs fine in Colab. When I tried running it in a local Python environment with Mamba, the following error comes up in the prompt when the "Train the Model" cell is executed. I have tried executing all the cells in sequence, but the error persists. It looks like an issue with multiprocessing. It would be great if you could help out with this. Thanks.

Error:

[I 17:51:29.357 NotebookApp] Traceback (most recent call last):
  File "", line 1, in
  File "C:\Users\Madil2\mambaforge\envs\pytorch-env\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\Users\Madil2\mambaforge\envs\pytorch-env\lib\multiprocessing\spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
AttributeError: Can't get attribute 'HagridDataset' on <module '__main__' (built-in)>

cj-mills commented 1 year ago

Hi @aforadil,

Python Multiprocessing on Windows is slightly weird and even more so with Jupyter notebooks.

The fix for the AttributeError is to place the HagridDataset class in a separate Python file and import it. I just added a version of the training notebook for Windows to the GitHub repository.

pytorch-yolox-object-detector-training-windows.ipynb

I'll add a note to the tutorial to use that one when running on Windows.

Edit: Also download the windows_utils.py file and place it in the same folder as the notebook.

notebooks/windows_utils.py
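
To illustrate why moving the class into its own file helps: pickle stores a class instance as a reference to a module name and a class name, not as the class code itself, so a freshly spawned worker process has to be able to import that module. The sketch below uses only the standard library. ToyDataset and toy_dataset_utils.py are hypothetical stand-ins for HagridDataset and windows_utils.py, not the tutorial's actual code.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

# Source for a small importable module (hypothetical stand-in for windows_utils.py).
module_source = '''
class ToyDataset:
    """Hypothetical stand-in for a dataset class such as HagridDataset."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]
'''

# Write the module to a folder and make that folder importable.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "toy_dataset_utils.py").write_text(module_source)
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()
toy_dataset_utils = importlib.import_module("toy_dataset_utils")

# Pickle an instance, as multiprocessing does when it spawns a worker.
dataset = toy_dataset_utils.ToyDataset(["a", "b", "c"])
payload = pickle.dumps(dataset)

# The payload names the module, so any process that can import
# toy_dataset_utils can rebuild the object.
pickled_by_reference = b"toy_dataset_utils" in payload
restored = pickle.loads(payload)
```

A class defined in a notebook cell lives in the notebook's `__main__` module, which a spawned worker cannot import, and that is the lookup that fails in the AttributeError above.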

aforadil commented 1 year ago

Great. Thanks.

Xantango commented 1 year ago

Hi... can't I just use the standard COCO annotation JSON file to train my custom model? My dataset has images with multiple classes, and I noticed that yours points each JSON file to a class folder. Thanks in advance :)

Xantango commented 1 year ago

By the way, following up on my previous question, I already have a custom model trained with YOLOv8. However, when I tried to use it with Unity (after exporting it to ONNX), it gave me a Reshape issue, but when I used your model, it worked perfectly. That's why I want to train on my custom data with YOLOX. I mention this because you might have an answer for my main issue :)

cj-mills commented 1 year ago

Hi @Xantango,

For your first question, you can use whichever annotation format you wish, but you'll need to update the relevant code in the notebook for that specific format. The method to read the annotation data used in the tutorial is simply what the HaGRID dataset requires.

I have a notebook in this tutorial's GitHub Repository for working with the COCO dataset. I need to clean it up slightly, but the code is functional.

pytorch-yolox-object-detector-training-coco.ipynb

I am debating whether the COCO notebook warrants a dedicated post to cover the changes in how to read the annotation data.

As for your second question, I assume you are trying to use the model with Unity's Barracuda inference library. Barracuda has limited ONNX operator support, and part of the exported YOLOv8 model likely has unsupported operators. That is why I have different export settings for Barracuda in the follow-up post.

Sentis, the replacement for Barracuda, should have better operator support. Unfortunately, Sentis is only available through a closed beta program, and I only got access this week. I plan to port my unity-barracuda-inference-yolox package to Sentis to compare against Barracuda, but I have not had a chance to do so yet.

Xantango commented 1 year ago

Hi @cj-mills, Very informative. Thank you sooo much.

Xantango commented 1 year ago

Hi @cj-mills, this tutorial's code and pytorch-yolox-object-detector-training-coco.ipynb run perfectly fine on Colab. However, when I ran them in Jupyter, including pytorch-yolox-object-detector-training-windows.ipynb, the training section keeps running forever, with neither errors nor progress. Any idea?

Xantango commented 1 year ago

Uhhh, it was the attribute issue

cj-mills commented 1 year ago

@Xantango Were you able to resolve the issue you encountered? I don't have a version of the COCO notebook prepared for Windows, so you would need to make the same changes as in the notebook for the HaGRID dataset and place the windows_utils.py file in the same folder.

Xantango commented 1 year ago

Hi @cj-mills, yes, I have combined parts from this tutorial, the COCO version, and the Windows version to suit what I needed, and it is working perfectly fine. Thank you soooo much for the great, well-explained tutorial.

Zafoue commented 11 months ago

Hello, after running the program in Jupyter, I get this issue when running the following cell:

dataset_sample = train_dataset[0]

annotated_tensor = draw_bboxes(
    image=(denorm_img_tensor(dataset_sample[0], norm_stats)*255).to(dtype=torch.uint8),
    boxes=dataset_sample[1]['boxes'],
    labels=[class_names[int(i.item())] for i in dataset_sample[1]['labels']],
    colors=[int_colors[int(i.item())] for i in dataset_sample[1]['labels']]
)

tensor_to_pil(annotated_tensor)

NameError                                 Traceback (most recent call last)
Cell In[32], line 1
----> 1 dataset_sample = train_dataset[0]
      3 annotated_tensor = draw_bboxes(
      4     image=(denorm_img_tensor(dataset_sample[0], norm_stats)*255).to(dtype=torch.uint8),
      5     boxes=dataset_sample[1]['boxes'],
      6     labels=[class_names[int(i.item())] for i in dataset_sample[1]['labels']],
      7     colors=[int_colors[int(i.item())] for i in dataset_sample[1]['labels']]
      8 )
     10 tensor_to_pil(annotated_tensor)

File ~\YOLOX\windows_utils.py:61, in HagridDataset.__getitem__(self, index)
     59 annotation = self._annotation_df.loc[img_key]
     60 # Load the image and its target (bounding boxes and labels)
---> 61 image, target = self._load_image_and_target(annotation)
     63 # Apply the transformations, if any
     64 if self._transforms:

File ~\YOLOX\windows_utils.py:89, in HagridDataset._load_image_and_target(self, annotation)
     87 bbox_tensor = torchvision.ops.box_convert(torch.Tensor(bbox_list), 'xywh', 'xyxy')
     88 # Create a BoundingBox object with the bounding boxes
---> 89 boxes = BoundingBox(bbox_tensor, format='xyxy', canvas_size=image.size[::-1])
     90 # Convert the class labels to indices
     91 labels = torch.Tensor([self._class_to_idx[label] for label in annotation.labels])

NameError: name 'BoundingBox' is not defined

cj-mills commented 11 months ago

Hi @Zafoue,

Sorry about that. I must have forgotten to push the updated windows_utils.py file for torchvision 0.16+ to GitHub. I have done so now, and you can download it from the link below:

windows_utils.py

Zafoue commented 11 months ago

Hi @cj-mills

I managed to get it to work on Jupyter .

Do you have a tutorial for creating a custom dataset?

cj-mills commented 11 months ago

Hi @Zafoue,

I have not made a tutorial for creating custom datasets yet. I've received several questions about this, so I'll try to make time for one.

If you need something now, there are free annotation tools like CVAT and automated annotation methods, as shown in the following videos:

Zafoue commented 11 months ago

Hi @cj-mills ,

I use RoboFlow for labeling images, but my question is, in which format do I need to export? I looked at your dataset "hagrid-sample-30k-384p," and I noticed there's one labeling file for a set of images. I've tried the yolov8 format and some others, but I always end up with one labeling file per image. As a result, I have multiple labeling files, which doesn't match the format you use in the scripts.

Could you advise me on the correct export format or guide me on how to get a single labeling file for a set of images?

Thank you very much!

ryan-michaud commented 9 months ago

Christian, what a beast of a tutorial! I would like to reproduce this on my own data. How were you able to become so proficient at this ? Would you be able to recommend some learning resources for a python noob?

My background is electrical engineering. Decent proficiency in Matlab, not much Python.

I am thinking :

Leveling up Python skills with some general online course, then
Doing a course such as Fast.ai, or Deep Learning on Coursera?

There is just so much learning material out there, I'm not sure where to start and how to be efficient about it. My immediate interest is in fine tuning object detection models on my own data (as you have done here). If you have any suggestions, it would be greatly appreciated.

cj-mills commented 9 months ago

Hi @ryan-michaud,

The fast.ai courses are my go-to recommendation for getting started with deep learning. If you have some existing coding experience, you can probably make it through without already being proficient in Python.

If you want to build a foundation with Python first, I don't have a default tutorial to recommend (there are so many these days). However, the latest CS50x course is probably a safe bet:

CS50x 2024 - Lecture 6 - Python

As for the fast.ai courses, part 2 is optional but highly recommended. It is valuable even just from a Python development standpoint.

Part 1: Practical Deep Learning for Coders
Part 2: From Deep Learning Foundations to Stable Diffusion

I recommend having a project to work on while going through the fast.ai course. You seem to have an object detection project in mind, so that's good.

The latest iteration of the courses doesn't go in-depth on object detection specifically. However, an earlier version does:

Cutting Edge Deep Learning for Coders 2

As for your project, I made some tutorials recently for working with various bounding box annotations in PyTorch, which you might find applicable:

Torchvision Annotation Tutorials

If you would like to explore the YOLOX model specifically, you can view the code for my pip package in the following GitHub repository:

cjm-yolox-pytorch

Above all, don't feel you need to wait until you think you have enough theoretical knowledge to start working on your project. Learn what you need as you need to.

Afterward, if you want to test your understanding, make a tutorial. Teaching others how to do something does wonders for identifying any relevant gaps in your knowledge. 😉

ryan-michaud commented 9 months ago

@cj-mills Thank you for such a detailed and thoughtful reply. I hope to give back one day as you have ;-)

cj-mills commented 9 months ago

@ryan-michaud You're very welcome! Best of luck with your project!

Boo2z commented 8 months ago

Hi @cj-mills, there is a version of this project for Windows on GitHub called pytorch-yolox-object-detector-training-windows.ipynb. How can I modify it to run on Linux? Do I need to change just this line, "from windows_utils import HagridDataset", or something else? Thank you.

cj-mills commented 8 months ago

Hi @Boo2z,

There is already a Linux version of the notebook available for this tutorial. You can find links for the available notebooks in the Tutorial Code dropdown under the Getting Started with the Code section:

Getting Started with the Code

OlegKochetkov-code commented 6 months ago

Hi @cj-mills. I plan to train the model using this notebook, and then use my trained model in this sentis demo, only roll back the Unity version to 2022, and leave Sentis at version 1.3. After Unity moved to 2022, your project worked well. Should my trained model work without problems? Or theoretically there could be difficulties? Thank you in advance.

cj-mills commented 6 months ago

Hi @OlegKochetkov-git, I believe Sentis 1.3 (and the new 1.4) only officially supports Unity 2023.2 or newer:

Documentation: Install Sentis

hericah commented 2 months ago

Thanks for your great tutorial. May I know what needs to change if we want to use datasets from Roboflow? Should we export as Pascal VOC?

cj-mills commented 2 months ago

Hi @hericah,

If your dataset is currently in roboflow, you can export the annotations to COCO JSON format. You can also use roboflow to convert existing annotations from Pascal VOC to COCO. From there, you can check out how to work with COCO bounding box annotations in the tutorial linked below:

Working with COCO Bounding Box Annotations in Torchvision
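
As a rough illustration of what a COCO JSON export looks like once loaded, here is a minimal standard-library sketch. A single COCO file holds every image and every class, with one annotation entry per box, so grouping by image is the first step before building a dataset class. The sample data and the group_coco_annotations helper are hypothetical, not code from the linked tutorial.

```python
from collections import defaultdict

# A tiny hand-written sample in COCO layout (hypothetical data).
coco = {
    "images": [
        {"id": 1, "file_name": "img_001.jpg", "width": 640, "height": 480},
        {"id": 2, "file_name": "img_002.jpg", "width": 640, "height": 480},
    ],
    "categories": [{"id": 1, "name": "cat"}, {"id": 2, "name": "dog"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [10, 20, 100, 50]},
        {"id": 11, "image_id": 1, "category_id": 2, "bbox": [200, 40, 60, 80]},
        {"id": 12, "image_id": 2, "category_id": 2, "bbox": [5, 5, 30, 30]},
    ],
}


def group_coco_annotations(coco_dict):
    """Group boxes by image file name and convert [x, y, w, h] to [x1, y1, x2, y2]."""
    id_to_name = {cat["id"]: cat["name"] for cat in coco_dict["categories"]}
    id_to_file = {img["id"]: img["file_name"] for img in coco_dict["images"]}
    grouped = defaultdict(list)
    for ann in coco_dict["annotations"]:
        x, y, w, h = ann["bbox"]
        grouped[id_to_file[ann["image_id"]]].append(
            {"label": id_to_name[ann["category_id"]], "xyxy": [x, y, x + w, y + h]}
        )
    return dict(grouped)


grouped = group_coco_annotations(coco)
```

In practice the dictionary would come from `json.load` on the exported file. Images with several classes simply end up with several entries in their list.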

hericah commented 2 months ago

I got this error when running it on M1 Mac

{
    "name": "PicklingError",
    "message": "Can't pickle <function <lambda> at 0x16da18c10>: attribute lookup <lambda> on __main__ failed",
    "stack": "---------------------------------------------------------------------------
PicklingError                             Traceback (most recent call last)
Cell In[39], line 1
----> 1 train_loop(
      2     model=model, 
      3     train_dataloader=train_dataloader, 
      4     valid_dataloader=valid_dataloader, 
      5     optimizer=optimizer, 
      6     loss_func=yolox_loss,  
      7     lr_scheduler=lr_scheduler, 
      8     device=torch.device(device), 
      9     epochs=epochs, 
     10     checkpoint_path=checkpoint_path, 
     11     use_scaler=True)

Cell In[34], line 36, in train_loop(model, train_dataloader, valid_dataloader, optimizer, loss_func, lr_scheduler, device, epochs, checkpoint_path, use_scaler)
     33 # Loop over the epochs
     34 for epoch in tqdm(range(epochs), desc=\"Epochs\"):
     35     # Run a training epoch and get the training loss
---> 36     train_loss = run_epoch(model, train_dataloader, optimizer, lr_scheduler, loss_func, device, scaler, epoch, is_training=True)
     37     # Run an evaluation epoch and get the validation loss
     38     with torch.no_grad():

Cell In[33], line 24, in run_epoch(model, dataloader, optimizer, lr_scheduler, loss_func, device, scaler, epoch_id, is_training)
     21 progress_bar = tqdm(total=len(dataloader), desc=\"Train\" if is_training else \"Eval\")  # Initialize a progress bar
     23 # Loop over the data
---> 24 for batch_id, (inputs, targets) in enumerate(dataloader):
     25     # Move inputs and targets to the specified device
     26     inputs = torch.stack(inputs).to(device)
     27     # Extract the ground truth bounding boxes and labels

File /opt/homebrew/Caskroom/miniconda/base/envs/pytorch-env/lib/python3.10/site-packages/torch/utils/data/dataloader.py:440, in DataLoader.__iter__(self)
    438     return self._iterator
    439 else:
--> 440     return self._get_iterator()

File /opt/homebrew/Caskroom/miniconda/base/envs/pytorch-env/lib/python3.10/site-packages/torch/utils/data/dataloader.py:388, in DataLoader._get_iterator(self)
    386 else:
    387     self.check_worker_number_rationality()
--> 388     return _MultiProcessingDataLoaderIter(self)

File /opt/homebrew/Caskroom/miniconda/base/envs/pytorch-env/lib/python3.10/site-packages/torch/utils/data/dataloader.py:1038, in _MultiProcessingDataLoaderIter.__init__(self, loader)
   1031 w.daemon = True
   1032 # NB: Process.start() actually take some time as it needs to
   1033 #     start a process and pass the arguments over via a pipe.
   1034 #     Therefore, we only add a worker to self._workers list after
   1035 #     it started, so that we do not call .join() if program dies
   1036 #     before it starts, and __del__ tries to join but will get:
   1037 #     AssertionError: can only join a started process.
-> 1038 w.start()
   1039 self._index_queues.append(index_queue)
   1040 self._workers.append(w)

File /opt/homebrew/Caskroom/miniconda/base/envs/pytorch-env/lib/python3.10/multiprocessing/process.py:121, in BaseProcess.start(self)
    118 assert not _current_process._config.get('daemon'), \\
    119        'daemonic processes are not allowed to have children'
    120 _cleanup()
--> 121 self._popen = self._Popen(self)
    122 self._sentinel = self._popen.sentinel
    123 # Avoid a refcycle if the target function holds an indirect
    124 # reference to the process object (see bpo-30775)

File /opt/homebrew/Caskroom/miniconda/base/envs/pytorch-env/lib/python3.10/multiprocessing/context.py:224, in Process._Popen(process_obj)
    222 @staticmethod
    223 def _Popen(process_obj):
--> 224     return _default_context.get_context().Process._Popen(process_obj)

File /opt/homebrew/Caskroom/miniconda/base/envs/pytorch-env/lib/python3.10/multiprocessing/context.py:288, in SpawnProcess._Popen(process_obj)
    285 @staticmethod
    286 def _Popen(process_obj):
    287     from .popen_spawn_posix import Popen
--> 288     return Popen(process_obj)

File /opt/homebrew/Caskroom/miniconda/base/envs/pytorch-env/lib/python3.10/multiprocessing/popen_spawn_posix.py:32, in Popen.__init__(self, process_obj)
     30 def __init__(self, process_obj):
     31     self._fds = []
---> 32     super().__init__(process_obj)

File /opt/homebrew/Caskroom/miniconda/base/envs/pytorch-env/lib/python3.10/multiprocessing/popen_fork.py:19, in Popen.__init__(self, process_obj)
     17 self.returncode = None
     18 self.finalizer = None
---> 19 self._launch(process_obj)

File /opt/homebrew/Caskroom/miniconda/base/envs/pytorch-env/lib/python3.10/multiprocessing/popen_spawn_posix.py:47, in Popen._launch(self, process_obj)
     45 try:
     46     reduction.dump(prep_data, fp)
---> 47     reduction.dump(process_obj, fp)
     48 finally:
     49     set_spawning_popen(None)

File /opt/homebrew/Caskroom/miniconda/base/envs/pytorch-env/lib/python3.10/multiprocessing/reduction.py:60, in dump(obj, file, protocol)
     58 def dump(obj, file, protocol=None):
     59     '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60     ForkingPickler(file, protocol).dump(obj)

PicklingError: Can't pickle <function <lambda> at 0x16da18c10>: attribute lookup <lambda> on __main__ failed"
}

cj-mills commented 2 months ago

Hi @hericah,

The issue stems from the difference between how multiprocessing in Python works on Linux and macOS.

Try using the Windows training notebook and associated utility file.

Alternatively, you can set the value for num_workers in data_loader_params to 0 to disable multiprocessing.
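
For context, the SpawnProcess frame in the traceback shows that Python is using the spawn start method, which pickles everything handed to a worker process. Lambdas cannot be pickled, while named module-level functions can. The standard-library sketch below shows the difference. collate_as_tuple is a hypothetical named replacement for a lambda collate function (the thread does not show which lambda the notebook uses), and the batch_size value is only a placeholder.

```python
import pickle


def collate_as_tuple(batch):
    """Named, module-level function: turn a list of (input, target) pairs into two tuples."""
    return tuple(zip(*batch))


def is_picklable(obj):
    """Return True if pickle can serialize obj, as spawn requires for worker arguments."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# The same logic written as a lambda cannot be pickled.
lambda_collate = lambda batch: tuple(zip(*batch))

lambda_ok = is_picklable(lambda_collate)
named_ok = is_picklable(collate_as_tuple)

# Both callables behave identically on a batch.
sample_batch = [("img0", "target0"), ("img1", "target1")]
collated = collate_as_tuple(sample_batch)

# Option from the reply above: with num_workers set to 0, no worker
# processes start, so nothing needs to be pickled (placeholder values).
data_loader_params = {
    "batch_size": 4,
    "num_workers": 0,
    "collate_fn": collate_as_tuple,
}
```

Setting num_workers to 0 is the quickest fix, at the cost of loading data in the main process. Keeping workers enabled requires every object passed to them to be picklable.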

carlos-saldana commented 1 month ago

Hi Christian and thanks for your excellent tutorial and support to the community. You're the best !!! I have a question and I hope you can help me please.

I'm following your tutorial, using a custom dataset generated with labelme (I followed your tutorial Working with LabelMe Bounding Box Annotations in Torchvision). But I have an error when running the Inspect Samples (Inspect training set sample and Inspect validation set sample), I get:

AttributeError: 'Series' object has no attribute 'bboxes'.

I've been investigating, and I think the error starts in the HagridDataset(Dataset) class, specifically in this line: bbox_list = np.array([bbox*(image.size*2) for bbox in annotation.bboxes]).

I checked your json file and saw that the 4 points of the bounding box are inside bboxes, while in my json file the 4 points of the bounding box are in shape -> points, I think that generates the error, but so far I haven't been able to solve it.

I hope you can help me and I'll be attentive, again thank you very much for your excellent tutorials!!

carlos-saldana commented 1 month ago

@cj-mills can u help me please 🙏

cj-mills commented 1 month ago

Hi @carlos-saldana,

The HaGRID dataset I used for this tutorial does not use the LabelMe annotation format.

If you want to adapt this training tutorial to work with a LabelMe dataset like in the Working with LabelMe Bounding Box Annotations in Torchvision tutorial, replace the HagridDataset class (and other code used for interacting with the HaGRID annotation data) with the corresponding code from the Labelme annotation tutorial.

I tried to keep the heading names consistent across the tutorials to make matching up the code for each section easier.
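
To show where the mismatch comes from, here is a minimal standard-library sketch of reading boxes from a LabelMe-style annotation, where each box sits under shapes -> points instead of a bboxes field. The sample data and the labelme_to_xyxy helper are hypothetical and are not the code from the LabelMe tutorial.

```python
# A hand-written sample in LabelMe layout (hypothetical data).
labelme_annotation = {
    "imagePath": "sample.jpg",
    "imageWidth": 640,
    "imageHeight": 480,
    "shapes": [
        # Rectangle corners may be stored in any order.
        {"label": "cup", "shape_type": "rectangle", "points": [[120.0, 80.0], [40.0, 200.0]]},
        {"label": "pen", "shape_type": "rectangle", "points": [[300.0, 310.0], [420.0, 330.0]]},
    ],
}


def labelme_to_xyxy(annotation):
    """Collect [x_min, y_min, x_max, y_max] boxes and labels from rectangle shapes."""
    boxes, labels = [], []
    for shape in annotation["shapes"]:
        if shape.get("shape_type") != "rectangle":
            continue  # skip polygons, points, and other shape types
        xs = [point[0] for point in shape["points"]]
        ys = [point[1] for point in shape["points"]]
        boxes.append([min(xs), min(ys), max(xs), max(ys)])
        labels.append(shape["label"])
    return boxes, labels


boxes, labels = labelme_to_xyxy(labelme_annotation)
```

Taking the minimum and maximum of the coordinates works whether the rectangle is stored as two corners or four, and the resulting absolute xyxy boxes need no rescaling by the image size.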

carlos-saldana commented 1 month ago

Hi @cj-mills, Thank you very much, I will review the code again!!

moelk-atea commented 3 weeks ago

Hi @cj-mills!

I've followed your notebook for custom training of YOLOX models. After training, I load the model and try to generate predictions on a new image, but the model labels all the predictions wrong, and some objects are not detected at all (trees, for example).

Here is how I load and run inference on the finetuned model,

ckpt_fold = 'ckpts/yolox_m/2024-10-25_08-43-44/'
# ckpt_fold = './pretrained_checkpoints/'
model_type = 'yolox_m'
pretrained = True

model = build_model(model_type, 8, pretrained=pretrained, checkpoint_dir=ckpt_fold).to(device=device, dtype=dtype)
model.device = device
model.name = model_type

strides = model.bbox_head.strides
norm_stats = ([0.5]*3, [1.0]*3)

# Convert the normalization stats to tensors
mean_tensor = torch.tensor(norm_stats[0]).view(1, 3, 1, 1)
std_tensor = torch.tensor(norm_stats[1]).view(1, 3, 1, 1)

# Set the model to evaluation mode
model.eval();

# Wrap the model with preprocessing and post-processing steps
wrapped_model = YOLOXInferenceWrapper(model, mean_tensor, std_tensor).to(device=device)

test_img = Image.open('test_images/test_image_2.jpg')
train_sz = 512
bbox_conf_thresh = 0.20
iou_thresh = 0.25

draw_bboxes = partial(draw_bounding_boxes, fill=False, width=2, font_size=25)

class_names=classes_to_keep

# Resize image without cropping to multiple of the max stride
resized_img = resize_img(test_img, target_sz=train_sz, divisor=1)

# Calculate the input dimensions that are multiples of the max stride
input_dims = [dim - dim % max(strides) for dim in resized_img.size]

# Calculate the offsets from the resized image dimensions to the input dimensions
offsets = (np.array(resized_img.size) - input_dims)/2

# Calculate the scale between the source image and the resized image
min_img_scale = min(test_img.size) / min(resized_img.size)

# Crop the resized image to the input dimensions
input_img = resized_img.crop(box=[*offsets, *resized_img.size-offsets])

input_tensor = transforms.Compose([transforms.ToImage(), 
                                   transforms.ToDtype(torch.float32, scale=True)])(input_img)[None].to(device)
wrapped_model.to(device)

with torch.no_grad():
    model_output = wrapped_model(input_tensor).to('cpu')

# Filter the proposals based on the confidence threshold
max_probs = model_output[:, : ,-1]
mask = max_probs > bbox_conf_thresh
proposals = model_output[mask]

# Sort the proposals by probability in descending order
proposals = proposals[proposals[..., -1].argsort(descending=True)]

# Filter bounding box proposals using NMS
proposal_indices = torchvision.ops.nms(
    boxes=torchvision.ops.box_convert(proposals[:, :-2], 'xywh', 'xyxy'), 
    scores=proposals[:, -1], 
    iou_threshold=iou_thresh
)
proposals = proposals[proposal_indices]

# Offset and scale the predicted bounding boxes
pred_bboxes = (proposals[:,:4]+torch.Tensor([*offsets, 0, 0]))*min_img_scale

pred_labels = [class_names[int(idx)] for idx in proposals[:,4]]
pred_probs = proposals[:,5]

annotated_tensor = draw_bboxes(
    image=transforms.PILToTensor()(test_img), 
    boxes=torchvision.ops.box_convert(torch.Tensor(pred_bboxes), 'xywh', 'xyxy'), 
    labels=[f"{label}\n{prob*100:.2f}%" for label, prob in zip(pred_labels, pred_probs)], 
    colors=[int_colors[class_names.index(i)] for i in pred_labels]
)

cj-mills commented 3 weeks ago

Hi @moelk-atea,

The checkpoint_dir parameter for the build_model method is only for indicating where to download the pretrained model weights. It is not for loading finetuned model checkpoints. You can view the source code for the build_model method at the link below:

cjm_yolox_pytorch/model.py - build_model

Check out the following section from a follow-up tutorial to see the steps for initializing a YOLOX model with the saved checkpoint from a finetuning session:

Exporting YOLOX Models from PyTorch to ONNX - Loading the Checkpoint Data
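
The general PyTorch pattern looks roughly like the sketch below: build the architecture without relying on pretrained weights, then load the finetuned weights with load_state_dict. This assumes the training run saved the model's state_dict. make_model is a hypothetical stand-in for a call such as build_model(model_type, num_classes, pretrained=False), and the checkpoint file name is made up, so treat it as an outline, not the exact steps from the linked section.

```python
import tempfile
from pathlib import Path

import torch
from torch import nn


def make_model() -> nn.Module:
    """Hypothetical stand-in for building the YOLOX architecture."""
    return nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))


# Simulate a finetuning session that saved its weights (hypothetical path).
finetuned_model = make_model()
checkpoint_path = Path(tempfile.mkdtemp()) / "finetuned.pth"
torch.save(finetuned_model.state_dict(), checkpoint_path)

# Later: rebuild the same architecture and load the saved weights into it.
restored_model = make_model()
state_dict = torch.load(checkpoint_path, map_location="cpu")
restored_model.load_state_dict(state_dict)
restored_model.eval()

# Confirm every parameter tensor matches the finetuned model.
weights_match = all(
    torch.equal(saved, loaded)
    for saved, loaded in zip(
        finetuned_model.state_dict().values(),
        restored_model.state_dict().values(),
    )
)
```

The number of classes used to build the model must match the finetuned checkpoint, otherwise load_state_dict raises a size mismatch error for the classification layers.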
