ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Pseudo-labeling with Yolov5 (similar to darknet) for Active Learning #404

Closed marvision-ai closed 4 years ago

marvision-ai commented 4 years ago

🚀 Feature

It would be super effective to be able to process a list of images in a folder and save results of detection in Yolo training format for each image as label .txt (in this way you can increase the amount of training data and automate the annotation process)

Motivation + Pitch

I want to be able to do active learning with my models. As new data comes in, the model can annotate the images, add them into the original database and retrain.

glenn-jocher commented 4 years ago

@marvision-ai to label: python detect.py --save-txt
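
For context, --save-txt writes one *.txt per image in standard YOLO format (one class x_center y_center width height line per object, normalized to 0-1), which is exactly what train.py consumes. A minimal pseudo-labeling call might look like the following; the paths, weights, and threshold are placeholders, and where the txt files land varies by version:

```
python detect.py --weights yolov5x.pt --source path/to/unlabeled_images --save-txt --conf-thres 0.5
```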

marvision-ai commented 4 years ago

@glenn-jocher Oh wow. Closing this. Thanks for the information!

glenn-jocher commented 4 years ago

You're welcome! You'll see this:

[screenshot: Screen Shot 2020-07-14 at 12 58 12 PM]

glenn-jocher commented 4 years ago

@marvision-ai this same pseudolabeling is allowing YOLOv5 to top the leaderboard in the Kaggle wheat competition BTW apparently :)

https://www.kaggle.com/nvnnghia/yolov5-pseudo-labeling

marvision-ai commented 4 years ago

@glenn-jocher Is there anything they are doing in those kernels that your repo doesn't already do?

glenn-jocher commented 4 years ago

@marvision-ai I don't know, I haven't been involved at all other than with some licensing issues. The repo does support a few tools out of the box that may be useful in competitions, however, such as built-in model ensembling and test-time augmentation. You can see these in the wiki or the tutorial section of the readme: https://github.com/ultralytics/yolov5/wiki

marvision-ai commented 4 years ago

@glenn-jocher Oh okay, makes sense. Thanks for the information and the fantastic repo!

glenn-jocher commented 4 years ago

@marvision-ai maybe I should add a pseudo-labelling tutorial as well, since it's definitely an interesting item. scale.ai inflated themselves into a billion-dollar valuation on the back of it, so offering it here open-source for everyone to use seems like a great idea. How are you using it?

glenn-jocher commented 4 years ago

If I understand the kaggle wheat approach, the steps were:

  1. Train the best possible model on your train set (i.e. YOLOv5x at a high resolution).
  2. AutoLabel (as I call it) the validation set used for scoring (with TTA enabled I assume).
  3. Fine-tune your trained model on AutoLabeled val set (maybe combined with train set) for 10 epochs.
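
A hedged sketch of those three steps as commands (paths, image sizes, and checkpoint locations are placeholders, and flags may differ slightly by version):

```
# 1. train the strongest model you can on the labeled train set
python train.py --img 1024 --batch 16 --data dataset.yaml --cfg models/yolov5x.yaml --weights ''

# 2. auto-label the val/unlabeled images, with TTA (--augment) at inference
python detect.py --weights best.pt --source val_images/ --save-txt --augment

# 3. fine-tune the trained model on the pseudo-labeled set (optionally merged with train) for ~10 epochs
python train.py --img 1024 --batch 16 --data pseudo.yaml --cfg models/yolov5x.yaml --weights best.pt --epochs 10
```
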
marvision-ai commented 4 years ago

> @marvision-ai maybe I should add a pseudo-labelling tutorial as well, since it's definitely an interesting item. scale.ai inflated themselves into a billion-dollar valuation on the back of it, so offering it here open-source for everyone to use seems like a great idea. How are you using it?

I train large models that reach very high accuracy on a small dataset, then use them to label large datasets that I add to my base dataset (rinse and repeat). I also augment the new samples with a wide array of augmentations before training the next network. Over time this process produces models that generalize far better.
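
A minimal sketch of that loop, with purely illustrative paths, thresholds, and round counts (the manual review step between labeling and retraining is where the real work happens):

```python
import shutil
import subprocess
from pathlib import Path

WEIGHTS = "best.pt"                       # current best checkpoint (placeholder)
UNLABELED = Path("new_images")            # incoming unlabeled images
TRAIN_IMG = Path("dataset/images/train")
TRAIN_LBL = Path("dataset/labels/train")

for round_idx in range(3):                # a few label/merge/retrain rounds
    # 1. pseudo-label the new images at a high confidence threshold
    subprocess.run(["python", "detect.py", "--weights", WEIGHTS,
                    "--source", str(UNLABELED), "--save-txt",
                    "--conf-thres", "0.6"], check=True)
    # 2. (review/correct labels by hand here) then merge into the train set;
    #    where --save-txt writes the txt files varies by version, adjust lbl below
    for img in UNLABELED.glob("*.jpg"):
        shutil.copy(img, TRAIN_IMG / img.name)
        lbl = img.with_suffix(".txt")
        if lbl.exists():
            shutil.copy(lbl, TRAIN_LBL / lbl.name)
    # 3. retrain on the enlarged dataset
    subprocess.run(["python", "train.py", "--data", "dataset.yaml",
                    "--weights", WEIGHTS, "--img", "640"], check=True)
```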

marvision-ai commented 4 years ago

> If I understand the kaggle wheat approach, the steps were:
>
> 1. Train the best possible model on your train set (i.e. YOLOv5x at a high resolution).
> 2. AutoLabel (as I call it) the validation set used for scoring (with TTA enabled I assume).
> 3. Fine-tune your trained model on AutoLabeled val set (maybe combined with train set) for 10 epochs.

  1. To train this best model, are you just using python train.py --img {biggest resolution} --batch {largest batch} --data {dataset.yaml} --cfg ./models/yolov5x.yaml --weights ''? Will this use a YOLOv5x model and train it from scratch?

  2. Yes.

  3. Perhaps you can also add in extra augmented images?

glenn-jocher commented 4 years ago

@marvision-ai yes, the command looks fine. You can always add more images to the dataset. YOLOv5 augments automatically during training, and you can adjust the augmentation hyperparameters in train.py. See train*.jpg in the notebook, which shows an augmented batch.
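
For reference, the augmentation entries in train.py's hyp dict look roughly like this (values here are illustrative; check your copy of train.py for the actual defaults):

```python
hyp = {
    'hsv_h': 0.014,    # HSV hue augmentation (fraction)
    'hsv_s': 0.68,     # HSV saturation augmentation (fraction)
    'hsv_v': 0.36,     # HSV value augmentation (fraction)
    'degrees': 0.0,    # image rotation (+/- deg)
    'translate': 0.0,  # image translation (+/- fraction)
    'scale': 0.5,      # image scale (+/- gain)
    'shear': 0.0,      # image shear (+/- deg)
}
```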

marvision-ai commented 4 years ago

> @marvision-ai maybe I should add a pseudo-labelling tutorial as well, since it's definitely an interesting item. scale.ai inflated themselves into a billion-dollar valuation on the back of it, so offering it here open-source for everyone to use seems like a great idea. How are you using it?

Ultimately, this whole technique for training SOTA networks is a little more advanced, but a tutorial would really help everyone else. I think this is what sets your repo apart from the others and will continue to do so ==> network training + detection + huge utility functions/algorithms AND TUTORIALS

marvision-ai commented 4 years ago

> @marvision-ai yes, the command looks fine. You can always add more images to the dataset. YOLOv5 augments automatically during training, and you can adjust the augmentation hyperparameters in train.py. See train*.jpg in the notebook, which shows an augmented batch.

Quick questions on this:

  1. For instance, 'degrees': 0.0 # image rotation (+/- deg). If I set this to 90.0, does it randomly rotate between 0 and 90?
  2. Does --evolve work with these?
  3. Is there a way I can automate this, where I set up multiple hyperparameter configs --> train a model for each config --> pick the best model based on optimal hyperparameters? Or is this what --evolve ultimately does?

glenn-jocher commented 4 years ago

@marvision-ai I would recommend you just change these and observe the effects directly in train*.jpg.

--evolve applies a genetic evolution algorithm (same one we use for AutoAnchor) to the training hyps. A recent PR broke this functionality however, so for now you need to manually tune.

In any case --evolve ideally would use several hundred trainings to arrive at a minima, so it is not something you take on lightly.
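
Until --evolve is fixed, a crude manual sweep answering question 3 might look like the sketch below; train_and_eval() is a hypothetical stand-in, not a repo function, so wire it to train.py however suits your setup:

```python
import itertools
import random

def train_and_eval(overrides):
    """Hypothetical stand-in: run one training with these hyp overrides, return final mAP@0.5."""
    # hook this up to train.py for real; a dummy score keeps the sketch runnable
    return random.random()

grid = {'degrees': [0.0, 10.0], 'scale': [0.5, 0.9]}  # illustrative values
results = {}
for values in itertools.product(*grid.values()):
    results[values] = train_and_eval(dict(zip(grid.keys(), values)))

best = max(results, key=results.get)
print('best hyps:', dict(zip(grid.keys(), best)), '->', results[best])
```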

glenn-jocher commented 4 years ago

https://github.com/ultralytics/yolov3/issues/392

glenn-jocher commented 4 years ago

@marvision-ai jesus, now that I think about it I probably need to create an --evolve tutorial also...

But my #1 piece of advice here is to not get ahead of yourself. Everyone seems to want to overoptimize and second guess everything before they've even started. Before you do anything at all, you need to simply train normally using all default settings, both from scratch and from pretrained weights, and then (only once you have your baseline results in hand) sit down and consider your next steps.
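
Concretely, the two baselines might look like this (dataset.yaml and the model choice are placeholders):

```
python train.py --data dataset.yaml --cfg models/yolov5s.yaml --weights ''          # from scratch
python train.py --data dataset.yaml --cfg models/yolov5s.yaml --weights yolov5s.pt  # from pretrained
```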

marvision-ai commented 4 years ago

@glenn-jocher absolutely! I agree. I ask these questions to get a better understanding of workflow. I have a large dataset that I'm training on. I will update you on results. I look forward to experimenting with the hyperparameters and active learning!

glenn-jocher commented 4 years ago

@marvision-ai great! When you say active learning, this is what you are calling the iterative process (train, pseudolabel, repeat) you described before? I hadn't heard the term before.

marvision-ai commented 4 years ago

@glenn-jocher correct. It's what I described and also what you described in your bullet points.

When I slowly increase the dataset size with pseudo-labeling and go through the annotations, I get a feel for what my network is learning or struggling with. Then I can introduce more samples into the main dataset that will help it generalize better. This automates the annotation process and usually yields much better results.

I've noticed that more data != more accuracy. Therefore I'm always actively benchmarking how my model learns and whether there is class bias in the works.

You can read a nice summary on this here: https://jacobgil.github.io/deeplearning/activelearning

There are two main approaches that most of the active learning works follow. Sometimes they are a combination of the two.

Uncertainty sampling: Try to find images the model isn't certain about, as a proxy for the model being wrong about the image.
Diversity sampling: Try to find images that represent the diversity existing in the images that weren't annotated yet.

A function that takes an image and returns a ranking score is often called an "acquisition function".

Adding this acquisition function into test.py would be amazing. It would basically let users know where the model is failing and on which images, to then automate the active learning process, as sketched below.
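
As a starting point, here is a minimal uncertainty-sampling sketch; everything in it is an assumption (it reads label files whose lines end with a per-box confidence, e.g. class x y w h conf), not existing repo behavior:

```python
from pathlib import Path

def acquisition_score(confidences):
    """Least-confidence acquisition: high score = model unsure (or silent)."""
    if not confidences:
        return 1.0                 # no detections at all: maximally uncertain
    return 1.0 - min(confidences)  # driven by the shakiest detection

# Rank images so the most uncertain ones get annotated first
scores = {}
for txt in Path("pseudo_labels").glob("*.txt"):
    lines = [l for l in txt.read_text().splitlines() if l.strip()]
    scores[txt.stem] = acquisition_score([float(l.split()[-1]) for l in lines])

most_uncertain = sorted(scores, key=scores.get, reverse=True)[:100]
print(most_uncertain)
```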

Sorry for the wall of text. 😅

glenn-jocher commented 4 years ago

@marvision-ai oh interesting. We already have an image weighting function actually, it weighs images more heavily if they are full of low-mAP objects. It can be used for short term mAP gains during training, but it tends to overtrain faster too unfortunately, resulting in lower final mAP. https://github.com/ultralytics/yolov5/blob/611ec44359432b3a3ac510e3d407dcae1bb1b7af/train.py#L212-L217
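
The idea behind that block boils down to this self-contained toy (all numbers and labels are made up):

```python
import random
import numpy as np

maps = np.array([0.80, 0.30, 0.45])  # per-class mAP from the last eval (made up)
class_w = (1 - maps) ** 2            # low-mAP classes get large weights

# labels[i] = class ids present in image i (made up)
labels = [np.array([0, 0, 1]), np.array([2]), np.array([1, 1, 2]), np.array([0])]
image_w = [class_w[lab].sum() if len(lab) else 0.0 for lab in labels]

n = len(labels)
indices = random.choices(range(n), weights=image_w, k=n)  # hard images sampled more often
print(indices)
```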

You can see mAP per class BTW with python test.py --verbose. Wouldn't you simply want to add images from the lowest mAP classes? i.e. go scrape images of cars and bicycles after seeing the results below?

Namespace(augment=False, batch_size=32, conf_thres=0.001, data='data/coco128.yaml', device='', img_size=640, iou_thres=0.65, merge=False, save_json=False, single_cls=False, task='val', verbose=True, weights='yolov5s.pt')
Using CUDA device0 _CudaDeviceProperties(name='Tesla P100-PCIE-16GB', total_memory=16280MB)

Fusing layers... Model Summary: 140 layers, 7.45958e+06 parameters, 7.45958e+06 gradients
Scanning images: 100% 128/128 [00:00<00:00, 3302.66it/s]
Scanning labels ../coco128/labels/train2017.cache (126 found, 0 missing, 2 empty, 0 duplicate, for 128 images): 100% 128/128 [00:00<00:00, 19108.45it/s]
               Class      Images     Targets           P           R      mAP@.5  mAP@.5:.95: 100% 4/4 [00:04<00:00,  1.03s/it]
                 all         128         929       0.386       0.752       0.697       0.455
              person         128         254       0.404       0.854       0.787       0.502
             bicycle         128           6        0.44       0.833       0.693       0.302
                 car         128          46       0.342       0.452       0.422       0.218
          motorcycle         128           5       0.428         0.8       0.832       0.634
            airplane         128           6       0.669           1       0.972       0.617
...
Speed: 3.9/1.9/5.8 ms inference/NMS/total per 640x640 image at batch-size 32

marvision-ai commented 4 years ago

@glenn-jocher yes, this is halfway there. From the test I know where I'm failing, but it doesn't really tell me the exact instances of why the cars and bikes are getting lower mAP, or how far off the detections are, unless I go through all the images one by one.

I basically do this all in my head as I'm pseudo-labeling/testing images to add or remove from the base dataset.

Therefore the process is as follows:

  1. Calculate mAP for individual classes.
  2. Pseudo-label more data for particular classes that need help. (Understand where it's failing before retraining.)
  3. Retrain and then verify the overall mAP has increased. (Basically like unit testing the network's accuracy, to make sure the introduction of new images didn't reduce accuracy in other classes.)

To summarize: the code is almost there. Perhaps test.py could have optional functionality to save the images that have proven hardest for the network, either based on incorrect detections or on failures to detect vs. ground truth?

I'm just brainstorming here... I just know as a practitioner this would streamline the process immensely, since I do this all manually at the moment and it gets cumbersome when the dataset is thousands of images large.

Again, this is not mandatory but this is something that happens in the industry when it comes to productionizing models as you probably know.

glenn-jocher commented 4 years ago

Sure. You can get metrics per image easily from the existing code. If you debug test.py with coco128, stats going into this operation is a list 128 long: https://github.com/ultralytics/yolov5/blob/611ec44359432b3a3ac510e3d407dcae1bb1b7af/test.py#L167-L175

So to get these metrics per image you would just do something like:

# inside test.py (numpy, torch and ap_per_class are already imported there)
for image in stats:  # stats holds one (correct, conf, pred_cls, target_cls) tuple per image
    si = [np.concatenate([x], 0) for x in image]  # each stat to a numpy array
    if len(si[0]):  # image has predictions
        p, r, ap, f1, ap_class = ap_per_class(*si)
        p, r, ap50, ap = p[:, 0], r[:, 0], ap[:, 0], ap.mean(1)  # [P, R, AP@0.5, AP@0.5:0.95]
        mp, mr, map50, map = p.mean(), r.mean(), ap50.mean(), ap.mean()  # per-image means
        nt = np.bincount(si[3].astype(np.int64), minlength=nc)  # targets per class in this image
    else:
        nt = torch.zeros(1)  # no predictions for this image
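
Building on that, a hedged sketch of the "save the hardest images" idea: pair each val image with its per-image mAP from the loop above, then set the worst ones aside for review (per_image_map and all paths below are hypothetical):

```python
import shutil
from pathlib import Path

# Hypothetical input: (image path, per-image mAP) pairs from the loop above
per_image_map = [("images/val/0001.jpg", 0.12), ("images/val/0002.jpg", 0.88)]
hardest = sorted(per_image_map, key=lambda t: t[1])[:50]  # lowest-mAP images first

out = Path("hardest_images")
out.mkdir(exist_ok=True)
for path, ap in hardest:
    if Path(path).exists():
        shutil.copy(path, out / Path(path).name)  # set aside for manual review
```
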
marvision-ai commented 4 years ago

@glenn-jocher Ah interesting... I may fool around with that then and perhaps try to get it working. If I have the time, I may add in some code to have it save the images that were the hardest or do something with them. When I get that working, I can provide a pull request.

ZeKunZhang1998 commented 4 years ago

> If I understand the kaggle wheat approach, the steps were:
>
> 1. Train the best possible model on your train set (i.e. YOLOv5x at a high resolution).
> 2. AutoLabel (as I call it) the validation set used for scoring (with TTA enabled I assume).
> 3. Fine-tune your trained model on AutoLabeled val set (maybe combined with train set) for 10 epochs.

Hi, will it fine-tune on the AutoLabeled set only, or on AutoLabeled + train set?

glenn-jocher commented 4 years ago

Don't know. You could try it both ways.

ZeKunZhang1998 commented 4 years ago

Thanks. For images without boxes, should there be an empty txt file in training?


glenn-jocher commented 4 years ago

@ZeKunZhang1998 for images with no labels, you do not need to supply a txt file.

ZeKunZhang1998 commented 4 years ago

Oh, so it will still train on those images as it does the ones with boxes, right?


glenn-jocher commented 4 years ago

I don't understand your question.

ZeKunZhang1998 commented 4 years ago

Thanks, I understand now.


karndeepsingh commented 3 years ago

@marvision-ai Hey brother, I want to apply active learning to my dataset. I can see you have already done this using YOLOv5; could you please help me with what to set up and how to use YOLOv5 for active learning?

Thanks