TLDR
Object enumeration of minute, closely packed entities with YOLOv3 and YOLOv4 on the BCCD dataset, using K-means clustering for anchor box calculation. Performance, the data modelling process, and prospects for future work are documented.
BLOOD CELL DETECTION AND ENUMERATION
Abstract :
Blood consists of RBCs, WBCs, platelets, and their respective subtypes. Every time doctors need a report, lab technicians spend hours counting and estimating cells, and in some cases the report takes a day or two to arrive. Recent developments in object detection algorithms have paved the way to automate many computer vision tasks, detection in particular (vehicle detection, for example), which would not have been possible with traditional convolutional networks. The same idea could be applied in the biological disciplines, where automation is scarce and badly needed. We plan to build a novel object detection architecture specifically for identifying and accurately enumerating minute, densely packed entities at the microscopic level.
Approach :
We started with standard object detection algorithms: Faster RCNN, YOLOv3, and SSD. We studied their architectures, implemented them on autonomous driving datasets such as KITTI and BDD100k, and compared their pros and cons: ease of preprocessing, accuracy, speed, and network complexity. Our implementation on these datasets can be found here. We concluded that our use case demands accuracy above all, so we can trade speed for better precision.
In the next step we decided to implement these models on a blood cell dataset, train them from scratch, and compare their performance. Relevant public datasets are scarce: there is only one, BCCD, and it is of substandard quality (low resolution and poor lighting). We treat this challenge as an advantage, since real-world data in this domain typically looks this way in many general hospitals and labs.
Dataset :
BCCD (Blood Cell Counting & Detection)
This is a dataset of blood cell photos, originally open sourced by cosmicad and akshaylambda. There are 360 images and 4888 labels across three classes: WBC (white blood cells), RBC (red blood cells), and Platelets. Annotations are provided in XML format, but our object detection models, the YOLO family in particular, take annotations in text format, so a fair amount of preprocessing was needed to feed the data into these networks.
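The XML-to-text conversion mentioned above can be sketched as follows. This is a minimal sketch, not the exact preprocessing used here: it assumes the XML files follow the Pascal VOC layout (`size`, `object`, `bndbox`) and that the class order matches the `class_id` values in the evaluation logs. The sample annotation is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed class order (RBC = 0, WBC = 1, Platelets = 2).
CLASSES = ["RBC", "WBC", "Platelets"]

def voc_to_yolo(xml_text, classes=CLASSES):
    """Convert one VOC-style annotation into YOLO text lines.

    Each output line is: class_id x_center y_center width height,
    with all four box values normalised by the image size.
    """
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.find("width").text)
    img_h = float(size.find("height").text)

    lines = []
    for obj in root.iter("object"):
        name = obj.find("name").text
        if name not in classes:
            continue  # skip labels outside the three classes
        box = obj.find("bndbox")
        xmin = float(box.find("xmin").text)
        ymin = float(box.find("ymin").text)
        xmax = float(box.find("xmax").text)
        ymax = float(box.find("ymax").text)
        x_c = (xmin + xmax) / 2 / img_w
        y_c = (ymin + ymax) / 2 / img_h
        w = (xmax - xmin) / img_w
        h = (ymax - ymin) / img_h
        lines.append(f"{classes.index(name)} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}")
    return lines

# Hypothetical annotation, for illustration only.
SAMPLE_XML = """<annotation>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object><name>WBC</name>
    <bndbox><xmin>160</xmin><ymin>120</ymin><xmax>480</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>"""

yolo_lines = voc_to_yolo(SAMPLE_XML)
print(yolo_lines)  # ['1 0.500000 0.500000 0.500000 0.500000']
```

One such text file is written per image, with the same base name as the image.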
Dataset Split :
Training Set = 300 images
Testing Set = 60 images
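A 300/60 split like this can be reproduced with a seeded shuffle. The sketch below is only an illustration: the file names and the seed are hypothetical, and the split actually used may have been drawn differently.

```python
import random

def split_dataset(items, n_test, seed=0):
    """Shuffle deterministically and hold out n_test items for testing."""
    items = sorted(items)          # fixed starting order, independent of input order
    random.Random(seed).shuffle(items)
    return items[n_test:], items[:n_test]

# Hypothetical file names standing in for the 360 BCCD images.
image_names = [f"image_{i:03d}.jpg" for i in range(360)]
train_set, test_set = split_dataset(image_names, n_test=60)
print(len(train_set), len(test_set))  # 300 60
```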
We noticed a class imbalance: RBCs far outnumber the other two classes, which makes the task more challenging. We discuss the difficulties faced in a later section.
After a brief discussion weighing the pros and cons of these models, we decided to start with YOLOv3: it is a lighter model, yet on par in accuracy with the heavier two-stage Faster RCNN, and its parameters are easy to change for our experiments.
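The imbalance can be checked by tallying class ids over the YOLO-format label lines. This is a minimal sketch: the label lines are made up for illustration and are not BCCD counts, and the class order is an assumption.

```python
# Assumed class order (RBC = 0, WBC = 1, Platelets = 2).
CLASS_NAMES = ["RBC", "WBC", "Platelets"]

def count_labels(label_lines, class_names=CLASS_NAMES):
    """Count annotations per class from YOLO text lines."""
    counts = {name: 0 for name in class_names}
    for line in label_lines:
        parts = line.split()
        if not parts:
            continue  # ignore blank lines
        counts[class_names[int(parts[0])]] += 1
    return counts

# Made-up label lines, for illustration only.
sample_labels = [
    "0 0.20 0.30 0.10 0.12",
    "0 0.55 0.40 0.11 0.12",
    "0 0.70 0.65 0.10 0.11",
    "1 0.50 0.50 0.30 0.35",
    "2 0.10 0.90 0.04 0.04",
]
label_counts = count_labels(sample_labels)
print(label_counts)  # {'RBC': 3, 'WBC': 1, 'Platelets': 1}
```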
Performance of YOLOv3 on BCCD:
Link to the dataset, trained weights, and model:
https://drive.google.com/drive/folders/1zxXjTPeQWa1ijOdbhyXZZTgXot0jNqtc?usp=sharing
Pre Processing and training details:
With the above-mentioned details, the model was trained on Google Colaboratory with a GPU (Tesla K80) for a total of 6000 steps, which took about 7 hrs.
Intermediate and final weights were saved to check the model performance.
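The TLDR mentions K-means clustering for anchor box calculation. A minimal sketch of that idea follows. It assumes the common formulation in which boxes are clustered on (width, height) using IoU as the similarity measure, with a simple deterministic initialisation. The sample boxes are made up, and this is not necessarily the exact procedure or tool used for the experiments here.

```python
import numpy as np

def wh_iou(boxes, centroids):
    """IoU between (w, h) boxes and (w, h) centroids, all aligned at one corner."""
    inter = (np.minimum(boxes[:, None, 0], centroids[None, :, 0])
             * np.minimum(boxes[:, None, 1], centroids[None, :, 1]))
    area_b = boxes[:, 0] * boxes[:, 1]
    area_c = centroids[:, 0] * centroids[:, 1]
    return inter / (area_b[:, None] + area_c[None, :] - inter)

def kmeans_anchors(boxes, k, iters=100):
    """Cluster box sizes into k anchors, returned sorted by area."""
    boxes = np.asarray(boxes, dtype=float)
    # Deterministic init (an illustrative choice): evenly spaced boxes by area.
    order = np.argsort(boxes[:, 0] * boxes[:, 1])
    init_idx = np.linspace(0, len(boxes) - 1, k).round().astype(int)
    centroids = boxes[order[init_idx]]
    for _ in range(iters):
        # Assign each box to the centroid it overlaps most (highest IoU).
        assign = np.argmax(wh_iou(boxes, centroids), axis=1)
        updated = np.array([
            boxes[assign == j].mean(axis=0) if np.any(assign == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids[np.argsort(centroids[:, 0] * centroids[:, 1])]

# Made-up (width, height) pairs in pixels: small, medium, and large objects.
sample_boxes = [[10, 10], [12, 12], [11, 11],
                [50, 50], [52, 48], [48, 52],
                [100, 120], [110, 115], [105, 125]]
anchors = kmeans_anchors(sample_boxes, k=3)
print(anchors)
```

YOLOv3 and YOLOv4 configurations use nine anchors, so in practice `k=9` would be run over all training boxes after scaling them to the network input size.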
A random test image that the model had not seen before was taken, and the performance of each set of weights was noted. The model quickly learns to classify the components; it sometimes counts a cell twice, but this improves with more steps, as is evident in the model predictions.
The model converges in about 6 hrs, learns to differentiate the three cell types, and detects them with confidence. After every 1000 steps, the saved weights were scored on the test set; the results are as follows:
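Duplicate counts of this kind are what non-maximum suppression (NMS) is meant to remove: when two boxes overlap heavily, only the higher-scoring one is kept. Below is a minimal sketch of greedy NMS for a single class. The boxes, scores, and the 0.45 threshold are illustrative assumptions, not values taken from this experiment.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def nms(detections, iou_thresh=0.45):
    """Greedy NMS over (score, box) pairs of one class."""
    kept = []
    for score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        # Keep a box only if it does not overlap a better box too much.
        if all(iou(box, kept_box) <= iou_thresh for _, kept_box in kept):
            kept.append((score, box))
    return kept

# Two boxes on the same cell plus one separate cell (made-up values).
dets = [
    (0.90, (10, 10, 50, 50)),
    (0.60, (12, 12, 52, 52)),    # near-duplicate of the first box
    (0.80, (100, 100, 140, 140)),
]
kept = nms(dets)
print([score for score, _ in kept])  # [0.9, 0.8]
```

For densely packed cells the threshold is a trade-off: too low and genuinely overlapping cells are merged, too high and duplicates survive.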
class_id = 0, name = RBC, ap = 60.29% (TP = 443, FP = 157)
class_id = 1, name = WBC, ap = 100.00% (TP = 61, FP = 0)
class_id = 2, name = Platelets, ap = 64.46% (TP = 26, FP = 3)
for conf_thresh = 0.25, precision = 0.77, recall = 0.58, F1-score = 0.66
for conf_thresh = 0.25, TP = 530, FP = 160, FN = 378, average IoU = 63.75 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.749172, or 74.92 %
class_id = 0, name = RBC, ap = 87.39% (TP = 720, FP = 313)
class_id = 1, name = WBC, ap = 94.26% (TP = 59, FP = 3)
class_id = 2, name = Platelets, ap = 51.09% (TP = 39, FP = 40)
for conf_thresh = 0.25, precision = 0.70, recall = 0.90, F1-score = 0.79
for conf_thresh = 0.25, TP = 818, FP = 356, FN = 90, average IoU = 57.57 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.775787, or 77.58 %
class_id = 0, name = RBC, ap = 85.95% (TP = 698, FP = 268)
class_id = 1, name = WBC, ap = 69.59% (TP = 47, FP = 11)
class_id = 2, name = Platelets, ap = 88.64% (TP = 49, FP = 10)
for conf_thresh = 0.25, precision = 0.73, recall = 0.87, F1-score = 0.80
for conf_thresh = 0.25, TP = 794, FP = 289, FN = 114, average IoU = 55.15 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.813893, or 81.39 %
class_id = 0, name = RBC, ap = 81.47% (TP = 631, FP = 247)
class_id = 1, name = WBC, ap = 100.00% (TP = 61, FP = 0)
class_id = 2, name = Platelets, ap = 86.37% (TP = 40, FP = 6)
for conf_thresh = 0.25, precision = 0.74, recall = 0.81, F1-score = 0.77
for conf_thresh = 0.25, TP = 732, FP = 253, FN = 176, average IoU = 62.78 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.882776, or 88.28 %
class_id = 0, name = RBC, ap = 88.73% (TP = 727, FP = 288)
class_id = 1, name = WBC, ap = 100.00% (TP = 61, FP = 0)
class_id = 2, name = Platelets, ap = 92.44% (TP = 48, FP = 11)
for conf_thresh = 0.25, precision = 0.74, recall = 0.92, F1-score = 0.79
for conf_thresh = 0.25, TP = 836, FP = 299, FN = 72, average IoU = 62.64 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.890723, or 89.07 %
class_id = 0, name = RBC, ap = 86.42% (TP = 642, FP = 152)
class_id = 1, name = WBC, ap = 100.00% (TP = 61, FP = 0)
class_id = 2, name = Platelets, ap = 92.85% (TP = 47, FP = 4)
for conf_thresh = 0.25, precision = 0.83, recall = 0.83, F1-score = 0.80
for conf_thresh = 0.25, TP = 750, FP = 156, FN = 158, average IoU = 68.24 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.900900, or 90.09 %
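For readers unfamiliar with these logs, the summary figures follow directly from the counts. A short sketch using the first block above (TP = 530, FP = 160, FN = 378, and per-class AP of 60.29, 100.00, and 64.46) reproduces its reported precision, recall, F1-score, and mAP. The function name is ours, not part of the evaluation tool.

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall and F1 from true/false positives and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Counts from the first evaluation block.
precision, recall, f1 = detection_metrics(tp=530, fp=160, fn=378)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
# precision=0.77 recall=0.58 F1=0.66

# mAP is the unweighted mean of the per-class APs (first block, in percent).
per_class_ap = [60.29, 100.00, 64.46]
mean_ap = sum(per_class_ap) / len(per_class_ap)
print(f"mAP@0.50={mean_ap:.2f}%")  # mAP@0.50=74.92%
```

Because mAP weights the three classes equally, a change in the rare Platelets class moves it as much as a change in the far more numerous RBC class.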
Loss Function and its behaviour:
For the first few hundred steps the model did not learn anything and struggled to find even a local minimum. After about 300 steps, however, the loss started to go down monotonically up to 600 epochs: the model started to learn and the loss decreased gradually. As the loss curve shows, a momentum of 0.9 helps keep the loss from fluctuating. It took about 7 hrs for the loss to converge, and training was stopped at that point.
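The role of momentum can be seen in the textbook update rule, sketched below. Each step reuses 90% of the previous step's velocity, so a single noisy gradient changes the direction of travel only a little. The learning rate and gradients here are illustrative, and the training framework may implement the update in a slightly different but equivalent form.

```python
def momentum_step(weight, velocity, grad, lr=0.1, momentum=0.9):
    """One SGD-with-momentum update (textbook form)."""
    velocity = momentum * velocity - lr * grad
    return weight + velocity, velocity

# Two steps with a constant gradient of 2.0, starting from weight 1.0.
w, v = 1.0, 0.0
w, v = momentum_step(w, v, grad=2.0)   # v = -0.2,  w = 0.8
w, v = momentum_step(w, v, grad=2.0)   # v = -0.38, w = 0.42
print(round(w, 2), round(v, 2))  # 0.42 -0.38
```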
Performance on Out of Bag Data:
To check that the model did not overfit to our training data or to training-data-like conditions, we took a blood smear image from a Google search.
The model generalizes well: it misses a few RBCs that overlap with WBCs, but it detects all the WBCs accurately.
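Turning detections like these into a cell count is a matter of tallying class names above a confidence threshold. A minimal sketch follows, assuming the detector output has already been parsed into (class name, confidence) pairs. The 0.25 threshold matches the `conf_thresh` in the evaluation logs, while the detections themselves are made up.

```python
from collections import Counter

def count_cells(detections, conf_thresh=0.25):
    """Tally detections per class, ignoring those below the threshold."""
    return Counter(name for name, conf in detections if conf >= conf_thresh)

# Made-up detections for one image.
sample_detections = [
    ("RBC", 0.91), ("RBC", 0.78), ("RBC", 0.12),   # last one is below threshold
    ("WBC", 0.99),
    ("Platelets", 0.64),
]
cell_counts = count_cells(sample_detections)
print(dict(cell_counts))  # {'RBC': 2, 'WBC': 1, 'Platelets': 1}
```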
Fig : Randomly searched image
Seeing the performance of YOLOv3, which was good but not good enough to replace a lab technician, pushed us to go further. The recent introduction of YOLOv4 gave us another opportunity to see how far a single-stage detector can be pushed.
Performance of YOLOv4 on BCCD
Preprocessing remains the same as for YOLOv3, except for a few configuration file changes, which are as follows:
YOLOv4 is a bulkier model than YOLOv3; as a result, the estimated training time was 28 hrs, which is long for a small dataset like BCCD. The model was therefore trained for a limited time only, since Google Colab has a fixed quota on GPU usage. The results are as follows.
Fig : YOLOv4 on a Blood smear
YOLOv4 performs noticeably better than YOLOv3, even though it was not trained for the full 6000 steps. It was trained for only 2000 steps, yet on the out-of-bag image it predicts almost all the RBCs and all the WBCs with much better confidence than the best YOLOv3 weights. This gives us hope that the model, if trained for longer and on much better datasets, can approach human accuracy.
Fig : YOLOv4 performance on external image
Difficulties faced while training YOLOv4
YOLOv4, being bulkier than its predecessor, takes much longer to train.
The loss is very unstable and fluctuates at every step. Possible reasons are:
The learning rate could be too large.
The network is large and the dataset is small.
The model might be overfitting to each batch, so when it sees a new batch at the next step it fails to generalize and the loss explodes.
Of all the loss curves given below, only one tends to go down gradually (Fig c).
The other two (Fig a, b) perform poorly, as seen from the loss curves and from the performance on out-of-bag images (Fig d, e).
Fig (a), (b), (c) : Loss curves with different initial anchor boxes and learning rates
Fig (d), (e) : Performance on out-of-bag images
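Since a learning rate that is too large is one of the suspected causes of the unstable loss, it helps to see how the rate evolves over training. The sketch below assumes a Darknet-style schedule, a polynomial warm-up ("burn-in") followed by step decays. All parameter values are illustrative and are not the ones used in the runs above.

```python
def learning_rate(step, base_lr=0.001, burn_in=1000, power=4,
                  steps=(4800, 5400), scales=(0.1, 0.1)):
    """Learning rate at a given training step.

    During burn-in the rate ramps up as base_lr * (step / burn_in) ** power;
    afterwards it is multiplied by each scale once its step boundary is reached.
    """
    if step < burn_in:
        return base_lr * (step / burn_in) ** power
    lr = base_lr
    for boundary, scale in zip(steps, scales):
        if step >= boundary:
            lr *= scale
    return lr

for s in (0, 500, 1000, 5000, 5500):
    print(s, f"{learning_rate(s):.2e}")
```

If the loss diverges right after the warm-up ends, lowering `base_lr` or lengthening `burn_in` are the first things to try in such a schedule.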
Future Works :
References:
Many papers were read in the course of this research; their ideas were taken up and some of them were implemented. In future we hope to read and implement more such papers to come up with a more robust architecture for our use case.
7. [Automatic Detection and Quantification of WBCs and RBCs Using Iterative Structured Circle Detection Algorithm](https://www.hindawi.com/journals/cmmm/2014/979302/)