Object detection is widely used in today's AI systems and is a key component of autonomous driving. Current object detection models depend heavily on centralized data for training, which raises data-privacy concerns. This paper addresses the issue with a Federated Learning (FL) approach: the FL architecture preserves data privacy while improving performance by training the model on decentralized data. Object detection models are trained locally at each node on its proprietary dataset, and the resulting weights are securely aggregated at a global server to yield an improved model. We further compare object detection performance under FL against a traditional deep learning approach.
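The server-side aggregation described above can be sketched as a weighted average of client weights (FedAvg-style). This is a minimal illustration, not the paper's exact aggregation code; the `fedavg` helper and the toy two-client setup are hypothetical.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Weighted average of per-layer client weights (FedAvg-style sketch).

    client_weights: one list of numpy arrays (layers) per client.
    client_sizes: number of local training samples per client.
    """
    total = sum(client_sizes)
    aggregated = []
    for layers in zip(*client_weights):
        # weight each client's layer by its share of the total data
        aggregated.append(sum(w * (n / total) for w, n in zip(layers, client_sizes)))
    return aggregated

# toy example: two clients, one "layer" each, equal data sizes
w1 = [np.array([1.0, 3.0])]
w2 = [np.array([3.0, 5.0])]
global_w = fedavg([w1, w2], [100, 100])
print(global_w[0])  # [2. 4.]
```

With unequal client sizes the average shifts toward the larger client, which is why the data size per client matters in the results below.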
The dataset used in this project is the KITTI dataset, a real-world computer vision benchmark designed specifically for autonomous driving.
The dataset consists of 7481 training images and 7518 test images, covering 8 classes: Car, Van, Truck, Pedestrian, Person sitting, Cyclist, Tram, and Misc. Additionally, a few objects are not labelled, likely because they were too far from the scanner; these are classified as DontCare.
To run the YOLO model with Darknet weights, the KITTI labels must be converted to YOLO format. For each image, a text file is generated containing the ground truth in the following format:
`<object-class> <x> <y> <width> <height>`
where x, y, width, and height are expressed relative to the image's width and height.
Each annotation line additionally has the corresponding image path prepended to `<object-class>`.
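A sketch of the conversion for one KITTI label line is shown below. The `kitti_to_yolo` helper and hard-coded image size are hypothetical; the field indices follow the KITTI label layout, where columns 4–7 hold the left, top, right, and bottom pixel coordinates of the 2D bounding box.

```python
# KITTI classes in a fixed order; the YOLO class id is the list index.
CLASSES = ["Car", "Van", "Truck", "Pedestrian", "Person_sitting",
           "Cyclist", "Tram", "Misc"]

def kitti_to_yolo(line, img_w, img_h):
    """Convert one KITTI label line to a YOLO-format string (sketch)."""
    fields = line.split()
    if fields[0] not in CLASSES:          # skip DontCare objects
        return None
    left, top, right, bottom = map(float, fields[4:8])
    x = (left + right) / 2 / img_w        # box centre x, relative to width
    y = (top + bottom) / 2 / img_h        # box centre y, relative to height
    w = (right - left) / img_w
    h = (bottom - top) / img_h
    return f"{CLASSES.index(fields[0])} {x:.6f} {y:.6f} {w:.6f} {h:.6f}"

# example: a Car box (left=100, top=100, right=300, bottom=250)
# in a 1242x375 KITTI image
print(kitti_to_yolo("Car 0.0 0 1.57 100 100 300 250", 1242, 375))
# → 0 0.161031 0.466667 0.161031 0.400000
```

The image path is then prepended to each converted line when the annotation file is written.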
The dataset is split into eight categories (DontCare excluded), ensuring that each image in a category contains at least one object of that class. These categories are then distributed among four participating clients such that each client is missing at least one class. Since the KITTI dataset is imbalanced, with the 'Car' and 'Pedestrian' classes occurring far more often than the others, every client may still end up with 'Car' and 'Pedestrian' samples.
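One simple way to realize such a split is to rotate which client is excluded from each category. This is a hypothetical sketch; the project's actual assignment lives in `dataset/data_split.py`.

```python
CLASSES = ["Car", "Van", "Truck", "Pedestrian", "Person_sitting",
           "Cyclist", "Tram", "Misc"]

def split_categories(num_clients=4):
    """Assign each class-category to all clients but one, rotating the
    excluded client so every client is missing at least one class."""
    assignment = {c: [] for c in range(num_clients)}
    for i, cls in enumerate(CLASSES):
        excluded = i % num_clients            # this client skips the class
        for client in range(num_clients):
            if client != excluded:
                assignment[client].append(cls)
    return assignment

clients = split_categories()
# with 8 classes and 4 clients, each client misses exactly 2 classes
print({c: len(v) for c, v in clients.items()})  # {0: 6, 1: 6, 2: 6, 3: 6}
```

Note that this scheme only distributes class categories; because frequent classes such as 'Car' appear in images of many categories, their samples can still reach every client.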
Mean Average Precision (mAP) and Intersection over Union (IoU) are used as the evaluation metrics in this project. IoU is the ratio of the area of intersection to the area of union between a predicted box and its ground-truth box. mAP scores detections by comparing each detected box against the ground-truth bounding boxes and averaging the resulting per-class precision; the higher the score, the more accurate the model's detections.
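The IoU ratio can be computed directly from box corner coordinates; the `iou` helper below is an illustrative sketch, assuming boxes given as `(x1, y1, x2, y2)`.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2) corners."""
    # corners of the intersection rectangle
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # clamp to zero when the boxes do not overlap
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.142857
```

A detection typically counts as a true positive only when its IoU with a ground-truth box exceeds a threshold (commonly 0.5), which is what feeds into the mAP computation.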
Per-client federated training results:

| Client | Data Size | Epochs | Comm. Rounds | mAP |
|---|---|---|---|---|
| C1 | 1615 | 5 | 15 | 68.5% |
| C2 | 1725 | 5 | 1 | 66.9% |
| C3 | 1662 | 5 | 15 | 64.7% |
| C4 | 1632 | 5 | 15 | 64.4% |
Comparison of traditional deep learning and federated learning:

| Training Type | Data Size | Epochs | Total Loss | mAP |
|---|---|---|---|---|
| Deep Learning | 6481 | 5 | 21.43 | 44.5% |
| Federated Learning (3 rounds) | 6481 | 5 | 26.26 | 46.1% |
| Deep Learning | 6481 | 10 | 16.30 | 65.2% |
| Federated Learning (3 rounds) | 6481 | 10 | 8.5 | 63.0% |
```sh
# install dependencies and download pretrained YOLOv3 weights
pip install -r requirements.txt
wget -P model_data https://pjreddie.com/media/files/yolov3.weights

# perform data split
python dataset/data_split.py

# start server
python server_rec_weights.py

# start client
python client_send_weights.py

# perform detection
python detection_custom.py
```
## References