"MODIPHY" is now available freely on IEEE Xplore, follow IEEE paper link for free early access.
We are thrilled to announce that our paper, MODIPHY, will be presented at the upcoming 2024 IEEE International Conference on Image Processing (ICIP 2024) in Abu Dhabi, UAE!
We are beyond excited to share that our paper has been recognized as one of the top 5% of accepted papers at ICIP 2024!
Stay tuned for more updates!
We developed "YOLO Phantom" for detection in low-light conditions and occluded scenarios within resource-constrained IoT applications. We proposed the novel "Phantom Convolution," which enables YOLO Phantom to achieve accuracy comparable to YOLOv8n with a 43% reduction in parameters and model size, and a 19% reduction in GFLOPs. By employing transfer learning on our multimodal dataset, the model demonstrates enhanced vision capabilities in adverse conditions. Our Raspberry Pi IoT platform, equipped with noIR cameras and integrated with AWS IoT Core and SNS, shows a substantial 17% and 14% boost in frames per second for thermal and RGB data detection, respectively, compared to the baseline YOLOv8n model.
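As a quick sanity check of the efficiency numbers above, the snippet below is a minimal sketch that prints the Ultralytics model summaries; the filename `yolo_phantom.pt` is a placeholder for the actual YOLO Phantom weights in this repository.

```python
# Minimal sketch: compare parameter count and GFLOPs of YOLOv8n vs. YOLO Phantom.
# "yolo_phantom.pt" is a placeholder; point it at the actual YOLO Phantom weights.
from ultralytics import YOLO

baseline = YOLO("yolov8n.pt")      # stock YOLOv8n weights (auto-downloaded by Ultralytics)
phantom = YOLO("yolo_phantom.pt")  # YOLO Phantom weights

baseline.info()  # prints layers, parameters, and GFLOPs for YOLOv8n
phantom.info()   # should show roughly 43% fewer parameters and 19% fewer GFLOPs
```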
To learn more about MODIPHY, please refer to the preprint available on arXiv.
Please refer to yolo_phantom for the implementation.
Download the multimodal dataset
1. Install the Ultralytics library in a Conda or virtual environment.
2. Once in the environment, verify the installation location of the Ultralytics library using `pip list`. The installation path should end in `/site-packages`, and there should be an `ultralytics` folder inside it.
3. Clone my repository to your computer.
4. Navigate to the `yolo_phantom` folder and copy the `cfg` and `nn` folders.
5. Paste the copied `cfg` and `nn` folders into the `ultralytics` folder mentioned in step 2.
6. Verify that the YOLO Phantom model and weights are functioning correctly by using the `yolophantom_testing` file.
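As a rough illustration of step 6, here is a minimal sketch of loading and testing the model with the Ultralytics API; the weight and image filenames are placeholders, so use the actual paths from the `yolophantom_testing` file.

```python
# Minimal sketch of step 6: load YOLO Phantom weights and run a test inference.
# The filenames below are placeholders; use the actual files from the repository.
from ultralytics import YOLO

model = YOLO("yolo_phantom.pt")    # load the YOLO Phantom weights
results = model("test_image.jpg")  # run inference on a sample image
results[0].show()                  # display detections to confirm the model works
```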
I utilized the latest version of YOLO, v8, developed by Ultralytics, to construct an object detection demonstration, using the Ultralytics library for detection and tracking tasks from a live camera feed.
To keep the implementation quick and simple, I used my OnePlus 6 Android phone as the live camera device. A free, basic application called "Camo Studio," installed on both the phone and my M1 MacBook Air, allows the phone's camera to function as an external webcam for the laptop. The primary objective of this project was to simulate a surveillance-camera scenario for detecting both vehicles and pedestrians.
While experimenting with various pre-trained YOLOv8 models, I discovered that the YOLOv8n variant was efficient enough to detect and track fast-moving traffic even in low-light and nocturnal conditions.
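For reference, a minimal sketch of this live-feed setup is shown below; it assumes Camo Studio exposes the phone camera as webcam index 0, which may differ on other machines.

```python
# Minimal sketch: live detection and tracking from a webcam feed with YOLOv8n.
# source=0 assumes Camo Studio exposes the phone camera as the default webcam.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                   # lightweight nano variant used in this demo
model.track(source=0, show=True, conf=0.25)  # stream frames, track objects, display live
```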
For reference, here are the relevant resources I employed for Ultralytics and Camo:
https://github.com/ultralytics/ultralytics
This time I used a CanaKit-based Raspberry Pi 4 Model B (CanaKit Extreme, 128 GB, 8 GB RAM, Bullseye OS) and a USB camera for object detection. A few of the interesting features tested were:
A few of the learnings are:
I recently used my Raspberry Pi to develop a low-light detection and notification system using AWS. Thanks to my newly trained multimodal YOLO variants (trained on RGB and thermal image fusion data), I was able to achieve decent detection capability even in poor lighting conditions, regardless of the time of day. Here are some of the key learnings from this exercise:
If you are wondering why the images here look a bit underwhelming, remember that the camera was behind double-glazed glass and a net, and the photo of the live detection was then taken from the monitor. For reference, an out-of-sample RGB detection example from the fusion-data-trained multimodal YOLOv8x model on my Mac is also shown.
For the implementation, please refer to: https://github.com/shubha07m/On-device-computer-vision-experiments-with-IoT/tree/main/from_pi
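As a rough outline (not the repository's exact code), the sketch below shows how a detection event on the Pi could trigger an SNS notification via `boto3`; the topic ARN, region, weight file, and frame path are all placeholder assumptions.

```python
# Rough sketch of a detect-and-notify step on the Raspberry Pi (placeholders throughout).
import boto3
from ultralytics import YOLO

sns = boto3.client("sns", region_name="us-east-1")  # assumes AWS credentials are configured
model = YOLO("yolo_phantom.pt")                     # multimodal (RGB + thermal fusion) weights

results = model("night_frame.jpg")                  # run detection on a captured frame
if len(results[0].boxes) > 0:                       # at least one object detected
    sns.publish(
        TopicArn="arn:aws:sns:us-east-1:123456789012:detection-alerts",  # placeholder ARN
        Message="Low-light detection triggered on Raspberry Pi",
    )
```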
If you like our work, please consider citing it as follows:
@INPROCEEDINGS{10648081,
author={Mukherjee, Shubhabrata and Beard, Cory and Li, Zhu},
booktitle={2024 IEEE International Conference on Image Processing (ICIP)},
title={MODIPHY: Multimodal Obscured Detection for IoT using PHantom Convolution-Enabled Faster YOLO},
year={2024},
volume={},
number={},
pages={2613-2619},
keywords={YOLO;Accuracy;Convolution;Computational modeling;Transfer learning;Phantoms;Real-time systems;Low light object detection;Multimodal fusion;IoT;YOLO;Phantom Convolution},
doi={10.1109/ICIP51287.2024.10648081}}
## Previous work ##
In my previous research, I developed UNIMODAL for search and disaster recovery using infrared images and tested it with older YOLO versions (v3, v4, and v7). My paper on the same can be accessed here: https://ieeexplore.ieee.org/document/10025436
This paper can be cited as below:
@INPROCEEDINGS{10025436,
author={Mukherjee, Shubhabrata and Coudert, Oliver and Beard, Cory},
booktitle={2022 IEEE International Symposium on Technologies for Homeland Security (HST)},
title={UNIMODAL: UAV-Aided Infrared Imaging Based Object Detection and Localization for Search and Disaster Recovery},
year={2022},
volume={},
number={},
pages={1-6},
keywords={Location awareness;Training;5G mobile communication;Transfer learning;Object detection;Infrared imaging;US Department of Homeland Security;UAV aided disaster recovery;YOLO based infrared object detection;YOLOV7-official;Autonomous Vehicular Network operations},
doi={10.1109/HST56032.2022.10025436}}
## Acknowledgement ##
The YOLO Phantom code base is built with [Ultralytics](https://github.com/ultralytics/ultralytics), and many of the base modules of YOLOv8 were used.
Thanks for the great implementations!