This repo features our initial work using OpenCV, TensorFlow, and PyTorch to train three convolutional neural networks (CNNs) for human activity recognition. These files include modifications of the base Conv2D code provided by the Introduction to Video Classification & Human Activity Recognition tutorial, modifications to a pretrained S3D model, and a Conv3D model built from scratch by @nehabaddam.
1. Create a virtual environment
It is recommended to create a new virtual environment for this project. Use venv
or conda
to create a virtual environment and install the dependencies. If using a Mac, jump to this section.
2. Install dependencies from requirements.txt
(optional)
pip install -r requirements.txt
The tensorflow-metal
plugin will enable the GPU on Macs fitted with Apple silicon or AMD GPUs, which radically improves model training time. More info is available here.
1. Deactivate current venv / conda environments
# venv
deactivate
# conda
conda deactivate
2. Create new venv
# Recommended to use python <= 3.12
python3.12 -m venv ~/venv-metal
# or
python3 -m venv ~/venv-metal
3. Activate venv-metal
source ~/venv-metal/bin/activate
4. Install tensorflow-metal
python -m pip install -U pip
python -m pip install tensorflow-metal
5. Install TensorFlow and OpenCV
python -m pip install tensorflow
pip3 install opencv-python
Videos used for this project are not included here for storage and PII (personally identifiable information) reasons, so these files will need to be added manually. If you are a contributor to this project, contact @jamescoledesign for access to the dataset. This format should work for any video dataset.
After you download the dataset, create a folder named downloads
at the root of your local clone of this repo and place the train
and test
folders within the downloads
folder like the example below. Be sure to replace feature
with the label you intend to use for that category of videos (e.g., Sleeping).
root
└───downloads
│ └───test
│ │ └─feature
│ │ │ video1.mp4
│ │ │ video2.mp4
│ │ │ ...
│ │
│ └───train
│ │ └─feature
│ │ │ video3.mp4
│ │ │ video4.mp4
│ │ │ ...
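Once the folders are in place, the class labels can be derived directly from the subfolder names. The sketch below is an illustrative helper (not a function from the training scripts) that indexes the videos under one split directory:

```python
from pathlib import Path

VIDEO_EXTS = {".mp4", ".avi", ".mov"}

def index_videos(split_dir):
    """Map each label (subfolder name) to a sorted list of its video paths."""
    index = {}
    for label_dir in sorted(Path(split_dir).iterdir()):
        if label_dir.is_dir():
            index[label_dir.name] = sorted(
                p for p in label_dir.iterdir() if p.suffix.lower() in VIDEO_EXTS
            )
    return index
```

With the structure above, `index_videos("downloads/train")` would return a dict keyed by each label folder (e.g., `Sleeping`), which is convenient for building the label list fed to the models.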
Ensure the videos are organized in the format described in the File structure section above.
python conv2d_train.py
python conv3d_train.py
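Both training scripts read clips frame by frame with OpenCV, and a common approach for fixed-length CNN input (used by the tutorial this repo adapts) is to sample evenly spaced frames from each clip. A simplified sketch of the index math; `sample_frame_indices` is illustrative, not a function from these scripts:

```python
def sample_frame_indices(total_frames, num_samples=20):
    """Return indices of `num_samples` evenly spaced frames in a clip."""
    step = max(total_frames // num_samples, 1)  # skip window between samples
    # Clamp to the last valid frame so short clips do not index out of range
    return [min(i * step, total_frames - 1) for i in range(num_samples)]
```

For a 100-frame clip and 20 samples this picks every 5th frame; clips shorter than `num_samples` repeat the final frame.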
Navigate to /notebooks/pre_trained.ipynb
and run the code in the Jupyter Notebook.
Ensure the videos are organized in the format described in the File structure section, then run the command below and follow the prompts.
python conv2d_test.py
This model is too large to store on GitHub, but you can download the model here and place it in ./conv3D/2024-09-22-13-18-18-conv3d-model.keras.
Next, ensure the videos are organized in the format described in the File structure section, run the command below, and follow the prompts.
python conv3d_test.py
Ensure the videos are organized in the format described in the File structure section. Navigate to /notebooks/s3d_v1.ipynb
and run the code in the Jupyter Notebook.
The demo can be run from the root directory using python demo.py
and following the prompts. Predictions can be made using either prerecorded video or live webcam video.
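Single-frame predictions on live video tend to flicker between classes, so one common way to stabilize the displayed label is to average class probabilities over a rolling window of recent frames. The sketch below illustrates that idea only; the demo's actual smoothing, if any, may differ:

```python
from collections import deque

class RollingPrediction:
    """Average per-frame class probabilities over the last `window` frames."""

    def __init__(self, num_classes, window=25):
        self.num_classes = num_classes
        self.history = deque(maxlen=window)  # old frames drop off automatically

    def update(self, probs):
        """Add one frame's probability vector; return the windowed average."""
        self.history.append(probs)
        n = len(self.history)
        return [sum(frame[i] for frame in self.history) / n
                for i in range(self.num_classes)]
```

Calling `update()` once per captured frame and displaying the argmax of the averaged vector gives a much steadier on-screen label than raw per-frame argmax.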
Predictions for each class:
+------------------------+---------------+
| Prediction | Probability |
+========================+===============+
| Sitting In Wheelchair | 0.81 |
| Eating | 0.19 |
| Watching TV | 0 |
| Asleep Trying to sleep | 0 |
| Lying In Bed | 0 |
| Therapy | 0 |
| Transfer To Bed | 0 |
| Family | 0 |
| Nurse Visit | 0 |
| Talking on the Phone | 0 |
| EVS Visit | 0 |
| Doctor Visit | 0 |
+------------------------+---------------+
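A table like the one above can be produced from a model's probability vector with a small formatting helper. The function below is hypothetical (the repo may use a library such as tabulate instead) and simply sorts classes by probability before rendering:

```python
def format_predictions(probs, width=22):
    """Render a {class_name: probability} dict as a sorted text table."""
    rows = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    lines = [f"| {'Prediction':<{width}} | {'Probability':>11} |"]
    for name, p in rows:
        lines.append(f"| {name:<{width}} | {p:>11.2f} |")
    return "\n".join(lines)
```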
Predicting live video from a webcam attached to a Raspberry Pi 5.
Important Note: The prediction in the image above is incorrect, and these models need to be refined further before they are used in this way. This image merely demonstrates how predictions can be visualized in real time.
If TensorFlow does not recognize your GPU, try uninstalling existing versions of TensorFlow and reinstalling it with the command below:
python -m pip install "tensorflow==2.10" --user
Then check whether the GPU is recognized:
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
# Should see something like the message below:
# [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
In a Jupyter Notebook or Python file, you can simply run the following command to see if the GPU is enabled:
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
# Num GPUs Available: 1
The file s3d_v1.ipynb
uses OpenCV features with dependencies that may not be installed automatically (e.g., GStreamer). To resolve these errors, OpenCV may need to be built from source. A few options are below.
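Before rebuilding, it is worth checking whether your installed OpenCV wheel already includes GStreamer support. `cv2.getBuildInformation()` returns a plain-text dump of build flags that can be scanned for the GStreamer entry; `has_gstreamer` below is an illustrative helper, not part of this repo:

```python
def has_gstreamer(build_info):
    """Return True if an OpenCV build-information dump lists GStreamer as enabled."""
    for line in build_info.splitlines():
        if "GStreamer" in line:
            return "YES" in line
    return False

# Usage (requires OpenCV installed):
#   import cv2
#   print(has_gstreamer(cv2.getBuildInformation()))
```

If this prints True, the build-from-source steps below can be skipped.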
# Install minimal prerequisites (Ubuntu 18.04 as reference)
sudo apt update && sudo apt install -y cmake g++ wget unzip
# Download and unpack sources
wget -O opencv.zip https://github.com/opencv/opencv/archive/4.x.zip
wget -O opencv_contrib.zip https://github.com/opencv/opencv_contrib/archive/4.x.zip
unzip opencv.zip
unzip opencv_contrib.zip
# Create build directory and switch into it
mkdir -p build && cd build
# Configure
cmake -DOPENCV_EXTRA_MODULES_PATH=../opencv_contrib-4.x/modules ../opencv-4.x
# Build
cmake --build .
git clone --recursive https://github.com/skvark/opencv-python.git
cd opencv-python
export CMAKE_ARGS="-DWITH_GSTREAMER=ON"
pip install --upgrade pip wheel
# this is the build step - the repo estimates it can take from 5
# mins to > 2 hrs depending on your computer hardware
pip wheel . --verbose
pip install opencv_python*.whl
@jamescoledesign, @nehabaddam, @Phuong4587, @EmillyH555, @LeifMessinger, @Sreeyuktha-1234