This repository contains the work and information related to the Thermal Imaging project requested by Ed Wellman and Brad Ross from the University of Arizona. The goal of my involvement in the project was to write code to perform rudimentary computer vision on the large collection of videos taken at different open mines in North America. The code is intended to run in HPC environments, and this document walks the reader through the process of initiating a processing session. It also serves as a navigation aid for finding previous research done to support the computer vision aspects of this project.
The steps for transferring the videos are covered in the latest Zoom recording. Sign into https://www.globus.org/ and use the file manager at https://app.globus.org/file-manager to initiate a transfer between the machine that has the external hard drive plugged in and the UA HPC filesystems. Ensure that Globus is running on the desktop machine before using the file browser.
All the code necessary to process the videos is kept in version control on GitHub. The repo that hosts this readme also comes with scripts for starting supercomputer jobs and Python scripts that perform computer vision tasks on mine videos. To get the code, clone the repository in an interactive session on the HPC.
Navigate to ood.hpc.arizona.edu and fill in the web auth details. You will then see the Open OnDemand dashboard; click on the Clusters tab at the top and select Shell. This brings you to a login node on the HPC, using the browser for your shell session. Then we will submit the lines of code shown in the following image:
Note there will be several differences in what you type. You will not be using /xdisk/chrisreidy/baylyd; instead, use something like /xdisk/bjr/. Second, your command would be interactive -a bjr, not interactive -a visteam, because bjr is the allocation for this project on the HPC. Also, you won't encounter the error shown here because there won't be an existing directory called thermal_imaging.
Here are the individual commands to follow, in order:
elgato
interactive -a bjr
cd /xdisk/bjr
git clone https://github.com/DevinBayly/thermal_imaging.git
You should now have a folder called thermal_imaging at the absolute path /xdisk/bjr/thermal_imaging. This is also where you will complete the next step.
The Singularity container is currently hosted on GitHub at https://github.com/DevinBayly?tab=packages, but you must use Singularity to pull it down to the HPC. This step only needs to be done once to get a copy of the container on the HPC. First ensure you are in /xdisk/bjr/thermal_imaging/, then run these commands:
singularity pull oras://ghcr.io/devinbayly/thermal.sif:numpy
mv thermal.sif_latest.sif thermal_imaging.sif
This will make sure the Singularity container is in the correct folder for the batch processing to come.
There are only a couple more steps at this point. Using either the OOD cluster shell access or a terminal signed in via ssh, we will now run the actual video processing. We also want to make sure we are submitting to the El Gato supercomputer, so type elgato when you get into the shell. You will know this is correct when the command prompt shows (elgato) at the far left of a new line. Then ensure you are in the correct directory: the one containing the bash scripts with .sh extensions, the two Python scripts classes.py and process_video_folder.py, and the thermal_imaging.sif file. It should look like this
Then you can run these commands in order. Recall that after sbatch_array.sh comes the path to the uploaded videos folder, then the path to the output log folder; these need to be updated for your own processing paths. $(pwd) just expands to the absolute path of the present working directory, so if, like me, you keep your log folder in the same place as the code and the Singularity container, you can save some typing by putting $(pwd) in front of the folder name.
sbatch sbatch_array.sh /xdisk/chrisreidy/baylyd/thermal_imaging/Mine-4/ $(pwd)/logfiles_folder/
Once you've done that, you can double-check that your job was submitted correctly with the following command (put in your NetID where it says <netid>):
squeue -u <netid>
Or you can sign in at https://portal.hpc.arizona.edu/ to view the status of your submitted jobs.
The resulting log follows this structure, where data contains the actual detection data and background_images holds base64-encoded images used for contextualizing the data against the background of the mine video they were taken from.
A data entry comes from a single frame of processing and records the number of detections in that frame along with each detection's location in x,y. These coordinates are in image space, so x is within [0, width] and y is within [0, height], but 0 is at the top of the image. This is a standard graphics convention dating back to early monitors, whose beams traced from top left to bottom right.
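As a minimal sketch of how these coordinates could be consumed, the snippet below reads a single log with Python's json module and flips y for plotting tools that expect a bottom-left origin. The data key comes from the description above, but the per-detection field names (detections, x, y), the example file name, and the frame height are assumptions; check a real log for the exact schema.

```python
import json

# Sketch only: the "data" key is described above, but the per-detection
# field names and the frame height below are assumptions.
with open("logfiles_folder/example_video.json") as f:  # hypothetical log file
    log = json.load(f)

frame_height = 480  # assumed; use the real video height

for frame_entry in log["data"]:
    for det in frame_entry.get("detections", []):
        x, y = det["x"], det["y"]        # y is measured down from the top edge
        y_flipped = frame_height - y     # flip for a bottom-left origin plot
        print(x, y, y_flipped)
```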
This is the file that controls how many array jobs are created. Right now it launches six array tasks at a time (indices 0 through 5) because it contains the line #SBATCH --array 0-5. If you open the file and raise that upper bound, more array jobs will run. This is the main modification you would make in the future, as all the other settings were chosen for specific reasons to suit this processing pipeline.
If you start running on other supercomputers, you will be able to use more than 16 CPUs at a time. You would type puma or ocelote instead when you first log into the HPC shell. Then you would need to change the line #SBATCH --ntasks=16 in sbatch_array.sh to something higher. You would also need to update that number in the line num_cpus = 15 within the process_parallel_video function in the classes.py file.
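For orientation, here is a hedged sketch of the usual pattern such a setting follows: num_cpus sizes a multiprocessing pool that fans frame ranges out to worker processes. The real process_parallel_video in classes.py may be organized differently, and the helper below is purely hypothetical.

```python
from multiprocessing import Pool

# Illustrative sketch, not the repo's actual implementation (see classes.py).
# num_cpus should track the --ntasks value requested in sbatch_array.sh
# (the repo pairs num_cpus = 15 with --ntasks=16).
num_cpus = 15

def process_chunk(frame_range):
    """Hypothetical helper: run detection over one contiguous range of frames."""
    start, end = frame_range
    return [{"frame": i, "num_detections": 0} for i in range(start, end)]

def process_parallel_video(frame_ranges):
    """Fan the frame ranges out across num_cpus worker processes."""
    with Pool(num_cpus) as pool:
        return pool.map(process_chunk, frame_ranges)

if __name__ == "__main__":
    # e.g. split a 3000-frame video into 15 chunks of 200 frames each
    results = process_parallel_video([(i * 200, (i + 1) * 200) for i in range(15)])
```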
When a video is fully processed, a .json file will be produced in the output log folder. Each log has the same base name as the video it was created from. Simply download a log file from the OOD xdisk file browser and then navigate to the logger visualization tool at https://devinbayly.github.io/thermal_imaging/.
Please refer to this video for information about how to interact with the log plotter.
This repo has code meant to help transform the research data from a computer vision task into a data science task. Many things can be done with the log files to help speed up the process of distinguishing true rockfall events from video artifact detections, vehicle motion, or other false signals. The Log Plotter is just one example of a way to visualize the results from the log files.
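As one hedged example of treating the logs as a data science input, the sketch below flattens every log in a logfiles_folder/ directory into a pandas table of per-frame detection counts so that busy frames can be reviewed first. The num_detections field name is an assumption; adjust it to whatever the real logs use.

```python
import json
from pathlib import Path

import pandas as pd

# Hypothetical exploration script: collect per-frame detection counts from
# every log file. Field names are assumptions; match them to the real schema.
rows = []
for log_path in Path("logfiles_folder").glob("*.json"):
    log = json.loads(log_path.read_text())
    for frame_index, entry in enumerate(log["data"]):
        rows.append({
            "video": log_path.stem,
            "frame": frame_index,
            "num_detections": entry.get("num_detections", 0),
        })

df = pd.DataFrame(rows)
# Frames with unusually many detections are candidates for manual review as
# either true rockfall events or false signals such as vehicle motion.
print(df.sort_values("num_detections", ascending=False).head())
```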