Each directory is named with the datetime stamp of its collection, in the format "%Y%m%d%H%M%S%f".
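As a minimal sketch, a directory name in this format can be converted to and from a Python datetime with the standard library. The helper names and the example timestamp below are hypothetical and are not taken from the entries in this PR.

```python
from datetime import datetime

# Format string quoted in the description above.
STAMP_FORMAT = "%Y%m%d%H%M%S%f"


def parse_entry_name(name: str) -> datetime:
    """Convert a sample directory name into a datetime (raises ValueError if malformed)."""
    return datetime.strptime(name, STAMP_FORMAT)


def make_entry_name(moment: datetime) -> str:
    """Build a directory name from a datetime; %f emits six microsecond digits."""
    return moment.strftime(STAMP_FORMAT)


# Hypothetical directory name, for illustration only.
example = "20210315142530123456"
parsed = parse_entry_name(example)
assert make_entry_name(parsed) == example
```

Because the fields run from most to least significant, sorting the directory names alphabetically also sorts the entries chronologically.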
Each directory contains:
- A ~1 minute long video file of the subject's finger with the flash on
  - mp4 format
  - 30 frames per second
  - No audio
- device.json, which contains information about the pulse oximeter device used to measure the ground truth values for SpO2 and heart rate (HR).
- gt.json, which contains the ground truth values for SpO2 and HR.
- phone.json, which contains the make and model of the phone used to capture the video.
- user.json, which contains the subject identifier (and is likely to also hold the subject's age, sex, weight, etc. in future).
Additionally, the sample_data/ directory contains:
- all_label_data.csv, which contains all the information from all the JSON files in a single CSV file.
- This README.
The associated README provides examples of each of the 4 JSON files, as well as the all_label_data CSV file.
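For illustration only, the layout described above could be read back into flat rows along these lines. The function names and the "file stem plus key" column naming are assumptions, and the result may not match the actual columns of all_label_data.csv. The sketch also assumes each JSON file holds a top-level object.

```python
import json
from pathlib import Path

# The four JSON files listed in the description above.
JSON_FILES = ("device.json", "gt.json", "phone.json", "user.json")


def load_entry(entry_dir: Path) -> dict:
    """Flatten the JSON files of one entry directory into a single row.

    Keys are prefixed with the file stem (e.g. "gt.") to avoid collisions.
    This naming is an assumption, not the schema used by the PR.
    """
    row = {"entry": entry_dir.name}
    for filename in JSON_FILES:
        path = entry_dir / filename
        with path.open(encoding="utf-8") as handle:
            contents = json.load(handle)
        for key, value in contents.items():
            row[f"{path.stem}.{key}"] = value
    return row


def load_all(sample_root: Path) -> list:
    """Load every entry directory under sample_root, ordered by directory name."""
    return [load_entry(d) for d in sorted(sample_root.iterdir()) if d.is_dir()]
```

Calling load_all(Path("sample_data")) from the repository root would then return one dictionary per entry, skipping non-directory files such as the CSV and the README.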
Large file storage
Each mp4 video file is approximately 1.3 MB, which could bloat the repository. Because the contents of these files are almost certainly never going to change, we can use Git Large File Storage (LFS) to track the mp4 files in the sample_data/ subdirectories. Full details can be found in the README pushed with this PR.
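One practical consequence of LFS tracking is that a clone made without Git LFS installed contains small text pointer files in place of the videos. The sketch below is not part of this PR and its helper names are hypothetical. It detects such pointers by checking for the version line that begins every Git LFS pointer file.

```python
from pathlib import Path

# First line of every Git LFS pointer file, per the Git LFS pointer specification.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """Return True if path holds an LFS pointer rather than the real file contents."""
    with path.open("rb") as handle:
        return handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def unfetched_videos(sample_root: Path) -> list:
    """List mp4 files under sample_root that are still LFS pointers."""
    return sorted(p for p in sample_root.rglob("*.mp4") if is_lfs_pointer(p))
```

If unfetched_videos(Path("sample_data")) returns any paths, running git lfs pull in the repository should replace the pointers with the real videos.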
This PR adds 11 sample data entries that I collected from 3 subjects using a Beurer PO 30 pulse oximeter and an iPhone 8.
The README supplied in this PR describes the dataset and implementation in detail.
Format and structure
The data format and structure are an amalgamation of those found in this outline and the methodology of Nemcova et al.