This repository details the code required to replicate the results in the following three papers:
note: In order to replicate our findings you must first download the UK COVID-19 Vocal Audio Dataset; please see below for details.
note: the code for the SSAST experiments and for openSMILE feature extraction lives in git submodules, so if you intend to run those analyses make sure to add the recursive submodule flag to the clone command:
git clone --recurse-submodules <repo.git>
Please note
This project is now concluded and this is the final repository.
You can ask a question in the issues but it may take a while for someone to answer you.
You can also email The Alan Turing Institute Health Team using this email address (healthprogramme@turing.ac.uk) and they will try to connect you to a researcher from this project.
All details of the final outputs from this project can be found below.
Data Paper --> notebook to produce the summary statistics and plotly figures in the UK COVID-19 Vocal Audio Dataset data descriptor.
SVM Baseline --> code used to generate the openSMILE-SVM baseline results along with weak-robust and nearest neighbour mapping ablation studies.
BNN Baseline --> code used to generate the ResNet-50 BNN baseline results and uncertainty metrics.
Code for plotting --> code used to generate the plots for the three papers.
Utilities --> helper functions + main dataset class for machine learning training.
Unit Tests --> unit tests for checking validity of train/val/test splits and other functionality.
Self-Supervised Audio Spectrogram Transformer --> the folder ssast_ciab/
is a git submodule pointing to the specific commit of the SSAST repository used to generate the main results of the study.
Docker --> code used to create the Docker image for the experimental environment (also contains the requirements.txt file if a Python virtual environment is preferred).
To make replication of the results easy we have provided a Docker image of the experimental environment. To boot up a Docker container, run:
docker run -it --name <name_for_container> -v <location_of_git_repo>:/workspace/ --gpus=all --ipc=host harrycoppock/ciab:ciab_v4
This will open a new terminal inside the Docker container. Do not worry about having to download the Docker image from the hub; the above command will handle this.
If you are on macOS, please add the flag --platform=linux/amd64.
The open access version of the UK COVID-19 Vocal Audio Dataset has been deposited in a Zenodo repository (https://doi.org/10.5281/zenodo.10043977) and is available under an Open Government Licence (v3.0).
The full UK COVID-19 Vocal Audio Dataset is not publicly available as it is classed as 'Special Category Personal Data'. Access may be requested from UKHSA (DataAccess@ukhsa.gov.uk) and will be granted subject to approval and a data sharing contract. To learn how to apply for UKHSA data, visit: https://www.gov.uk/government/publications/accessing-ukhsa-protected-data/accessing-ukhsa-protected-data
The open access version of the dataset does not contain the 'sentence' modality, which has been removed, leaving the 'cough', 'three cough' and 'exhalation' modalities. In addition, to meet open access requirements, selected attributes of the metadata have been aggregated (to prevent groups of fewer than 3 individuals being singled out by a selection of attributes). This means that the 'sentence' modality results, and the creation of the train-test splits, cannot be replicated with the open access version. Note that this applies only to the open access version of the data; our full stack is replicable with the original dataset, which can be accessed following the instructions above. We provide the train-test splits in .csv form so that the machine learning experiments can be replicated with the open access data.
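As an illustration of using the provided splits with the open access data, the sketch below joins a split .csv onto the open access metadata with pandas. The file names ("participant_metadata.csv", "train_split.csv", "test_split.csv") and the "participant_identifier" column are assumptions for illustration; check the released files for the exact names.

import pandas as pd

# Placeholder file names -- substitute the actual open access metadata and split .csv files
meta = pd.read_csv("participant_metadata.csv")
train_ids = pd.read_csv("train_split.csv")
test_ids = pd.read_csv("test_split.csv")

# Join each split onto the metadata via the participant identifier (column name assumed)
train = meta.merge(train_ids, on="participant_identifier", how="inner")
test = meta.merge(test_ids, on="participant_identifier", how="inner")
print(f"train: {len(train)} rows, test: {len(test)} rows")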
To easily run the code yourself using your own voice recordings (no need to download the data), we have provided a short demo hosted on Google Colab. Please follow this link to have a go yourself!
Warning: preprocessing and training take a considerable amount of time and require access to a V100 GPU or equivalent.
To replicate the SSAST results first the audio files need to be preprocessed:
cd ssast_ciab/src/finetune/ciab/
python prep_ciab.py
Once this is complete then training can begin:
sh run_ciab.sh
For a more detailed description please consult the BNN README.
Warning: please note that the full run is very compute intensive, and was performed on a K4 Tesla GPU/V100 GPU with at least 64 GB of system RAM. There are options to train on sub-samples of the dataset, provided in the appropriate files. The code is configured with the config file in BNNBaseline/lib/config.py.
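For orientation only, a config module of this kind typically collects paths and training options in one place. The sketch below is a hypothetical illustration; its field names are assumptions, not the actual contents of BNNBaseline/lib/config.py.

# Hypothetical sketch of a training config -- consult BNNBaseline/lib/config.py for the real options
class Config:
    data_dir = "/workspace/data"          # location of the preprocessed audio/features (placeholder)
    output_dir = "/workspace/outputs"     # where evaluation results are written (placeholder)
    subsample_fraction = 0.1              # train on a fraction of the dataset to cut compute
    batch_size = 32
    num_epochs = 50

config = Config()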
To replicate BNN results, first cd BNNBaseline/lib
and extract features with:
python extract_feat.py
Once complete, train the model with
python train.py
To evaluate results and save them to the folder specified in BNNBaseline/lib/config.py, run:
python evaluate.py
To run openSMILE feature extraction, first build the openSMILE audio feature extraction package from source by following these instructions. Then run:
python SvmBaseline/opensmile_feat_extraction.py
This will extract openSMILE features for the train and test sets in the s3 bucket and save them in features/opensmile/.
To run SVM classification on the extracted features:
python SvmBaseline/svm.py
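For context, the sketch below shows the general shape of an openSMILE-functionals-plus-SVM pipeline using the opensmile Python package and scikit-learn. It is an illustrative stand-in, not the repo's opensmile_feat_extraction.py/svm.py; the feature set, file paths and labels are assumptions.

import opensmile
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# One functionals feature vector per recording (eGeMAPS chosen here purely for illustration)
smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.eGeMAPSv02,
    feature_level=opensmile.FeatureLevel.Functionals,
)

train_files = ["cough_001.wav", "cough_002.wav"]   # placeholder audio paths
train_labels = [1, 0]                              # placeholder PCR labels
X_train = smile.process_files(train_files)

# Standardise features, then fit a linear SVM
clf = make_pipeline(StandardScaler(), SVC(kernel="linear", probability=True))
clf.fit(X_train, train_labels)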
To run experiments, please fill in the fields in ./dummy_config.yaml.
To replicate the creation of the 3 training sets, 3 validation sets and 5 testing sets, the following commands can be run.
The pipeline for generating splits is as follows:
cd utils
python dataset_stats.py --create_meta=yes
cd ..
cd utils
python dataset_stats.py --create_matched_validation=yes
cd ..
(creates the matched validation set)
There are no unit tests for this code base; however, assert statements feature throughout the codebase to test for expected functionality. There is a set of tests which should be run once the train-test splits are created. These test for overlapping splits, duplicate results and much more.
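As an illustration of the kind of check these tests perform, the overlap and duplicate conditions can be expressed in a few lines of pandas; the file paths and the "participant_identifier" column name are assumptions.

import pandas as pd

# Placeholder paths to the generated split files
train = pd.read_csv("train.csv")
test = pd.read_csv("test.csv")

# No participant should appear in more than one split
assert set(train["participant_identifier"]).isdisjoint(set(test["participant_identifier"]))

# No participant should be duplicated within a split
assert not train["participant_identifier"].duplicated().any()
assert not test["participant_identifier"].duplicated().any()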
This repository details the code used to create the results presented in the following three papers. Please cite.
@article{coppock2024audio,
author = {Coppock, Harry and Nicholson, George and Kiskin, Ivan and Koutra, Vasiliki and Baker, Kieran and Budd, Jobie and Payne, Richard and Karoune, Emma and Hurley, David and Titcomb, Alexander and Egglestone, Sabrina and Cañadas, Ana Tendero and Butler, Lorraine and Jersakova, Radka and Mellor, Jonathon and Patel, Selina and Thornley, Tracey and Diggle, Peter and Richardson, Sylvia and Packham, Josef and Schuller, Björn W. and Pigoli, Davide and Gilmour, Steven and Roberts, Stephen and Holmes, Chris},
title = {Audio-based AI classifiers show no evidence of improved COVID-19 screening over simple symptoms checkers},
journal = {Nature Machine Intelligence},
year = {2024},
doi = {10.1038/s42256-023-00773-8}
}
@article{budd2024,
author={Jobie Budd and Kieran Baker and Emma Karoune and Harry Coppock and Selina Patel and Ana Tendero Cañadas and Alexander Titcomb and Richard Payne and David Hurley and Sabrina Egglestone and Lorraine Butler and George Nicholson and Ivan Kiskin and Vasiliki Koutra and Radka Jersakova and Peter Diggle and Sylvia Richardson and Bjoern Schuller and Steven Gilmour and Davide Pigoli and Stephen Roberts and Josef Packham and Tracey Thornley and Chris Holmes},
title={A large-scale and PCR-referenced vocal audio dataset for COVID-19},
year={2024},
journal={Scientific Data},
doi = {10.1038/s41597-024-03492-w}
}
@article{Pigoli2022,
author={Davide Pigoli and Kieran Baker and Jobie Budd and Lorraine Butler and Harry Coppock
and Sabrina Egglestone and Steven G.\ Gilmour and Chris Holmes and David Hurley and Radka Jersakova and Ivan Kiskin and Vasiliki Koutra and George Nicholson and Joe Packham and Selina Patel and Richard Payne and Stephen J.\ Roberts and Bj\"{o}rn W.\ Schuller and Ana Tendero-Ca\~{n}adas and Tracey Thornley and Alexander Titcomb},
title={Statistical Design and Analysis for Robust Machine Learning: A Case Study from Covid-19},
year={2022},
journal={arXiv},
doi = {10.48550/ARXIV.2212.08571}
}