The HALLO (Humans and ALgorithms Listening for Orcas) annotation tool is a web application for analyzing and annotating audio files. It consists of three main modules: front-end (user interface), back-end (data processing), and database. It is licensed under the GNU GPLv3 license and is therefore freely available for anyone to use and modify (including for commercial purposes) as long as the same license is kept. The tool was designed to facilitate the interaction between machine learning developers and expert bioacousticians working to create automated detectors and classifiers for orcas (Orcinus orca). However, we believe it to be flexible enough to be used or modified for many other projects.
The main functions are:
Managing raw audio files, segmenting and compressing them in the back-end and generating the corresponding spectrograms.
Users can register in one of two roles: Model Developer or Annotator. Model Developers can create batches of audio segments and assign them to one or more Annotators, or import machine-generated annotations and assign them to Annotators. Annotators can work on the spectrograms of the audio segments, review the corresponding annotations, create new annotations, and listen to the audio.
The generated annotations can be exported to CSV format for further analysis or use in machine learning development.
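As a minimal sketch of the "further analysis" step, the exported CSV can be summarized with Python's standard library. The column layout of the export is not described here, so the column name passed to the function (for example `label`) and the file name `annotations.csv` are assumptions that must be adapted to the actual export.

```python
import csv
from collections import Counter


def count_by_column(csv_path, column):
    """Count how many exported annotation rows share each value in `column`."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # fieldnames is read from the header row; it is None for an empty file.
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise KeyError(f"column {column!r} not found in {csv_path}")
        return Counter(row[column] for row in reader)
```

For example, `count_by_column("annotations.csv", "label")` would return the number of annotations per label, provided the export contains a column with that (hypothetical) name.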
The HALLO annotation tool uses a modular design, with two separate applications for the front-end and the back-end that exchange data through a RESTful API.
The back-end is implemented with the Django framework and uses Ketos as its core audio-processing component. It exposes a set of standard APIs for the front-end to consume.
The front-end is built with the React framework and runs in most modern browsers. Its interface is designed to be clean and easy to use, with a focus on synchronizing, filtering, and quickly finding data.
The HALLO annotation tool is composed of two parts: front-end and back-end. The back-end needs to be installed on a server that can read data directly, and the front-end can be installed on any server.
Two installation methods are described below: a local installation (for use on a single computer, for example) and an installation on a remote server that multiple users can access. Both methods use Docker.
Users can also adapt the code to their needs and deploy it according to their own installation environment.
For a local installation, the HALLO annotation tool runs in several Docker containers and requires Docker and docker-compose to be installed on the system beforehand.
The HALLO annotation tool needs to be installed next to the audio files, and the audio files need to be contained in a folder called audio.
Clone this repo and place it in the same folder that contains the audio folder.
A typical directory structure is:
Example_Project_folder
├─ audio
└─ hallo_annotation
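Before starting the containers, it can help to confirm that the layout matches the structure above. The following is a small sketch using only the Python standard library; the function name `check_layout` is hypothetical and not part of the HALLO code base.

```python
from pathlib import Path


def check_layout(project_dir, repo_name="hallo_annotation"):
    """Return a list of problems with the expected folder layout (empty if fine)."""
    project = Path(project_dir)
    problems = []
    # The audio files must live in a folder named exactly "audio".
    if not (project / "audio").is_dir():
        problems.append("missing 'audio' folder")
    # The cloned repository must sit next to the audio folder.
    if not (project / repo_name).is_dir():
        problems.append(f"missing '{repo_name}' folder")
    return problems
```

Calling `check_layout("Example_Project_folder")` returns an empty list when both folders are present.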
In the code base, the file docker-compose.dev.yml provides a basic example. To use this file, you need to create an env file (e.g. .env.dev) that sets some environment parameters.
An example of environment parameters:
# Database
POSTGRES_DB=name_of_the_database
POSTGRES_USER=username
POSTGRES_PASSWORD=password
# PGadmin
PGADMIN_DEFAULT_EMAIL=login_email_address
PGADMIN_DEFAULT_PASSWORD=password
# Django key
DJANGO_SECRET_KEY=The_django_secret_key
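A missing or empty entry in the env file is an easy mistake to make. The sketch below checks a file against the six parameters listed above and shows one way to generate a random value for DJANGO_SECRET_KEY. The parser is deliberately simplified (plain KEY=value lines, no quoting rules), and the helper names are illustrative only.

```python
import secrets

REQUIRED_KEYS = [
    "POSTGRES_DB",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "PGADMIN_DEFAULT_EMAIL",
    "PGADMIN_DEFAULT_PASSWORD",
    "DJANGO_SECRET_KEY",
]


def parse_env_file(path):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values


def missing_keys(path):
    """Return the required keys that are absent or empty in the env file."""
    values = parse_env_file(path)
    return [key for key in REQUIRED_KEYS if not values.get(key)]


def new_secret_key():
    """Return a long random string that is safe to paste into an env file."""
    return secrets.token_urlsafe(50)
```

`missing_keys(".env.dev")` returns an empty list when every parameter has a value. `new_secret_key()` uses only letters, digits, `-` and `_`, so the result needs no quoting.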
From the hallo_annotation folder, run the Docker command below. If you are using Linux, you might need to prefix it with sudo.
docker-compose -f docker-compose.dev.yml --env-file .env.dev up -d
If all goes well, you can use docker ps to check the containers' status.
hallo_annotation % docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
55052170d523 hallo_annotation_frontend "docker-entrypoint.s…" 34 minutes ago Up 33 minutes 0.0.0.0:3000->3000/tcp hallo_frontend
7337d62ba365 hallo_annotation_backend "bash -c 'python man…" 17 hours ago Up 33 minutes 0.0.0.0:8000->8000/tcp hallo_backend
416cfb0e2dd0 dpage/pgadmin4 "/entrypoint.sh" 17 hours ago Up 33 minutes 443/tcp, 0.0.0.0:5050->80/tcp hallo_pgadmin
ec8b3350a851 postgres:13 "docker-entrypoint.s…" 17 hours ago Up 33 minutes 0.0.0.0:5432->5432/tcp hallo_postgres_db
A superuser needs to be created in the back-end to set up the groups. From the same folder, use this command to open a shell in the backend container.
docker-compose -f docker-compose.dev.yml --env-file .env.dev run backend bash
Then create the superuser for the Django Admin:
python manage.py createsuperuser
The HALLO annotation tool uses Django's Admin panel to manage user permissions. On first use, you need to log in, create the groups, and assign permissions to each group. The Django admin panel is at: http://localhost:8000/admin/, and you can log in with the superuser account you just created.
After logging in, create three user groups on the Groups page (the names are case sensitive).
1. Admin
2. Model Developer
3. Annotator
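The groups can also be created from the Django shell instead of the admin pages. This is a sketch that relies only on Django's built-in `Group` model and has not been tested against the HALLO code base. It is meant to be pasted into `python manage.py shell` inside the backend container, and the function name `create_groups` is illustrative.

```python
GROUP_NAMES = ["Admin", "Model Developer", "Annotator"]


def create_groups(names=GROUP_NAMES):
    """Create any of the named groups that do not exist yet; return the new ones."""
    # Imported inside the function so the snippet can be loaded outside Django.
    from django.contrib.auth.models import Group

    created = []
    for name in names:
        # get_or_create is idempotent, so running this twice is harmless.
        _, was_created = Group.objects.get_or_create(name=name)
        if was_created:
            created.append(name)
    return created
```

Running `create_groups()` creates the groups without permissions, so any permissions for the Admin group still have to be assigned in the admin interface.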
After new users have completed registration, they need to be added manually to a group here. The permissions set here only affect what a user can do in the admin interface; they do not affect permissions within the HALLO annotation software. The group is only used to differentiate user roles: once users have been added to the appropriate group, they are shown the matching user interface after logging in to the software.
Permissions for different user groups:
Admin: Has permission to manually create users in the admin interface.
You also need to allow an admin user to log in to the admin interface by setting is staff for that user.
Model Developer and Annotator: Leave the permissions blank, as these groups have no access to the admin interface.
The user interface will be available at: http://localhost:3000/
The HALLO annotation tool is based on a modular design and can be flexibly deployed on cloud servers. Depending on the specific network topology, several deployment options are possible. Each service can be deployed on a separate server, or all services can be deployed together on a single server behind an NGINX web server. Below is a brief description of two deployment methods.
In this scenario, the application is deployed on a server with a public IP, and users can directly access the address or domain name of this server to use the service.
If the service needs to be deployed in a network with an ingress controller, and is accessed through either a sub-domain or a domain subdirectory, some configurations in the code need to be adjusted.
In the configuration of the front-end service, you need to add the corresponding homepage configuration in the package.json file.
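The homepage entry can be edited by hand. For scripted deployments, a sketch like the following sets it with Python's `json` module. The path `/hallo` used in the usage note is a hypothetical subdirectory, and `set_homepage` is an illustrative helper, not part of the code base.

```python
import json


def set_homepage(package_json_path, homepage):
    """Set the top-level "homepage" field of a package.json file in place."""
    with open(package_json_path, encoding="utf-8") as handle:
        package = json.load(handle)
    package["homepage"] = homepage
    with open(package_json_path, "w", encoding="utf-8") as handle:
        # Two-space indentation and a trailing newline match common npm output.
        json.dump(package, handle, indent=2)
        handle.write("\n")
```

For example, `set_homepage("package.json", "/hallo")` leaves all other fields untouched and adds or overwrites `"homepage": "/hallo"`.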
In the Django settings of the back-end service, you need to adjust the corresponding MEDIA_URL, STATIC_URL and LOGIN_REDIRECT_URL configurations.
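As an illustration, the three settings could be derived from a single base path when the tool is served from a domain subdirectory. `BASE_PATH` and the value `/hallo/` are assumptions made for this sketch. The values actually used in the code base may differ, so treat this as a pattern rather than the project's configuration.

```python
# Fragment of a Django settings module (sketch only).

# Hypothetical subdirectory under which the ingress controller exposes the tool.
BASE_PATH = "/hallo/"

# Django expects STATIC_URL and MEDIA_URL to end with a slash.
STATIC_URL = BASE_PATH + "static/"
MEDIA_URL = BASE_PATH + "media/"

# Where users are sent after logging in; assumed here to be the base path.
LOGIN_REDIRECT_URL = BASE_PATH
```

Keeping the prefix in one variable means that only one line has to change when the subdirectory changes.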
The basic setup for this deployment is already done in the code base, and a minimal docker-compose.sfu.yml file is provided to demonstrate how to run this configuration.
If you need to remove the software completely, or want to reinstall a fresh version, consider the following steps:
Stop the docker containers
docker-compose -f docker-compose.dev.yml --env-file .env.dev down
List the images that were used by HALLO
$ docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
hallo-annotation_frontend latest 900f03c5b6d1 About an hour ago 2.03GB
hallo-annotation_backend latest 5b9eb01ed370 About an hour ago 3.26GB
postgres 13 b67cf799bada 12 days ago 373MB
dpage/pgadmin4 latest 40a516ee7dea 3 weeks ago 341MB
Delete the images by passing their image IDs to the command docker image rm, for example:
docker image rm 900f03c5b6d1 5b9eb01ed370
Remove the volumes (note that this will clear the database):
docker volume rm hallo-annotation_db_data hallo-annotation_pgadmin_data
Use system prune to clean up the system (optional):
docker system prune