QutEcoacoustics / baw-server

The acoustic workbench server for storing and managing ecoacoustic data. Manages the structure and audio data. Provides an API for client access.
Apache License 2.0

Feature: Customizable annotation download #311

Open atruskie opened 7 years ago

atruskie commented 7 years ago

Our annotation download functionality, while critical, is often a source of much frustration to users.

The frustration stems from:

I reason that because we are trying to create a data export format that suits multiple stakeholders (programmers, data scientists, general scientists, novice users) we end up with a selection of columns that aren't great for any of them.

Proposal

Allow customization of the attributes/columns included in the export.

Details

For the annotation download page, provide a subsection that allows a user to customise the output. I'm thinking a series of checkboxes with different options like:

Persistence

The choices made by the user can be persisted in their user profile with a button like "save my choices", and then reset with a "reset to default choices" button.
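A rough sketch of how that save/reset behaviour could work, written in Python purely for illustration. Everything named here is an assumption and not part of the existing server: the `UserProfile` class, its `preferences` dictionary, the `annotation_download_columns` key, and the default column names are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical default selection; the real column list is not settled in this issue.
DEFAULT_COLUMNS = ("audio_event_id", "start_time_seconds", "end_time_seconds")

# Hypothetical key under which the choices are stored in the profile.
PREFERENCE_KEY = "annotation_download_columns"


@dataclass
class UserProfile:
    """Stand-in for a user profile; `preferences` is an assumed free-form store."""
    preferences: dict = field(default_factory=dict)


def save_choices(profile, columns):
    """'Save my choices': persist the selected columns in the user's profile."""
    profile.preferences[PREFERENCE_KEY] = list(columns)


def reset_choices(profile):
    """'Reset to default choices': drop the saved selection, if any."""
    profile.preferences.pop(PREFERENCE_KEY, None)


def current_choices(profile):
    """Return the saved selection, falling back to the defaults."""
    return list(profile.preferences.get(PREFERENCE_KEY, DEFAULT_COLUMNS))
```

Storing nothing until the user explicitly saves means that a later change to the defaults would reach everyone who never customised their download.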

User stories

From @karlinaInd:

Currently, when downloading annotations of a single site, in a single project, the website directly downloads all annotations. At times, only the annotations from a single day, or a series of days, are required. Downloading the whole annotation set (especially at sites which may have thousands of annotations) is therefore inefficient.

It could be beneficial to add a feature for grouping the annotations of a single site, so that the user can download only the annotations made within a given time period, such as a single hour or a single day, or maybe even only the annotations that are longer than 10 seconds, etc. This might make the analysis easier for a user.
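The kind of filtering described in this user story could look roughly like the sketch below. It is in Python for illustration only; the `Annotation` class, its fields, and the `filter_annotations` function are hypothetical and do not reflect the server's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Annotation:
    """Hypothetical minimal annotation: absolute start time and duration."""
    start: datetime
    duration_seconds: float


def filter_annotations(annotations, start=None, end=None, min_duration_seconds=None):
    """Keep annotations starting in [start, end) and strictly longer than
    min_duration_seconds. Any filter left as None is not applied."""
    selected = []
    for annotation in annotations:
        if start is not None and annotation.start < start:
            continue
        if end is not None and annotation.start >= end:
            continue
        if (min_duration_seconds is not None
                and annotation.duration_seconds <= min_duration_seconds):
            continue
        selected.append(annotation)
    return selected
```

For example, passing `start=datetime(2017, 1, 1)`, `end=datetime(2017, 1, 2)` and `min_duration_seconds=10` would keep only annotations that begin on that day and last longer than 10 seconds.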

tsheringde commented 7 years ago

Do we need to keep audio_recording_uuid? Also, could the other tags column be broken into sub-categories under non-biophonic sounds? Just a suggestion, as it might complicate things further. I also agree that there are too many date and time columns (split and combined versions), but I think that is due to an Excel limitation (as explained by Ant).

atruskie commented 7 years ago

The audio_recording_uuid is important because it is the identifier for the files on disk. Anyone doing anything advanced (e.g. programmers and data scientists) could use this column to do a wide array of things, for example downloading the associated false colour images, or pre-generated indices for a recording (or a segment of a recording) - that would be something I think you in particular @tsheringde would be interested in.

Maybe we could make the audio_recording_uuid unchecked by default, with the possibility to include it if a user wants it.
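A minimal sketch of that idea, in Python for illustration only: each column carries a default on/off flag, and the export writes only the selected columns. Apart from audio_recording_uuid, the column names are hypothetical placeholders, and `AVAILABLE_COLUMNS` and `export_csv` are assumed names, not existing server code.

```python
import csv
import io

# Hypothetical column catalogue: column name -> included by default?
# Only audio_recording_uuid comes from this discussion; the rest are placeholders.
AVAILABLE_COLUMNS = {
    "audio_event_id": True,
    "start_time_seconds": True,
    "end_time_seconds": True,
    "audio_recording_uuid": False,  # unchecked by default, opt-in
}


def export_csv(rows, selected=None):
    """Write rows (dicts) as CSV containing only the selected columns.

    With selected=None, the columns flagged as default are used.
    """
    if selected is None:
        columns = [name for name, default in AVAILABLE_COLUMNS.items() if default]
    else:
        unknown = [name for name in selected if name not in AVAILABLE_COLUMNS]
        if unknown:
            raise ValueError(f"Unknown columns: {unknown}")
        columns = list(selected)

    buffer = io.StringIO()
    # extrasaction="ignore" drops any row keys that were not selected.
    writer = csv.DictWriter(buffer, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `export_csv(rows)` would leave the UUID out, while `export_csv(rows, ["audio_event_id", "audio_recording_uuid"])` would include it for users who tick that box.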