sammlapp opened 2 years ago
For inference, `model.predict(audio, ...)`
currently outputs scores for each class for audio files or segments of an audio file. People in acoustics often use 'selection tables', in the format used by Raven Sound Analysis Software, to store annotations when they are annotating an audio file. This is basically a tab-separated text file that stores the annotation information (the boxes users draw around sounds) for an audio file, e.g.:
```
Selection	View	Channel	Begin Time (s)	End Time (s)	Low Freq (Hz)	High Freq (Hz)	Species	Notes
1	Spectrogram 1	1	0.459636349	2.298181746	4029.8	17006.4	GWWA_song	
2	Spectrogram 1	1	6.705283212	8.246416853	4156.6	17031.7	GWWA_song	
```
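Since the format is just tab-separated text, pandas can already read it directly; a minimal sketch (the sample table is inlined here so the snippet is self-contained, but normally you would pass the `.Table.1.selections.txt` file path):

```python
import io

import pandas as pd

# A Raven selection table is tab-separated text with one row per annotation box.
# Inline copy of the example above; in practice, read the exported .txt file.
raven_txt = (
    "Selection\tView\tChannel\tBegin Time (s)\tEnd Time (s)\t"
    "Low Freq (Hz)\tHigh Freq (Hz)\tSpecies\n"
    "1\tSpectrogram 1\t1\t0.459636349\t2.298181746\t4029.8\t17006.4\tGWWA_song\n"
    "2\tSpectrogram 1\t1\t6.705283212\t8.246416853\t4156.6\t17031.7\tGWWA_song\n"
)
selections = pd.read_csv(io.StringIO(raven_txt), sep="\t")
print(selections[["Begin Time (s)", "End Time (s)", "Species"]])
```

Writing a table back out is the symmetric `selections.to_csv(path, sep="\t", index=False)`, so the export side is mostly about deciding what goes in each row rather than file I/O.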
BirdNET can output selection tables.
Junshang has noted that there are many choices to make in how to draw the annotation boxes.
We have ongoing discussions about creating standardized annotation formats. We could enable Raven-format exports for now, but this will be an ongoing conversation.
e.g. something like `model.annotate(audio, score_threshold, max_annotation_length)` or `annotations.clip_df_to_raven(clip_df_predictions)`
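To make the design choices concrete, here is a hedged sketch of what a `clip_df_to_raven`-style converter could look like. Everything here is hypothetical: the column names (`start_time`, `end_time`, `species`, `score`), the merge rule (join contiguous above-threshold clips of the same species into one box), and the meaning of `max_annotation_length` (cap a box's duration) are all assumptions, not the library's actual API.

```python
import pandas as pd


def clip_df_to_raven(clip_df, score_threshold=0.5, max_annotation_length=None):
    """Hypothetical sketch: convert per-clip scores into Raven selection rows.

    clip_df: DataFrame with columns start_time, end_time, species, score
    (assumed names). Contiguous above-threshold clips of the same species
    are merged into one box; boxes are optionally capped in duration.
    """
    rows = []
    kept = clip_df[clip_df.score >= score_threshold]
    for species, grp in kept.groupby("species"):
        grp = grp.sort_values("start_time")
        begin = end = None
        for _, clip in grp.iterrows():
            if begin is None:
                begin, end = clip.start_time, clip.end_time
            elif clip.start_time <= end:  # contiguous or overlapping: extend box
                end = max(end, clip.end_time)
            else:  # gap: close the current box and start a new one
                rows.append((begin, end, species))
                begin, end = clip.start_time, clip.end_time
        if begin is not None:
            rows.append((begin, end, species))

    df = pd.DataFrame(rows, columns=["Begin Time (s)", "End Time (s)", "Species"])
    if max_annotation_length is not None:
        # Truncate boxes that exceed the maximum allowed duration
        capped = df["Begin Time (s)"] + max_annotation_length
        df["End Time (s)"] = df["End Time (s)"].where(
            df["End Time (s)"] <= capped, capped
        )
    df.insert(0, "Selection", range(1, len(df) + 1))
    df.insert(1, "View", "Spectrogram 1")
    df.insert(2, "Channel", 1)
    return df


# Example: three 5 s clips; the first two are above threshold and contiguous,
# so they merge into one box; the third is below threshold and dropped.
clips = pd.DataFrame(
    {
        "start_time": [0.0, 5.0, 20.0],
        "end_time": [5.0, 10.0, 25.0],
        "species": ["GWWA_song", "GWWA_song", "GWWA_song"],
        "score": [0.9, 0.8, 0.2],
    }
)
table = clip_df_to_raven(clips, score_threshold=0.5)
print(table)
```

Even this toy version surfaces the choices Junshang raised: whether to merge adjacent detections, how to set frequency bounds (omitted here, since clip scores carry no frequency information), and whether to cap or split long boxes.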
P.S. Should there be a class `AnnotatedAudio` that contains both the `Audio` and the `BoxedAnnotations`?