The BirdNet and Perch implementations have been updated:
Both model implementations can now handle missing labels. If BirdNet or Perch does not know a label, the corresponding logit is set to a negative value, which maps to a sigmoid probability close to 0.
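The missing-label handling can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the function name `align_logits`, the label lists, and the fill value of `-20.0` are assumptions (the text only says the fill value is negative).

```python
import numpy as np

# Assumed fill value; the source only states that it is negative.
MISSING_LOGIT = -20.0


def align_logits(model_logits, model_labels, target_labels, fill=MISSING_LOGIT):
    """Map logits from the model's label space onto the target label space.

    model_logits: array of shape (batch, len(model_labels)).
    Labels the model does not know keep the negative fill value.
    """
    index = {label: i for i, label in enumerate(model_labels)}
    out = np.full((model_logits.shape[0], len(target_labels)), fill, dtype=np.float64)
    for j, label in enumerate(target_labels):
        i = index.get(label)
        if i is not None:
            out[:, j] = model_logits[:, i]
    return out


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))
```

With this sketch, a label absent from `model_labels` ends up with a probability that is effectively zero after the sigmoid, so it never counts as a positive prediction.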
Since BirdNet can only process 3-second audio chunks (at a 48 kHz sample rate), we split each 5-second audio chunk into two 3-second chunks with a 1-second overlap, process both 3-second chunks independently with BirdNet, and take the element-wise maximum of the two sets of logits as the BirdNet prediction for the 5-second audio.
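A rough sketch of this windowing, assuming a mono waveform stored as a NumPy array. The names `split_into_windows`, `predict_5s`, and `model_fn` are hypothetical; `model_fn` stands in for whatever callable returns BirdNet logits for a 3-second chunk.

```python
import numpy as np

SAMPLE_RATE = 48_000  # BirdNet input sample rate, per the text
WINDOW_S = 3          # BirdNet chunk length in seconds
CLIP_S = 5            # our chunk length in seconds


def split_into_windows(audio, sample_rate=SAMPLE_RATE):
    """Split a 5-second clip into two 3-second windows with 1 second of overlap."""
    window = WINDOW_S * sample_rate
    clip = CLIP_S * sample_rate
    if audio.shape[-1] != clip:
        raise ValueError(f"expected {clip} samples, got {audio.shape[-1]}")
    first = audio[..., :window]          # seconds 0 to 3
    second = audio[..., clip - window:]  # seconds 2 to 5
    return first, second


def predict_5s(audio, model_fn):
    """Run model_fn on both windows and combine logits by element-wise maximum."""
    first, second = split_into_windows(audio)
    return np.maximum(model_fn(first), model_fn(second))
```

Taking the maximum means a species counts as detected for the 5-second clip if either window detects it, at the cost of slightly favoring positive predictions.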