kahst / BirdNET-Analyzer

BirdNET analyzer for scientific audio data processing.

Mismatched annotation tables caused by recordings without annotations #452

Open blegottesman opened 1 month ago

blegottesman commented 1 month ago

Describe the bug

I am training a custom model with two classes: HAAM (Hawaiian 'Amakihi) and noise. To generate my training, validation, and testing data, I generated three random subsets of 2-minute files, one for each dataset. After training the model, I ran it on my validation dataset, which was composed of 42 files.

In BirdNET, I chose to "Combine selection tables". The resulting BirdNET output selection table found 46 detections from 13 of the 42 files.

When I loaded in the input files (Select input directory (recursive)), I noticed that the file order matched the order of the files that I manually annotated in Raven.

However, when I ran the model and watched the list of files at the bottom of the program, I noticed that the files were not analyzed in the same order, and this led to the selection tables not aligning. The Start Times and End Times did not match up because the file order was different. I think this is because I am running the model on 7 CPU threads. When I switched to running the files one at a time, the order was corrected.

So, is there a way to keep the correct file order even when running in parallel? I think this would be very helpful for larger batch analyses.

Thank you

max-mauermann commented 2 weeks ago

I created a pull request with a fix that ensures the combined selection table has the same order as the input files.

If you have the option of running from the command line, you might be able to work around this until the fix is released, as the analyze script does not mess up the ordering of the result files.
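
For readers wondering how input order can be kept with several workers, here is a minimal Python sketch of the general idea. It is not the code from the pull request, and `analyze_file` and `analyze_in_input_order` are hypothetical names used only for illustration. Each result is stored at the index of its input file, so the combined output follows the input order no matter which worker finishes first.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def analyze_file(path):
    # Hypothetical placeholder for the per-file analysis step.
    return {"file": path, "detections": []}


def analyze_in_input_order(paths, analyze=analyze_file, workers=4):
    # One slot per input file, filled in as workers finish.
    results = [None] * len(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Remember which input index each submitted job belongs to.
        future_to_index = {
            pool.submit(analyze, path): index for index, path in enumerate(paths)
        }
        # Jobs complete in arbitrary order; the stored index restores input order.
        for future in as_completed(future_to_index):
            results[future_to_index[future]] = future.result()
    return results
```

Combining the per-file tables by iterating over `results` then gives cumulative start and end times that line up with a manual annotation table built in the same file order.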