Nachtzuster / BirdNET-Pi

A realtime acoustic bird classification system for the Raspberry Pi 5, 4B, 3B+, 0W2 and more. Built on the TFLite version of BirdNET.
https://birdnetpi.com

Localization setting change leads to incorrect unique detected species count #54

Open programmdesign opened 5 months ago

programmdesign commented 5 months ago

Describe the bug

If one changes the database language in the localization settings, previously detected species are double counted: once under their common name in the previously selected language and once under their common name in the current one. E.g., if I switch from English to German, I get "Grünfink" and "European Goldfinch" in the stats. Hence, the number of identified species is wrong after the settings are updated.

To Reproduce

Steps to reproduce the behavior:

  1. Go to Tools > Settings > Advanced > Localization
  2. Change to another language and detect birds
  3. Go to Daily stats
  4. See error

Expected behavior

Localization should only impact displayed names, not database stats or number of unique species.
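The double counting can be reproduced outside the UI. The following is a minimal sketch, assuming a SQLite table named `detections` with `Sci_Name` and `Com_Name` columns; the schema and the sample rows are assumptions made for illustration, not taken from this report. Counting distinct common names inflates the total once a species has been logged in two languages, while counting distinct scientific names does not.

```python
import sqlite3

# Assumed schema for illustration: a `detections` table with a scientific
# name and a localized common name per row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE detections (Date TEXT, Sci_Name TEXT, Com_Name TEXT)")
conn.executemany(
    "INSERT INTO detections VALUES (?, ?, ?)",
    [
        # Logged while the language was set to English.
        ("2024-05-01", "Carduelis carduelis", "European Goldfinch"),
        # Same species, logged after switching to German.
        ("2024-05-02", "Carduelis carduelis", "Stieglitz"),
        ("2024-05-02", "Turdus merula", "Amsel"),
    ],
)


def count_distinct(conn, column):
    """Count distinct values in one of the two name columns."""
    if column not in {"Sci_Name", "Com_Name"}:
        raise ValueError(f"unexpected column: {column}")
    query = f"SELECT COUNT(DISTINCT {column}) FROM detections"
    return conn.execute(query).fetchone()[0]


# Keyed on the common name, the goldfinch is counted twice.
species_by_common_name = count_distinct(conn, "Com_Name")
# Keyed on the scientific name, each species is counted once.
species_by_scientific_name = count_distinct(conn, "Sci_Name")
```

With these three sample rows the first count is 3 and the second is 2, which is the discrepancy described above.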

Nachtzuster commented 4 months ago

Hi, yes, that would mess up the stats. It would be nice if only the UI's language changed, though. The way the original authors implemented it, one should not change the language once everything is set up (the localized common name is quite fundamental to the internal data structure).

This should probably be made clearer in the 'Settings'.

arne1921KF commented 4 months ago

This is a fundamental and very common problem of taxonomy - also for scientific names. A database of biological entities (taxa) would need to be implemented to change that behaviour, i.e. usage of a taxonomic backbone.

I doubt I will find the time to fix this, but surely there are already working solutions for it.

I would bet there is a RESTful API for birds somewhere which could provide a current taxonomy based on IDs, and without looking at it I bet you a beverage that @kahst can provide a solution basically out of the box. 😄

The basic idea is that one uses a stable ID instead of a name, and has relations behind that ID to deliver a) consistency in case of name changes and b), more importantly, consistency in case of merges and splits of the taxon behind the ID. (Which happens rather frequently. We even have a 'taxonomy' of taxonomists for that - we call them splitters and lumpers based on their favoured changes in biological taxonomy...)
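To make the stable-ID idea concrete, here is a minimal in-memory sketch. Everything in it is hypothetical: the `Taxon` and `TaxonRegistry` classes, the numeric IDs, and the sample entries are invented for illustration and are not an authoritative taxonomy or an existing API. It shows a name change handled through a synonym and a merge handled through a redirect, so that old records keep resolving.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Taxon:
    taxon_id: int                      # stable key, never reused
    scientific_name: str               # currently accepted name
    common_names: dict[str, str] = field(default_factory=dict)


class TaxonRegistry:
    """Hypothetical taxonomic backbone keyed on stable IDs."""

    def __init__(self):
        self._taxa = {}          # taxon_id -> Taxon
        self._name_index = {}    # any known scientific name -> taxon_id
        self._merged_into = {}   # retired taxon_id -> surviving taxon_id

    def add(self, taxon, synonyms=()):
        self._taxa[taxon.taxon_id] = taxon
        for name in (taxon.scientific_name, *synonyms):
            self._name_index[name] = taxon.taxon_id

    def merge(self, old_id, new_id):
        """Record a 'lump': old_id is retired in favour of new_id."""
        self._merged_into[old_id] = new_id

    def resolve(self, taxon_id):
        """Follow merge redirects until a current taxon is reached."""
        seen = set()
        while taxon_id in self._merged_into:
            if taxon_id in seen:
                raise ValueError("cyclic merge chain")
            seen.add(taxon_id)
            taxon_id = self._merged_into[taxon_id]
        return taxon_id

    def id_for_name(self, scientific_name):
        return self.resolve(self._name_index[scientific_name])

    def display_name(self, taxon_id, lang):
        taxon = self._taxa[self.resolve(taxon_id)]
        # Fall back to the scientific name if no translation is known.
        return taxon.common_names.get(lang, taxon.scientific_name)


# Illustrative sample data only.
registry = TaxonRegistry()
registry.add(
    Taxon(1, "Chloris chloris", {"en": "European Greenfinch", "de": "Grünfink"}),
    synonyms=["Carduelis chloris"],   # older name, same stable ID
)
registry.add(Taxon(2, "Acanthis flammea", {"en": "Redpoll"}))
registry.add(Taxon(3, "Acanthis hornemanni", {"en": "Hoary Redpoll"}))
registry.merge(old_id=3, new_id=2)    # a lump: records stored under 3 now resolve to 2
```

A detection stored with ID 1 keeps counting as one species whether it was looked up as `Carduelis chloris` or `Chloris chloris`, and a detection stored with ID 3 is displayed under the surviving taxon after the merge. Splits are harder, because old records cannot be reassigned automatically, and are not covered by this sketch.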

programmdesign commented 4 months ago

I fixed this manually by updating the detected species in the database. However, IMHO there is an "easy" fix: the database already contains the scientific name of each bird. The front-end can use this name as a key and look up the language-specific name only when displaying it. No additional API, etc. is needed. The database could hold only the scientific name, and all other methods/features/front-ends could operate on it. The mapping from scientific name to language-specific name must already be available in the code base, as otherwise the database table could not store both.
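Both the proposed fix and the manual repair can be sketched in a few lines. This is a sketch under assumptions, not the project's implementation: the `detections` table with `Sci_Name` and `Com_Name` columns is an assumed schema, and the `LABELS` dictionary is a hypothetical stand-in for whatever mapping the code base already uses to fill in the common name.

```python
import sqlite3

# Hypothetical mapping: scientific name -> {language code: common name}.
LABELS = {
    "Carduelis carduelis": {"en": "European Goldfinch", "de": "Stieglitz"},
    "Turdus merula": {"en": "Eurasian Blackbird", "de": "Amsel"},
}


def localized_name(sci_name, lang, labels=LABELS):
    """Translate at display time; fall back to the scientific name."""
    return labels.get(sci_name, {}).get(lang, sci_name)


def species_for_display(conn, lang):
    """Group on the scientific name, translate only for output."""
    rows = conn.execute(
        "SELECT Sci_Name, COUNT(*) FROM detections "
        "GROUP BY Sci_Name ORDER BY Sci_Name"
    ).fetchall()
    return [(localized_name(sci, lang), n) for sci, n in rows]


def normalize_common_names(conn, lang, labels=LABELS):
    """Manual repair: rewrite stored common names to a single language."""
    with conn:  # commit on success, roll back on error
        for sci_name, names in labels.items():
            if lang in names:
                conn.execute(
                    "UPDATE detections SET Com_Name = ? WHERE Sci_Name = ?",
                    (names[lang], sci_name),
                )


# Sample data with the mixed-language state described in this issue.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE detections (Sci_Name TEXT, Com_Name TEXT)")
conn.executemany(
    "INSERT INTO detections VALUES (?, ?)",
    [
        ("Carduelis carduelis", "European Goldfinch"),
        ("Carduelis carduelis", "Stieglitz"),
        ("Turdus merula", "Amsel"),
    ],
)

german_view = species_for_display(conn, "de")
english_view = species_for_display(conn, "en")

normalize_common_names(conn, "de")
distinct_common_names_after_repair = conn.execute(
    "SELECT COUNT(DISTINCT Com_Name) FROM detections"
).fetchone()[0]
```

`species_for_display` never reads `Com_Name`, so switching the language changes only the labels and not the number of rows returned. `normalize_common_names` is the one-off clean-up for a database that already contains names in more than one language; back up the database file before running anything like it.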

Nachtzuster commented 4 months ago

Ouch, I hope that was not too painful.

Yes, using scientific names internally is the correct thing to do for easy internationalization. As this change is unlikely to happen, I've added a warning to the Settings page.

lloydbayley commented 1 month ago

If the warning doesn't work in the future, it's kind of a case of 'bad luck' really, isn't it? The only other option would be to not allow the language to change (disable the dropdown) once the db contains entries, but that is all sorts of complicated really... Suggest we close this one off as Solved/Quick-Fixed/Too-Scary. :)
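For reference, the check behind "disable the dropdown once the db contains entries" could look roughly like the sketch below. The function name `language_change_is_safe` is hypothetical and the `detections` table is an assumed schema; the complicated part mentioned above (wiring the result into the settings page) is not shown.

```python
import sqlite3


def language_change_is_safe(conn):
    """Return True only while no detections have been stored yet."""
    has_rows = conn.execute(
        "SELECT EXISTS (SELECT 1 FROM detections)"
    ).fetchone()[0]
    return has_rows == 0


# Demonstration against an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE detections (Sci_Name TEXT, Com_Name TEXT)")
safe_when_empty = language_change_is_safe(conn)

conn.execute("INSERT INTO detections VALUES ('Turdus merula', 'Amsel')")
safe_after_first_detection = language_change_is_safe(conn)
```

A settings page could use the result to grey out the language selector, or to show the existing warning only when it is relevant.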