Imageomics / pybioclip

Python package that simplifies using the BioCLIP foundation model.
MIT License

Allow providing custom bins for summing scores #29

Open hlapp opened 1 month ago

hlapp commented 1 month ago

The --rank option allows summing scores over a particular rank. In essence, the taxonomic groups at that rank serve as bins for scores, and scores are then summed over all predictions that fall into the bin.
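To make the binning idea concrete, here is a minimal sketch of summing scores over a rank. It does not use pybioclip itself; the prediction records, their field names, and the `sum_scores_by` helper are all hypothetical and only illustrate the grouping step.

```python
from collections import defaultdict

# Hypothetical species-level predictions; field names and scores are
# illustrative only and not taken from pybioclip output.
predictions = [
    {"genus": "Orthomiella", "species": "Orthomiella rantaizana", "score": 0.5},
    {"genus": "Orthomiella", "species": "Orthomiella sinensis", "score": 0.25},
    {"genus": "Celastrina", "species": "Celastrina argiolus", "score": 0.25},
]

def sum_scores_by(predictions, key):
    """Sum the scores of all predictions that share the same value for `key`."""
    totals = defaultdict(float)
    for prediction in predictions:
        totals[prediction[key]] += prediction["score"]
    return dict(totals)

# Each genus acts as a bin for the species-level scores.
genus_scores = sum_scores_by(predictions, "genus")
```

Here the two Orthomiella species fall into the same bin, so their scores are added together.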

It would be useful to allow providing custom 'bins' for this purpose, especially when supplying a custom list of classes to predict between. Custom classes cannot be used in conjunction with --rank, because we can't really know how to bin them by a given rank.

One way to provide a custom binning when using custom classes would be a simple mapping file (custom class to custom bin).

johnbradley commented 3 weeks ago

Ideas: a CSV file where the first column is the item the user wants to bin and the second column is the name of the bin for that item.

Species,bin
Orthomiella rantaizana,bin1
Orthomiella sinensis,bin2
Genus,binName
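
A rough sketch of how a two-column mapping file like this could be applied to per-class scores. The helper names `load_bins` and `sum_scores_by_bin` are hypothetical, the scores are made up, and keeping unmapped classes under their own name is an assumption rather than anything decided in this issue.

```python
import csv
import io
from collections import defaultdict

def load_bins(csv_file):
    """Read a two-column CSV (header row, then class,bin) into a dict."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row, e.g. "Species,bin"
    return {row[0].strip(): row[1].strip() for row in reader if len(row) >= 2}

def sum_scores_by_bin(scores, class_to_bin):
    """Sum per-class scores into the bins named by the mapping."""
    totals = defaultdict(float)
    for cls, score in scores.items():
        # Assumption: a class without a mapping entry is its own bin.
        totals[class_to_bin.get(cls, cls)] += score
    return dict(totals)

# In-memory stand-in for the mapping file shown above.
mapping_csv = io.StringIO(
    "Species,bin\n"
    "Orthomiella rantaizana,bin1\n"
    "Orthomiella sinensis,bin2\n"
)
class_to_bin = load_bins(mapping_csv)

# Made-up scores for three custom classes; the third has no bin assigned.
scores = {
    "Orthomiella rantaizana": 0.5,
    "Orthomiella sinensis": 0.25,
    "Orthomiella pontis": 0.25,
}
binned = sum_scores_by_bin(scores, class_to_bin)
```

Skipping the header row rather than interpreting it keeps the sketch independent of whether the first column is labeled Species, Genus, or something else.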