zooniverse / aggregation-for-caesar

Apache License 2.0

Adding the ability to specify classes to filter user-skill calculation #796

Closed AgentM-GEG closed 3 weeks ago

AgentM-GEG commented 1 month ago

Context: The user-skill calculation currently considers ALL detected classes within a task: either the mean skill, or the skill for every class, must be above a certain skill_threshold. For a task with a large number of classes OR an imbalanced dataset, this means the user has to see at least N images per class before they can even be considered for leveling up.

Motivation: Research teams should be able to specify the classes on which the leveling-up decision is based.

This PR:

An example Caesar config looks like this: .../reducers/user_skill_reducer?mode='one-to-one'&count_threshold=5&focus_classes=['1', '2']&strategy='all'&skill_threshold=0.2
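To illustrate the intended effect of focus_classes, here is a minimal sketch of a leveling-up check. It is not the reducer's actual code: the function name `should_level_up`, the `skills` and `counts` dictionaries, and the use of `>=` for both thresholds are assumptions made for illustration. Only the parameter names (count_threshold, skill_threshold, strategy, focus_classes) come from the config above.

```python
from statistics import mean


def should_level_up(skills, counts, skill_threshold, count_threshold,
                    strategy="mean", focus_classes=None):
    """Hypothetical leveling-up check (illustrative, not the reducer's code).

    skills: dict mapping class label -> per-class skill (assumed shape)
    counts: dict mapping class label -> number of subjects seen (assumed shape)
    """
    # Without focus_classes, every detected class takes part in the decision.
    classes = list(skills) if focus_classes is None else list(focus_classes)
    if not classes:
        return False

    # Assumption: the user must have seen at least count_threshold
    # subjects in each class under consideration.
    if any(counts.get(c, 0) < count_threshold for c in classes):
        return False

    values = [skills.get(c, 0.0) for c in classes]
    if strategy == "all":
        # Every considered class must reach the threshold.
        return all(v >= skill_threshold for v in values)
    if strategy == "mean":
        # The average over the considered classes must reach the threshold.
        return mean(values) >= skill_threshold
    raise ValueError(f"unknown strategy: {strategy!r}")


skills = {"1": 0.5, "2": 0.4, "3": 0.0}
counts = {"1": 6, "2": 5, "3": 1}

# Class '3' is rare, so it blocks the decision when all classes are used.
blocked = should_level_up(skills, counts, 0.2, 5, strategy="all")
# Restricting the decision to classes '1' and '2' removes that blocker.
allowed = should_level_up(skills, counts, 0.2, 5, strategy="all",
                          focus_classes=["1", "2"])
```

In this sketch the rare class '3' prevents leveling up when every class is considered, while limiting the check to focus_classes=['1', '2'] lets the same user qualify.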

AgentM-GEG commented 1 month ago

tagging @ramanakumars as well for visibility and crosschecking.

lcjohnso commented 1 month ago

Hi @CKrawczyk -- Would you mind reviewing this PR?

AgentM-GEG commented 3 weeks ago

@CKrawczyk I added a test for the focus_classes behavior and pushed those changes. I also made a small change to the reducer_wrapper code so that the focus_classes argument is parsed appropriately. Let me know how these changes look.
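For readers unfamiliar with how a list-valued query parameter such as focus_classes=['1', '2'] might be turned into a Python list, the sketch below shows one possible approach using `ast.literal_eval` from the standard library. The helper `parse_focus_classes` is hypothetical; it is not the reducer_wrapper code from this PR and only illustrates the general idea.

```python
import ast


def parse_focus_classes(raw):
    """Hypothetical helper: turn a query-string value into a list of labels.

    Returns None when the argument is absent, which a caller could
    treat as "use all classes".
    """
    if raw is None:
        return None
    if isinstance(raw, (list, tuple)):
        # Already a sequence: normalize the labels to strings.
        return [str(c) for c in raw]
    # literal_eval only accepts Python literals, so it is safer than eval().
    value = ast.literal_eval(raw)
    if isinstance(value, (list, tuple, set)):
        return [str(c) for c in value]
    # A single literal such as '1' or 1 becomes a one-element list.
    return [str(value)]


parsed = parse_focus_classes("['1', '2']")
```

Normalizing labels to strings is a design choice of this sketch, made so that `[1, 2]` and `['1', '2']` are treated the same way.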

AgentM-GEG commented 3 weeks ago

@CKrawczyk @lcjohnso, thank you! I am happy for it to be merged whenever works for either or both of you.