zooniverse / designator

Smart task assignment system

Training subject ratios don't update as a user actively classifies subjects #85

Closed camallen closed 5 years ago

camallen commented 5 years ago

Counts of a user's seen_ids are cached when the user first requests subject data: https://github.com/zooniverse/designator/blob/a2115b936012918b00ebd98e5ec0254051918c1e/lib/designator/user_cache.ex#L50

These static counts are then read from the user cache to decide whether a training subject should be sent to the user, and they don't change until the cached record is purged (TTL 5 mins): https://github.com/zooniverse/designator/blob/05d2f7ace989144f35e8a68c05c1418471768147/lib/designator/selection.ex#L67. Each selector stream uses them to determine training ratios, e.g. https://github.com/zooniverse/designator/blob/df6f2f89eaa815e2b4667798496a5f412b4eebcc/lib/designator/streams/training.ex#L6
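To make the staleness concrete, here is a minimal Python sketch of the behaviour described above. It is not Designator's code (which is Elixir); `UserCache`, `CachedUser` and the injected `clock` are hypothetical names used only for illustration. The seen count is loaded once and then reused unchanged until the TTL expires.

```python
import time
from dataclasses import dataclass


@dataclass
class CachedUser:
    """Hypothetical stand-in for a cached user record."""
    seen_count: int   # snapshot taken when the record was loaded
    loaded_at: float  # clock reading at load time


class UserCache:
    """Toy TTL cache: the count is read once and reused until expiry."""

    def __init__(self, load_seen_count, ttl_seconds=300.0, clock=time.monotonic):
        self._load = load_seen_count  # callable: user_id -> current seen count
        self._ttl = ttl_seconds
        self._clock = clock
        self._records = {}

    def seen_count(self, user_id):
        now = self._clock()
        record = self._records.get(user_id)
        if record is None or now - record.loaded_at >= self._ttl:
            # Cache miss or expired record: reload from the source of truth.
            record = CachedUser(self._load(user_id), now)
            self._records[user_id] = record
        return record.seen_count


# A fake database and a controllable clock, so no real waiting is needed.
db = {"u1": 0}
now = [0.0]
cache = UserCache(db.get, ttl_seconds=300.0, clock=lambda: now[0])

first = cache.seen_count("u1")   # loads 0 from the fake db
db["u1"] = 12                    # the user classifies 12 subjects
now[0] = 120.0                   # two minutes later, still inside the TTL
stale = cache.seen_count("u1")   # still the cached snapshot
now[0] = 300.0                   # TTL reached
fresh = cache.seen_count("u1")   # reloaded from the fake db
```

Any ratio computed from `stale` would treat the user as if they had classified nothing, which is the behaviour this issue reports.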

Note that all Designator training stream ratio selectors are affected by this issue.

As it stands, a user's subject information is not updated from the Panoptes API when classification subject ids are received (the call is currently a no-op). We could add this functionality if the Designator public API were extended to accept incoming ids (with auth): https://github.com/zooniverse/Panoptes/blob/1cf49c6779346578551d40216f6a4eb083f283f5/app/services/designator_client.rb#L30

Alternatively, we can use the proxy stored metric recently_selected_ids, which is updated while a user is actively classifying. This runs the risk of page refreshes pushing a user past the desired sim ratio, which may affect the desired training results. https://github.com/zooniverse/designator/blob/a2115b936012918b00ebd98e5ec0254051918c1e/lib/designator/user_cache.ex#L78
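A rough Python sketch of the proxy idea, assuming the estimate is simply the cached count plus the number of distinct recently selected ids. The function name and that formula are assumptions for illustration, not what Designator does today.

```python
def effective_seen_count(cached_seen_count, recently_selected_ids):
    """Estimate how many subjects a user has seen since the cache was loaded.

    Assumption: every recently selected subject counts as seen, even if the
    user never classified it.
    """
    return cached_seen_count + len(set(recently_selected_ids))


# The user has classified 3 subjects, but two page refreshes caused
# 10 subjects to be selected in total.
truly_seen = 3
estimate = effective_seen_count(0, [f"s{i}" for i in range(10)])
overcount = estimate - truly_seen
```

Here the estimate is 10 against a true count of 3, so a ratio keyed on the estimate would taper the sims sooner than intended. That is the page-refresh risk mentioned above.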

Finally, we could reload the seen subject ids from the canonical db, but I don't like this solution due to the extra db load.

amy-langley commented 5 years ago

What is the actual impact of this on the behavior the user sees? Could we improve the situation quickly by simply reducing the TTL of the cached record?

hughdickinson commented 5 years ago

As an example use case: we have a project for which it is essential that the first few (~3-4) subjects a volunteer sees are training examples, and we use the general feedback tool to help volunteers get a good feel for the classification task. However, our training subjects are in fact simulated, and what we really don't want is for volunteers to spend the first 5 minutes classifying simulations and seeing modal feedback that they must dismiss every time. The ideal case would be to slowly reduce the fraction of training subjects selected so that after about 20 classifications the training stops. I think that was the intended functionality of the training framework, but the caching has thrown a spanner in the works.
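One way to picture the desired taper is the Python sketch below. The function, its parameters and the linear shape are all assumptions chosen to match the numbers in this comment (training guaranteed for the first 3 subjects, none after 20). It does not describe the actual training_chances configuration.

```python
def training_chance(seen_count, guaranteed=3, stop_after=20, start_chance=0.5):
    """Probability that the next subject is a training subject.

    Hypothetical linear taper: always training for the first `guaranteed`
    subjects, then falling from `start_chance` to zero at `stop_after`.
    """
    if seen_count < guaranteed:
        return 1.0
    if seen_count >= stop_after:
        return 0.0
    remaining = stop_after - seen_count
    span = stop_after - guaranteed
    return start_chance * remaining / span


# Chance for each of a volunteer's first 25 subjects.
schedule = [training_chance(n) for n in range(25)]
```

The schedule only does its job if `seen_count` is current. With a count frozen at 0 by the cache, every call returns 1.0 and the volunteer keeps getting simulations until the cached record expires.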

camallen commented 5 years ago

> What is the actual impact of this on the behavior the user sees? Could we improve the situation quickly by simply reducing the TTL of the cached record?

Yep, we can. The impact depends on how long each subject takes to classify. If a user classifies, say, 10 subjects before the cached record expires and the sims are meant to stop after 10, they'll still get the original ratio throughout. Probably OK but not ideal, and it depends on the use case, as Hugh points out.
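The trade-off can be put into numbers with a small Python sketch. Both helpers are hypothetical, and they assume a steady classification pace, which real volunteers won't have.

```python
def stale_classifications(ttl_seconds, seconds_per_classification):
    """How many classifications fit inside one TTL window at a steady pace."""
    if seconds_per_classification <= 0:
        raise ValueError("seconds_per_classification must be positive")
    return int(ttl_seconds // seconds_per_classification)


def ttl_for_max_stale(max_stale, seconds_per_classification):
    """Longest TTL that lets through at most `max_stale` stale classifications."""
    return max_stale * seconds_per_classification


at_current_ttl = stale_classifications(300, 30)  # 5 minute TTL, 30 s per subject
at_short_ttl = stale_classifications(60, 30)     # 1 minute TTL, same pace
needed_ttl = ttl_for_max_stale(3, 30)            # tolerate at most 3 stale picks
```

At 30 seconds per subject, the current 5 minute TTL allows 10 classifications on the original ratio. Keeping that to 3 would need a TTL of about 90 seconds, and faster classifiers would need a shorter one still.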

Also noting that TTLs are set at the system level across all workflows.

zwolf commented 5 years ago

Just to clarify: the behavior whereby the delivered sim percentage is adjusted relative to the number of subjects a user has seen is in place for Planet Hunters and Space Warps (example). The training stream is intended to generalize this functionality but does so with a different kind of input (training_set_ids + training_chances) and needs to be documented.

So then Space Warps/Planet Hunters schemes never worked properly for quick classifiers, and since there's no emitting/retrieving that new user_seen data, they would continue to get the wrong proportion for the remaining length of the TTL. If that TTL value is too long, it seems like the only way to properly address this would be for Panoptes to emit notifications to Designator whenever a user sees a subject, an eventuality that was obviously foreseen by that no-op add_seen method.
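For comparison, a push-based record might look like the Python sketch below. Only the `add_seen` name comes from this thread. The class and its behaviour are assumptions about what a working implementation could do, not a description of the existing no-op.

```python
class LiveUserRecord:
    """Hypothetical cached user record that accepts pushed seen ids."""

    def __init__(self, seen_ids=()):
        self.seen_ids = set(seen_ids)

    def add_seen(self, subject_id):
        # Called once per notification; a set makes repeated
        # notifications for the same subject harmless.
        self.seen_ids.add(subject_id)

    @property
    def seen_count(self):
        return len(self.seen_ids)


record = LiveUserRecord(seen_ids=[1, 2])
record.add_seen(3)
record.add_seen(3)  # duplicate delivery does not inflate the count
count_after_push = record.seen_count
```

With updates pushed this way the count tracks the user between cache reloads, so the TTL no longer decides how stale the training ratio can get.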

The 5 minute difference was the price we were willing to pay in the past. If this is now unacceptable, either that number comes down or we try to get as close to live as we can by leaning on Panoptes.