eatyourgreens opened this issue 2 months ago
The old home page uses preferences.activity_count, which ignores any classification where either:

- the subject had already been retired, or
- the volunteer had already seen the subject (a duplicate).
If the ERAS count includes duplicates and retired subjects, that might explain the sudden increase in classification counts for PFE projects. https://www.zooniverse.org/talk/2354/3435274?comment=5657245&page=7
The new code uses preference.activity_count here:
https://github.com/zooniverse/front-end-monorepo/blob/a4fba3d2143940998b746356961fa7243b419932/packages/lib-user/src/components/UserHome/components/RecentProjects/RecentProjects.js#L54-L61
That leads to inconsistent and confusing UX in the new code, where the same project can show two different numbers for the same volunteer.
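For illustration, here's a rough sketch of the two data sources in play. The endpoint paths, query parameters, and response shapes are my assumptions for this sketch (authentication is left out); it's not the monorepo's actual code:

```javascript
// Old-style count: activity_count from the volunteer's project preference.
async function preferenceCount(userID, projectID) {
  const url = `https://www.zooniverse.org/api/project_preferences?user_id=${userID}&project_id=${projectID}`
  const response = await fetch(url, {
    headers: { Accept: 'application/vnd.api+json; version=1' }
  })
  const { project_preferences } = await response.json()
  return project_preferences?.[0]?.activity_count ?? 0
}

// New-style count: per-project totals from the ERAS stats service
// (endpoint and payload shape assumed here for illustration).
async function erasCount(userID, projectID) {
  const url = `https://eras.zooniverse.org/classifications/users/${userID}?project_contributions=true`
  const response = await fetch(url)
  const { project_contributions } = await response.json()
  const contribution = project_contributions?.find(
    p => `${p.project_id}` === `${projectID}`
  )
  return contribution?.count ?? 0
}

// The same volunteer and project can come back with two different numbers,
// depending on which source a component reads.
async function compareCounts(userID, projectID) {
  const [oldCount, newCount] = await Promise.all([
    preferenceCount(userID, projectID),
    erasCount(userID, projectID)
  ])
  console.log({ projectID, oldCount, newCount, difference: newCount - oldCount })
}
```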
After chatting with @yshish and others, another difference we're seeing is that ERAS reports a different total number of projects worked on.
For my account:
For @yshish:
The problem with launching a new home page and, at the same time, changing how projects and classifications are counted is that people are confused by the change in numbers, and consequently unsure whether they can trust the new stats.
Describe the bug
The new user classification counts, published last night, don’t agree with the counts that were published on the old home page. Some volunteers are reporting differences of thousands of classifications, in total, for their new user stats.
Here are a couple of examples from my account.
New user stats page stats (all time):
Old home page stats:
To Reproduce
Logged in as a Zooniverse volunteer, go to More Stats and select All Time for the time range. Project classification counts are shown under Top Projects. Compare the new counts with the counts shown on https://pr-7177.pfe-preview.zooniverse.org/#projects. Differences show up for both Ouroboros and Panoptes projects, with older projects (pre-2019, maybe) more likely to show large differences from the activity_count stored on your project preferences.

Expected behavior
Classification history shouldn't have changed during the move to a new stats API. Total classifications for a given project should match user_project_preferences.preferences.activity_count for that project.
Additional context

This seems like a problem that should have been caught by spot-checking some volunteer accounts prior to launch, or by snapshot testing against the old counts. Generate project classification snapshots for a few thousand sample accounts on the old API, generate the same snapshots with the new API, then assert that the snapshots match.
FWIW I use this technique in one of my current software projects to check that backend changes can be deployed to production without breaking research models in the production database. Before a release, the release branch is tested against snapshots generated on the main branch, using real data. It's a very useful technique.
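To make that concrete, here's a minimal sketch of the comparison step, assuming each snapshot has already been fetched as a plain map of project ID to classification count (one map per account from the old API, one from the new). The function names and shapes are illustrative, not an existing Zooniverse tool:

```javascript
// Compare two per-account snapshots: each is an object mapping
// project ID -> classification count for one volunteer.
function diffSnapshots(oldSnapshot, newSnapshot) {
  const projectIDs = new Set([
    ...Object.keys(oldSnapshot),
    ...Object.keys(newSnapshot)
  ])
  const mismatches = []
  for (const projectID of projectIDs) {
    const oldCount = oldSnapshot[projectID] ?? 0
    const newCount = newSnapshot[projectID] ?? 0
    if (oldCount !== newCount) {
      mismatches.push({ projectID, oldCount, newCount })
    }
  }
  return mismatches
}

// Run the check over a few thousand sampled accounts and report any account
// whose old and new per-project counts disagree. fetchOldSnapshot and
// fetchNewSnapshot are whatever clients talk to the old and new APIs.
async function checkSampleAccounts(userIDs, fetchOldSnapshot, fetchNewSnapshot) {
  const failures = []
  for (const userID of userIDs) {
    const [oldSnapshot, newSnapshot] = await Promise.all([
      fetchOldSnapshot(userID),
      fetchNewSnapshot(userID)
    ])
    const mismatches = diffSnapshots(oldSnapshot, newSnapshot)
    if (mismatches.length > 0) {
      failures.push({ userID, mismatches })
    }
  }
  return failures
}
```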