zooniverse / front-end-monorepo

A rebuild of the front-end for zooniverse.org
https://www.zooniverse.org
Apache License 2.0

Beyond Borders: user losing transcriptions & being re-served subjects #2574

Closed snblickhan closed 2 years ago

snblickhan commented 2 years ago

FEM

Describe the bug

A Beyond Borders user is losing Transcription Task classifications. They show up in Recents, and he has posted Talk comments, but the subject has now been re-served multiple times, never with an "Already Seen" banner, and his previous transcriptions don't show up in the dropdown.

Full Talk thread here: https://www.zooniverse.org/projects/mainehistory/beyond-borders-transcribing-historic-maine-land-documents/talk/4453/2153848?comment=3672839

Applicable Panoptes resource IDs (project, workflow, etc) to demonstrate the issue:

Project ID: 12856
Workflow ID: 18383
Caesar workflow: https://caesar.zooniverse.org/workflows/18383#summary

Expected behavior

Once a transcription is submitted, the classification data should persist with the subject. If the subject is delivered to the user again, an "Already Seen" banner should appear.

Device information (taken from Talk report)

Desktop (please complete the following information):

Additional context

In the Talk post the user notes that he uses wired high-speed internet, so this is likely not a connectivity issue. I also checked the Designator queue to see whether he had completed the entire workflow. There were 5 subjects in his queue for this workflow, so that isn't the case.

eatyourgreens commented 2 years ago

This sounds like it could be a problem with Panoptes, rather than the front end code. I think the first step in debugging it should be to get the duplicate classifications for this volunteer and subject from the project. Each classification will have the subject selection state attached, including the already_seen state and which selection strategy Panoptes used for that subject.
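A rough sketch of that check in Python, for anyone working from a classification export. It assumes each row has `classification_id`, `user_id`, `subject_ids`, `created_at` and `metadata` columns, that `subject_ids` separates multiple IDs with `;`, and that `metadata` is JSON with a `subject_selection_state` object holding `already_seen` and `selection_state`. Those names, and the helper `find_classifications`, are assumptions for illustration and should be checked against a real export.

```python
import json


def find_classifications(rows, user_id, subject_id):
    """Summarise the selection state of one volunteer's classifications of one subject.

    `rows` is an iterable of dicts, one per classification (assumed export layout).
    """
    matches = []
    for row in rows:
        if row["user_id"] != str(user_id):
            continue
        # Assumption: multiple subject IDs, if present, are separated by ';'.
        if str(subject_id) not in row["subject_ids"].split(";"):
            continue
        metadata = json.loads(row["metadata"])
        # Assumed key; an absent state is reported as None rather than guessed.
        state = metadata.get("subject_selection_state", {})
        matches.append({
            "classification_id": row["classification_id"],
            "created_at": row["created_at"],
            "already_seen": state.get("already_seen"),
            "selection_state": state.get("selection_state"),
        })
    # Oldest first, so a repeat classification appears after the original.
    return sorted(matches, key=lambda m: m["created_at"])
```

The rows could come from `csv.DictReader` over the export file. More than one match for the same subject, with `already_seen` false on the later ones, would point at subject selection rather than the classifier.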

eatyourgreens commented 2 years ago

Designator has no available subjects for them now, so they've finished that workflow: https://designator.zooniverse.org/api/workflows/18383?user_id=1851320

That still doesn't explain why they aren't seeing Already Seen banners, or why they aren't seeing their previous work on the purple lines.

@snblickhan a question about collaborative transcription: If I transcribe a page, then I'm later shown it again as an Already Seen subject, am I allowed to see and approve my own previous work? That seems like it might be a bug, if it is allowed to happen.

shaunanoordin commented 2 years ago

@snblickhan query: according to your report, the missing Classifications are showing up in the user's Recents page; but according to this post, the Classifications aren't appearing in the OP's Recents page?

If the Subject isn't appearing in Recents, it may lend more weight to the "this is some Panoptes shenanigans" assessment

eatyourgreens commented 2 years ago

Panoptes won't update recent classifications if a classification subject is either retired or already in your recents, so that could be a case of him submitting multiple classifications for a subject that he's already classified.
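As a sketch of that rule only (this is not Panoptes code, and the function and parameter names are made up for illustration):

```python
def should_add_to_recents(subject_id, retired, recent_subject_ids):
    """Return True if a new classification would add an entry to Recents.

    Mirrors the rule described above: no new entry when the subject is
    retired or is already present in the volunteer's recents.
    """
    if retired:
        return False
    if subject_id in recent_subject_ids:
        return False
    return True
```

Under that rule, a second classification of a subject that is already in Recents leaves Recents unchanged, so a missing entry doesn't by itself prove that the classification was lost.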

snblickhan commented 2 years ago

> @snblickhan query: according to your report, the missing Classifications are showing up in the user's Recents page; but according to this post, the Classifications aren't appearing in the OP's Recents page?
>
> If the Subject isn't appearing in Recents, it may lend more weight to the "this is some Panoptes shenanigans" assessment

In the contact@ email he sent back in December, he noted that there were classifications showing up in Recents. When this was first reported, I had the suspicion that he'd finished the workflow, so @eatyourgreens taught me how to check the user queue. At the time (last month), there were still subjects available for him to work on. My guess would be that since then he's completed the workflow, hence classifications no longer showing up in Recents.

eatyourgreens commented 2 years ago

BarrowTH reported last weekend that he submitted a full transcription of subject 62046966 on 2nd Jan, then a single line when he received the same subject as a duplicate on 3rd Jan.

When I load that subject ID in the classifier, I can see a single purple line (line 89) with his user ID attached (1851320). All the other purple lines are transcriptions from other volunteers. I can't see any of the transcription work that he submitted on 2nd Jan. https://www.zooniverse.org/projects/mainehistory/beyond-borders-transcribing-historic-maine-land-documents/classify/workflow/18383/subject/62046966?demo=true

I'm going to escalate this as high priority, since it looks like classifications might be lost.

EDIT: that first classification isn't listed in Caesar, for this subject. https://caesar.zooniverse.org/workflows/18383/subjects/62046966

EDIT: It's also missing in Panoptes, meaning it might not have been sent by the classifier at all.

eatyourgreens commented 2 years ago

#2598 and #2600 add Sentry logging to points in the classification queue where classifications can be silently dropped, without saving them to Panoptes.
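To illustrate the general idea (this is not the code from those PRs), here is a minimal Python sketch of a queue that reports a classification when it gives up on it. `ClassificationQueue`, `PermanentError`, `send` and `report` are all hypothetical names. In a real client `report` would forward to an error tracker such as Sentry.

```python
class PermanentError(Exception):
    """Hypothetical: a failure that retrying cannot fix, e.g. a rejected payload."""


class ClassificationQueue:
    def __init__(self, send, report):
        self.send = send      # callable that posts one classification
        self.report = report  # callable that logs a dropped classification
        self.pending = []

    def add(self, classification):
        self.pending.append(classification)
        self.flush()

    def flush(self):
        # Take the current backlog and try to send each item once.
        pending, self.pending = self.pending, []
        for classification in pending:
            try:
                self.send(classification)
            except PermanentError as error:
                # The classification is dropped, but the drop is reported.
                self.report(classification, error)
            except Exception:
                # Any other failure: keep the item for the next flush.
                self.pending.append(classification)
```

The point of the sketch is the `report` call: without it, the permanent-failure branch would discard the volunteer's work with no trace.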

eatyourgreens commented 2 years ago

https://github.com/zooniverse/panoptes/pull/3760 updates Panoptes to allow classification payloads larger than 1MB.

It might be useful to find out how large classifications are for Beyond Borders, to get an idea whether a size limit might be the problem.
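A quick way to estimate that, sketched in Python. It measures the UTF-8 size of a classification serialised as compact JSON, which only approximates what the classifier actually sends. `ONE_MB` assumes the limit is counted as 1,048,576 bytes, and both function names are made up for this example.

```python
import json

ONE_MB = 1024 * 1024  # assumption: the limit is counted as 1,048,576 bytes


def payload_size_bytes(classification):
    """Size in bytes of the classification as compact UTF-8 JSON."""
    body = json.dumps(classification, separators=(",", ":"), ensure_ascii=False)
    return len(body.encode("utf-8"))


def exceeds_limit(classification, limit=ONE_MB):
    """True if the serialised classification is larger than the limit."""
    return payload_size_bytes(classification) > limit
```

Running `payload_size_bytes` over the annotations of a few full-page transcriptions would show how close Beyond Borders classifications get to the limit.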

eatyourgreens commented 2 years ago

Here’s another subject where it looks like classifications have disappeared. https://www.zooniverse.org/projects/mainehistory/beyond-borders-transcribing-historic-maine-land-documents/classify/workflow/18383/subject/60866790?demo=true

https://caesar.zooniverse.org/workflows/18383/subjects/60866790

https://www.zooniverse.org/projects/mainehistory/beyond-borders-transcribing-historic-maine-land-documents/talk/4453/2295906?comment=3767294&page=1

eatyourgreens commented 2 years ago

This never got updated after we investigated this bug. Here's a rough summary:

mcbouslog commented 2 years ago

I don't think this should be closed, but is it ok to remove the high priority label?

lcjohnso commented 2 years ago

Resolved!