[RFC] Transcription: how do volunteers choose which set of volumes to work on? (subject set selection)

eatyourgreens commented 4 years ago

Package lib-classifier

Is your feature request related to a problem? Please describe.

For transcription projects like Anti-Slavery Manuscripts, AnnoTate or Worlds of Wonder, volunteers have the option of choosing which set of volumes to work on before they begin transcribing.

Describe the solution you'd like

Panoptes supports subject selection by subject set for workflows that have workflow.grouped set. Add a subject_set_id parameter to the subjects request and subjects will only be selected from that set.

This allows the same transcription workflow to be used with different sets of subjects and might be the easiest option for a project team to manage. Historically, grouped selection is also the approach we've taken for transcription projects such as Old Weather, Science Gossip or Operation War Diary.
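As a rough illustration of the request described above, here is a minimal sketch of how a client might add subject_set_id to its subject query. The helper names (buildSubjectQuery, toQueryString) and the shape of the workflow object are assumptions made for this example, not existing lib-classifier or Panoptes client code.

```javascript
// Hypothetical helper: builds the query for a subjects request.
// `workflow` is assumed to look like { id: '123', grouped: true }.
function buildSubjectQuery(workflow, subjectSetID) {
  const query = { workflow_id: workflow.id };
  // Only grouped workflows select subjects by set, so the parameter
  // is added only when workflow.grouped is set and an ID was chosen.
  if (workflow.grouped && subjectSetID) {
    query.subject_set_id = subjectSetID;
  }
  return query;
}

// Serialise the query object into a URL query string.
function toQueryString(query) {
  return new URLSearchParams(query).toString();
}
```

With this shape, one workflow can be paired with any of its linked subject sets just by changing the second argument, which is the single-workflow setup the proposal is after.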

Describe alternatives you've considered

Grouped selection isn't currently supported in PFE*, so project teams work around it by cloning workflows. Worlds of Wonder is an example of this approach. The problem here is that the same transcription tasks have to be managed across multiple workflows in the project builder, as opposed to setting up the tasks once and then applying them to different subject sets.

* If workflow.grouped is true, PFE picks a random subject set for the workflow, then serves you only subjects from that subject set: https://github.com/zooniverse/Panoptes-Front-End/blob/e4b7135036a043f8ae1eb7d7498859cbdf87cc75/app/redux/ducks/classify.js#L25-L30
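The random pick described in this note could look something like the sketch below. It is not the PFE code linked above; the function name is hypothetical, and it assumes the workflow resource lists its linked sets in workflow.links.subject_sets. The random source is passed in so the behaviour can be tested.

```javascript
// Hypothetical sketch of random subject set selection for grouped workflows.
// Assumes linked subject set IDs live in workflow.links.subject_sets.
function pickRandomSubjectSetID(workflow, random = Math.random) {
  const ids = (workflow.links && workflow.links.subject_sets) || [];
  // Ungrouped workflows, or workflows with no linked sets, get no set ID.
  if (!workflow.grouped || ids.length === 0) {
    return undefined;
  }
  // random() is expected to return a number in [0, 1).
  const index = Math.floor(random() * ids.length);
  return ids[index];
}
```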

srallen commented 4 years ago

Could you link to where in ASM workflow.grouped is used? I can't find it: https://github.com/zooniverse/anti-slavery-manuscripts/search?q=workflow.grouped&unscoped_q=workflow.grouped

eatyourgreens commented 4 years ago

It's set on the workflow in Panoptes, so subject requests that don't supply a subject_set_id param will receive 422 responses with a message that the required subject_set_id parameter was missing from the request.
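To make that rule concrete, here is a small model of the check as described in this comment. It is only an illustration of the behaviour, not Panoptes code: the function name and the wording of the error message are made up for the example.

```javascript
// Hypothetical model of the rule described above; not the Panoptes implementation.
// Returns the status a subjects request would be expected to receive.
function checkSubjectRequest(workflow, params) {
  if (workflow.grouped && !params.subject_set_id) {
    // Illustrative message only; the real response text may differ.
    return { status: 422, error: 'subject_set_id is required for grouped workflows' };
  }
  return { status: 200 };
}
```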

eatyourgreens commented 4 years ago

Specifying a subject set via the URL is another option that sounds much better.

wgranger commented 4 years ago

ASM was initially set up to allow subject set selection, but it was commented out in the initial launch https://github.com/zooniverse/anti-slavery-manuscripts/blob/d752ec4269e24283e1fd69903b7a607c8ad5c865/src/components/Home.jsx#L113

srallen commented 4 years ago

We are leaning toward using the URL for all kinds of selection, and it was discussed in this now closed issue: https://github.com/zooniverse/front-end-monorepo/issues/806#issuecomment-490527505

Basically, support URLs like /projects/:owner/:project-name/classify/workflows/:workflow-id/subject-sets/:subject-set-id/subjects/:subject-id, where the subject set and subject segments are optional. We have a UI for workflow selection, but no design or plan for subject set selection. Selection all the way down to an individual subject is for the CitSci use case, and they will provide their own UI for their projects.
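A minimal sketch of parsing a path of that shape, with the subject set and subject segments optional. The function and constant names are hypothetical, and this is not how the monorepo's router is implemented. Note one assumption: the pattern as written also accepts a subject segment without a subject set, which the thread does not settle either way.

```javascript
// Hypothetical route pattern: subject set and subject segments are optional.
const CLASSIFY_ROUTE = new RegExp(
  '^/projects/([^/]+)/([^/]+)/classify/workflows/([^/]+)' +
  '(?:/subject-sets/([^/]+))?' +
  '(?:/subjects/([^/]+))?/?$'
);

// Returns the IDs found in the path, or null if the path does not match.
// Optional segments that are absent come back as undefined.
function parseClassifyPath(pathname) {
  const match = CLASSIFY_ROUTE.exec(pathname);
  if (!match) {
    return null;
  }
  const [, owner, projectName, workflowID, subjectSetID, subjectID] = match;
  return { owner, projectName, workflowID, subjectSetID, subjectID };
}
```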

eatyourgreens commented 4 years ago

URLs were how selection was handled in readymade too, with the caveat that subject ID could only be specified by admins and project team members.

Even if the subject set ID is specified in the URL, you'll still need to tell Panoptes to use it during subject selection, which is where you'll need to have workflow.grouped set.

srallen commented 4 years ago

We may or may not want to continue doing it that way. It'd be worth a discussion with back end.

wgranger commented 4 years ago

ASM has code to set the subject_set_id when asking Panoptes for subjects. Initially, the user would select a subject set on the main page and Redux would choose that subject set when making calls to Panoptes.

However, since we are no longer allowing users to select a subject set on the frontend, we now provide Panoptes with a random subject set from the linked sets.

eatyourgreens commented 4 years ago

I added something simple to the workflow model, just so we can load ASM subjects in the new classifier. Since there was only one subject set, I plucked the first ID out of the links array.
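For reference, the "pluck the first ID" fallback might be sketched like this. The function name is hypothetical and the sketch assumes the linked set IDs are in workflow.links.subject_sets; it is not a copy of the actual workflow model code.

```javascript
// Hypothetical sketch: use the first linked subject set as the default.
// Returns undefined when the workflow has no linked subject sets.
function defaultSubjectSetID(workflow) {
  const [firstID] = (workflow.links && workflow.links.subject_sets) || [];
  return firstID;
}
```

This only gives a sensible answer while a workflow has a single linked set, which is why it is a stopgap rather than a selection mechanism.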

eatyourgreens commented 4 years ago

> We may or may not want to continue doing it that way. It'd be worth a discussion with back end.

It'd be useful to know why the grouped flag has to be set before a subject set can be specified. I'd assume so that people can't override Designator by specifying a set, but I'm speculating without knowing what went into that original decision.

snblickhan commented 4 years ago

Note: volunteer ability to select a subject set to work on isn’t specifically stated in IMLS, but it’s something that would result in less work on our end.

If we don’t do it, we know people will keep cloning transcription workflows, and since we know that for the collaborative transcription each new workflow will require a Caesar setup, it might be a question of where we want to expend the effort to set ourselves up best for the long term.

That being said, because this (subject set selection) is a planned goal for the new classifier, it's OK not to include it in the text tools for now, but we'll need to communicate to early users of the collaborative transcription tool that it's a forthcoming feature.

eatyourgreens commented 4 years ago

> If we don’t do it, we know people will keep cloning transcription workflows, and since we know that for the collaborative transcription each new workflow will require a Caesar setup, it might be a question of where we want to expend the effort to set ourselves up best for the long term.

Also note that if you run a project in multiple languages, cloning workflows means your translators have to manage multiple translations of those workflows too.

I think the best solution will be one that involves the project team maintaining a single transcription workflow for all their archival material.

eatyourgreens commented 4 years ago

Here's an example of a project that is cloning workflows as a workaround for selecting a subject set to work on: https://www.zooniverse.org/projects/edh/rainfall-rescue

eatyourgreens commented 3 years ago

We've got subject set selection now, as part of workflow selection for grouped workflows (#1731).

zooniverse / front-end-monorepo #1523