hltcoe / turkle

Django-based clone of Amazon's Mechanical Turk service running in your local environment.
https://turkle.readthedocs.io

mechanism for returning to and editing completed HITs #47

Open TomLippincott opened 4 years ago

TomLippincott commented 4 years ago

@charman @cash @vandurme

Is there such a mechanism, perhaps related to the CADET-style correction of existing NER tags etc? What I'm envisioning is: tasks have a boolean switch (e.g. "persistent") that, if set, keeps it in a user's landing page even if they've completed it, so they can go back and edit.

If not, does anyone see a particular reason not to have this functionality, or that it would be difficult to implement? If not, I'd take a shot at it.

FYI this is so that humanists can do the sort of annotation they're used to, but with very easy pivots to crowdsourcing (and the huge advantage of having people willing and able to implement interfaces for new data etc).

vandurme commented 4 years ago

No. This has been a requested feature, delayed in part because this software was long intended to adhere strictly to the behavior of crowdsourcing platforms. We agree it would enable workflows that are more common with experts, who often work in multi-day sessions and, as you say, may wish to revise.

   “We gladly accept PRs”

Otherwise I'd anticipate it by the end of 2020; it has been on the roadmap.

TomLippincott commented 4 years ago

Great, just wanted to be sure it's not already implemented or underway: I'll take a shot at it, ETA hopefully this month.

TomLippincott commented 4 years ago

@cash @charman

Basic plan (subject to feedback, will start on it this afternoon): add a field to the Project class, "persistent_for", similar to "worker_permissions", for specifying a list of workers for whom 1) the project always shows up on their landing page, whether or not they've completed it (with some indication of completion state), and 2) when a HIT is rendered, the view first queries for the worker's existing annotations of that HIT and pre-fills the form with any results.
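
A minimal sketch of the plan above, in plain Python rather than Django so it runs on its own. Everything except the name "persistent_for" is an assumption for illustration: the `Project` and `Annotation` dataclasses are hypothetical stand-ins, not Turkle's actual models, and `landing_page` and `prefill` are made-up helper names.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    # Hypothetical stand-in for Turkle's Project model, not the real class.
    name: str
    persistent_for: set = field(default_factory=set)  # worker names


@dataclass
class Annotation:
    # Hypothetical record of one worker's saved answers for one HIT.
    project: str
    hit_id: int
    worker: str
    answers: dict


def landing_page(projects, annotations, worker, hits_per_project):
    """Return (project name, status) pairs to show on the worker's landing page."""
    rows = []
    for project in projects:
        total = hits_per_project[project.name]
        done = {a.hit_id for a in annotations
                if a.worker == worker and a.project == project.name}
        if len(done) < total:
            rows.append((project.name, f"{len(done)}/{total} done"))
        elif worker in project.persistent_for:
            # Completed projects stay visible only for listed workers.
            rows.append((project.name, "completed, editable"))
    return rows


def prefill(annotations, worker, project, hit_id):
    """Return the worker's earlier answers for this HIT, or {} if none exist."""
    for a in annotations:
        if (a.worker, a.project, a.hit_id) == (worker, project, hit_id):
            return dict(a.answers)
    return {}
```

In a real implementation the two helpers would presumably become queryset filters in the relevant views; the sketch only shows the intended visibility and pre-fill rules.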

vandurme commented 4 years ago

A wrinkle to be aware of for robust "pre-fills": note that one may employ randomness in layout decisions in the task's .html. For example, you wish to display two items to be annotated, such as "which has the more positive sentiment", and you randomize their order on the screen. I don't believe we currently save layout decisions in the submission, and even if we did, interfaces aren't presently written to check for a saved value of that decision, so as a corner case an interface could change between sessions.

Are you thinking turkle support for persistence will be robust for existing tasks, or will developers need to code their .html for it specifically?
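
One possible way to handle the randomized-layout wrinkle is to store the layout decision with the submission and reuse it on re-render. The sketch below is an illustration only: the "layout_order" key and the `render_order` helper are hypothetical, not existing Turkle fields or functions, and in practice this logic would likely live in the template's own JavaScript.

```python
import random


def render_order(items, saved_submission=None, rng=random):
    """Return (items in display order, fields to store with the submission).

    `saved_submission` is a hypothetical dict of previously saved fields;
    the "layout_order" key is an assumption, not an existing Turkle field.
    """
    if saved_submission and "layout_order" in saved_submission:
        # Reuse the order the worker saw last time so pre-filled answers line up.
        order = list(saved_submission["layout_order"])
    else:
        # First visit: pick a random order and remember it.
        order = list(range(len(items)))
        rng.shuffle(order)
    return [items[i] for i in order], {"layout_order": order}
```

A template that never randomizes its layout would not need any of this, which is one reason persistence may have to be opt-in per task.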

cash commented 4 years ago

This is not going to be a simple feature to add. Some things to consider:

  1. Anonymous HITs cannot support this so there will need to be logic for that.
  2. HIT results now need a field indicating whether each one is complete. This field needs to be used in the admin backend whenever we show progress on a batch or download results.
  3. Between assigned but not started HITs and now incomplete HITs, we will need to change how we notify the user of these. We're currently using notifications on the top of the main page - 1 per HIT. I think we'll need to change that to maybe a central page that lists all the user's unstarted and incomplete HITs with a reminder message.
  4. What does it mean for a user to return a HIT that has been partially completed? I guess we delete the data.
  5. How do we handle expiring an assignment? Maybe we don't expire incomplete HITs or have a different expiration timer on them?
  6. This will likely require an additional button beyond just save. Decisions that we make here could result in complete incompatibility between MTurk and Turkle templates and I don't think we want that.
  7. And finally, the hardest problem is that most of our templates rely heavily on JavaScript for rendering and submission. I recommend starting with non-JavaScript templates and getting everything working there first. We can add an option at the Project level to turn on this draft capability. To support JavaScript templates, we'll need to add APIs that let these templates query for incomplete results.
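
To make items 1, 2, and 4 concrete, here is a rough sketch in plain Python. All names (`HitResult`, `complete`, `save`, `return_hit`, `batch_progress`) are hypothetical and do not reflect Turkle's actual schema; deleting the draft on return follows the guess in item 4 and is not a settled decision.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HitResult:
    # Hypothetical record; `complete` is the new flag from item 2.
    hit_id: int
    worker: Optional[str]  # None models an anonymous worker (item 1)
    answers: dict
    complete: bool = False


def save(results, hit_id, worker, answers, complete):
    """Save a draft or final result; anonymous workers cannot save drafts."""
    if worker is None and not complete:
        raise ValueError("drafts require a logged-in worker")
    if worker is not None:
        # A new save replaces this worker's earlier result for the same HIT.
        results[:] = [r for r in results
                      if (r.hit_id, r.worker) != (hit_id, worker)]
    results.append(HitResult(hit_id, worker, answers, complete))


def return_hit(results, hit_id, worker):
    """Returning a HIT discards the worker's incomplete draft for it (item 4)."""
    results[:] = [r for r in results
                  if (r.hit_id, r.worker) != (hit_id, worker) or r.complete]


def batch_progress(results, total_hits):
    """Count only complete results when reporting batch progress (item 2)."""
    finished = {r.hit_id for r in results if r.complete}
    return len(finished), total_hits
```

The same `complete` filter would also apply when downloading results, so that drafts never appear in exported data.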