zooniverse / front-end-monorepo

A rebuild of the front-end for zooniverse.org
https://www.zooniverse.org
Apache License 2.0

[RFC] Dropdown Task #1679

Closed srallen closed 3 years ago

srallen commented 4 years ago

We've identified the dropdown task as one that will continue to be supported, but it potentially needs reworking for that continued support.

Current implementation

The current implementation on PFE:

Functionality

The current functionality of the dropdown includes:

JSON task structure

TODO: add example of actual task JSON

JSON annotation structure

TODO: add example of actual annotation JSON

Actual usage

These are examples of actual project usage. TODO: Link to actual projects using the dropdown in these ways:

Known issues

Performance

Aggregation

UX and design research

tl;dr select inputs / dropdowns are hard to have good UX with and alternatives like text inputs using validation, auto-complete, and/or auto-suggestion or radio buttons are often better choices

UX research resources:

Proposed refactors

UX

TODO: What are the various known use cases and how often are they used, i.e. date annotation vs essentially searching against a long list of options

Invision references

https://projects.invisionapp.com/d/main#/console/12923997/270495290/preview
https://projects.invisionapp.com/d/main#/console/12923997/270495295/preview
https://projects.invisionapp.com/d/main#/console/12923997/270537252/preview

Data

I have a series of questions that I'd like this RFC to help answer through conversation:

Action items

eatyourgreens commented 4 years ago

Some discussion already in #1284.

srallen commented 4 years ago

Thanks. I'm going to close that issue in favor of this one since I think I've captured the essentials out of that one already in this one.

beckyrother commented 4 years ago

It will be good to know actual usage. To reiterate general types:

I'd recommend focusing on the Simple dropdown first, since it sounds like there will be specific use cases for it coming up.

I'm looking into UX regarding the recommended number of items in a dropdown. We may also want to include allowing validated text input for longer lists so a user can start typing and doesn't have to scroll all the way to the end of a long list.

eatyourgreens commented 4 years ago

I'm still confused as to why aggregation needs the labels. Question tasks don't pass the answer labels to aggregation. The selected value should work just fine for comparing values across classifications.

https://github.com/zooniverse/caesar/pull/842#issuecomment-499562244

Do we know why the dropdown task options use a hash as the value, but question task answers use the answer index?

CKrawczyk commented 4 years ago

I agree with @eatyourgreens on this one: the external extractor/reducer in the aggregation repo works just fine without the actual labels. All it needs is a unique value for each option (could be a hash, an index, or the labels).

I think the confusing thing at the moment is that the hash value used for the dropdown task can only be found in a workflow data dump; it is never exposed anywhere in the project builder. Here is a Freshdesk ticket that came about because of this: https://zooniverse.freshdesk.com/a/tickets/2270

eatyourgreens commented 4 years ago

African American Civil War Soldiers has probably the most complicated use of dropdown menus. It's worth talking to @snblickhan as to why they prefer menus instead of free text input, even for entering numbers.

From memory, their volunteers asked for autosuggest for place names, to reduce the amount of typing. The dropdown task will give you that, but the current implementation does mean that the browser must download all possible values that a volunteer might type.

The solution to that would be storing references to options in the workflow tasks, rather than copying the full list into each task. That's going to need work in the project builder. https://github.com/zooniverse/Panoptes-Front-End/issues/3636#issuecomment-317033717

eatyourgreens commented 4 years ago

Has anyone ever used the dropdown task with free text input turned on, i.e. annotations that look like:

{value: "This is some text I typed in", option: false}

I can see that being complicated to aggregate, particularly if the annotations for that task also include answers that look like:

{value: "ecc0e62c6c11", option: true}

If no one uses that option, I'd be inclined to remove it and make the dropdown a variant of the single choice answer task, using a combo box instead of a radio button group. Linked menus could still be implemented using the selected value from one menu to dynamically populate another.

CKrawczyk commented 4 years ago

The aggregation is quite simplistic at the moment: it does a counter operation on the list of input values that returns how often each unique element showed up (this is why it does not matter what format they are in), so they can be a mix of user input and hash values without issue.

The more "correct" thing to do would be to treat each user input option as free hand text and pass those into the text reducer, but more thought would have to go into what that aggregation object would look like so flattened csv didn't have mixed data types in a single column.

(I put "correct" in quotes because I am not convinced it would be more useful than the counter method. I think we would need research team feedback to answer that question.)
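
To make the counter method concrete, here is a minimal TypeScript sketch of the idea. It is only an illustration, not the aggregation repo's actual code; the names `DropdownAnswer` and `countValues` are hypothetical, and the sample values are the ones quoted earlier in this thread.

```typescript
// Shape of one dropdown answer, as in the annotation examples in this thread.
interface DropdownAnswer {
  value: string;   // option hash, or the text the volunteer typed
  option: boolean; // true when the value came from the menu
}

// Count how often each distinct value appears across classifications.
// Hashes and free text are treated alike: both are just strings.
function countValues(answers: DropdownAnswer[]): Record<string, number> {
  // A prototype-free object, so typed text such as "constructor" is counted safely.
  const counts: Record<string, number> = Object.create(null);
  for (const answer of answers) {
    counts[answer.value] = (counts[answer.value] || 0) + 1;
  }
  return counts;
}

const counts = countValues([
  { value: "ecc0e62c6c11", option: true },
  { value: "This is some text I typed in", option: false },
  { value: "ecc0e62c6c11", option: true },
]);
console.log(counts);
```

Because the `option` flag is ignored, the result mixes hashes and free text in one tally, which is the mixed-type column concern raised above.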

eatyourgreens commented 4 years ago

Here's a link for a workflow that's probably one of the hardest to work with: https://www.zooniverse.org/api/translations?http_cache=true&translated_type=workflow&translated_id=14384&language=en

The strings dictionary returned in that response is large enough that it locks up the browser when trying to inspect it in dev tools.

eatyourgreens commented 4 years ago

Worth noting that AACWS have uploaded their own custom gazetteer of 19th Century US states, counties and towns. I think Notes from Nature uses a preset gazetteer of modern countries and states that's built into the project builder.

eatyourgreens commented 4 years ago

Some thoughts about asynchronous loading

Say we have 2 menus. Menu 1 has 100 options. Each of those 100 links to a second menu that has 50 options, so 5,000 options total. A volunteer can select from each menu by typing to filter the initial list then selecting a value.

Current implementation

Download all 5,000 options up front. All filtering and selection is done in the browser. The entire set has to be downloaded (once), and held in memory, in order to select 2 values. Taken to its extreme, this gives us the 5.5MB downloads seen in AACWS.

Async menus

Download menu 1 in full. On selection of a value, lazy-load the linked options for menu 2. 150 options have to be downloaded in total. Data has to be downloaded each time we make a new classification (but the browser could cache responses.)

Performance is much, much better and we lose the problem of not scaling as menu 1 grows. Each of the 101 menus would be rendered (as JSON) on the server, and each given its own static URL.
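
A rough sketch of the async menu idea, assuming a loader function is injected rather than tied to any real endpoint. `LoadOptions`, `createLazyMenu` and the stand-in loader are hypothetical names for illustration; no URL scheme has been agreed in this thread.

```typescript
interface MenuOption {
  label: string;
  value: string;
}

// Hypothetical loader signature: given the value selected in menu 1,
// resolve the options for the linked menu 2 (for example by fetching a static JSON URL).
type LoadOptions = (parentValue: string) => Promise<MenuOption[]>;

// Wrap a loader so each linked menu is requested at most once per session.
// Note: a failed request would also stay cached in this simple version.
function createLazyMenu(load: LoadOptions): LoadOptions {
  const cache: Record<string, Promise<MenuOption[]>> = Object.create(null);
  return function loadLinkedOptions(parentValue: string) {
    if (!(parentValue in cache)) {
      cache[parentValue] = load(parentValue);
    }
    return cache[parentValue];
  };
}

// Stand-in loader that records how many requests were made.
let requestCount = 0;
const fakeLoad: LoadOptions = (parentValue) => {
  requestCount += 1;
  return Promise.resolve([
    { label: "Option for " + parentValue, value: parentValue + "-1" },
  ]);
};

const loadMenu2 = createLazyMenu(fakeLoad);
loadMenu2("840");
loadMenu2("840"); // served from the cache, no second request
loadMenu2("124");
console.log(requestCount);
```

Caching the promise rather than the resolved list means two quick selections of the same value share one in-flight request.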

Async filtering

Wait until the volunteer starts typing before requesting only the filtered, matching values for menu 1. Lazy load menu 2, making the request for matching options only after the volunteer starts typing. Only 5 or 6 matching options might have to be downloaded in total.

This could be the fastest option but now we're dynamically rendering the filtered lists of options on the server, with dynamic URLs based on the text that's been typed into the initial combo box.
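
Whichever side does the filtering (the browser today, or the server under the async filtering idea), the matching step might look something like this sketch. `filterOptions`, the default limit of 6 and the sample options are assumptions for illustration only.

```typescript
interface MenuOption {
  label: string;
  value: string;
}

// Return up to `limit` options whose label contains the typed text (case-insensitive).
// An empty query returns nothing, mirroring "wait until the volunteer starts typing".
function filterOptions(
  options: MenuOption[],
  typed: string,
  limit: number = 6
): MenuOption[] {
  const needle = typed.trim().toLowerCase();
  if (needle === "") return [];
  const matches: MenuOption[] = [];
  for (const option of options) {
    if (matches.length >= limit) break;
    if (option.label.toLowerCase().indexOf(needle) !== -1) {
      matches.push(option);
    }
  }
  return matches;
}

const states: MenuOption[] = [
  { label: "Illinois", value: "IL" },
  { label: "Indiana", value: "IN" },
  { label: "Iowa", value: "IA" },
];
console.log(filterOptions(states, "ind"));
```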

eatyourgreens commented 4 years ago

Download sizes from AACWS:

mcbouslog commented 4 years ago

I'm not sure this will entirely answer questions related to the dropdown options value as hash, but it's related to potential dependent dropdowns and how the task object is structured. For example, if the initial dropdown is for countries, then there's a dependent dropdown for states, then on states a dependent dropdown on counties, the county options are stored in an object keyed with the country and state values combined, i.e.:

[Image: Screen Shot 2020-06-19 at 1 19 03 PM]

In the screenshot above countries and states are presets, so instead of 13-digit hashes USA and IL are 840 and IL.

Note in the second object the options for Illinois (USA) counties is keyed with 840;IL. Similarly, for cities (dependent on county) for Cook county, Illinois, USA are keyed with 840;IL;bdd23baaa943d, where bdd23baaa943d is the value for Cook county.

I apologize as the preset values not being hashes makes this explanation for hashes confusing! But hopefully helps a little?

The label could be any string (potentially long, with spaces or special characters), which would/could cause issues as the dependent dropdowns are created (though maybe those issues could be addressed), so instead of using the user-entered label a randomly generated 13-digit hash is used.

Noting that if a dropdown is a single list (no dependent dropdowns), or what we're referring to as a simple dropdown, I don't think creating the value as a hash is necessary, though there may be some check on input/creation to prevent duplicate options.
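
The keying scheme described in this comment can be sketched as follows. This is an illustration rather than the PFE code: `optionsKey` and `optionsFor` are hypothetical helpers, and the use of `"*"` as the key for a menu with no parent is an assumption taken from the simple dropdown task JSON posted later in this thread.

```typescript
interface MenuOption {
  label: string;
  value: string;
}

// Options for one select, keyed by the values chosen in the menus it depends on.
type OptionsByParent = Record<string, MenuOption[]>;

// Build the lookup key from the parent selections, e.g. ["840", "IL"] -> "840;IL".
// Assumption: a select with no parents stores its options under "*".
function optionsKey(parentValues: string[]): string {
  return parentValues.length === 0 ? "*" : parentValues.join(";");
}

// Look up the options for the current parent selections, or an empty list.
function optionsFor(select: OptionsByParent, parentValues: string[]): MenuOption[] {
  const key = optionsKey(parentValues);
  return Object.prototype.hasOwnProperty.call(select, key) ? select[key] : [];
}

// Counties select: Illinois (USA) counties are stored under "840;IL".
const counties: OptionsByParent = {
  "840;IL": [{ label: "Cook", value: "bdd23baaa943d" }],
};

console.log(optionsKey(["840", "IL", "bdd23baaa943d"])); // "840;IL;bdd23baaa943d"
console.log(optionsFor(counties, ["840", "IL"])[0].label); // "Cook"
```

Joining on `;` only works if no value contains that character, which is one reason a generated hash is safer than an arbitrary label here.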

mcbouslog commented 4 years ago

I think this NfN workflow (link to it in lab) is a somewhat typical NfN workflow, including a dropdown for country->state->county.

mcbouslog commented 4 years ago

Small UI point regarding a date dropdown - there might be different use cases within this category, though the different use cases could then fall into a different mentioned dropdown version (i.e. simple dropdown). For example:

This is likely getting a little too detailed, but thought couldn't hurt to mention.

beckyrother commented 4 years ago

Hi all, this is really helpful to be thinking about for future iterations of the dropdown task. However, I'd like to shift a little to a very basic dropdown task, not linked to a combo task or any other dependencies.

Here's an article about dropdown design – they don't give a specific number of recommended items in a dropdown menu, but they do note that the dropdown shouldn't be longer than the viewport if possible (user shouldn't have to scroll).

The article also notes that it's better to allow users to type in familiar data like dates and times, but in this case it would be better to use premade inputs like the date and time pickers to avoid typos and create better data output for research teams. In addition to those date/time pickers, it would be great to be able to give project builders a few pre-built options for common data like location. Is this possible, and are there known repositories of that type of data that could be used so we don't need to build our own?

snblickhan commented 4 years ago

Here's an upcoming use case for a 'simple' dropdown: https://www.zooniverse.org/projects/msalmon/dreadnought-seamans-hospital-admissions-registers-1826-1930

Aim: transcribing handwritten, tabular data
Task(s): combo + dropdown

Combo task allows volunteers to transcribe multiple fields on the same 'page' without clicking Next. Dropdown menu allows volunteers to select fields like:

Additionally, there are some fields which include several common responses, such as:

Note that some of these would benefit from an option to 'add your own' but we can work around that if it's identified as being outside the scope of a 'simple' dropdown task.

Basically, all of these fields will be maximum 31 entries in the dropdown menu, with the exception of age, which I think is ok to have be a text entry field. All of these lists can be generated by the research team and are not reliant upon bringing in external info via csv, etc. Additionally, there are no dependencies between dropdown menus.

7/8 UPDATE: Here are the full fields for the Royal Museums Greenwich project linked above, with an example subject:

Number: numeric
Year (note: this field may be unnecessary as each book is 1 year, I believe?): numeric
Date of Entry (Month + Day): month + number
Name: text entry
Quality: alphanumeric
Age: numeric
Height (Feet + Inches): numeric
Place of Birth: text entry (location?)
Years at Sea (Navy + Merchant Service): numeric (sometimes includes fractions)
Last Services (ship name): text entry
Under what circumstances admitted (affliction): text entry (common list might also be an option, e.g. 'Dysentery', 'Fever', 'Headache' -- but not necessary)
Remarks as to general conduct, &c.: text entry (formulaic responses here: either 'An orderly Man' 'An orderly Lad' etc. so would really benefit from common list option with text entry for edge cases)
Date of Discharge (Month + Day + sometimes Year): month + number(s)
How Disposed of, Whether Shipped, Died, Run, sent Home, or expelled: text entry and/or common list
Amount of Slops and other Necessaries received: numeric
No. of Days Victualled: numeric

[Image: Screen Shot 2020-07-08 at 11 45 21 AM]

srallen commented 4 years ago

@snblickhan thank you for the examples. For this upcoming project here's how I'd categorize these kinds of dropdown inputs:

srallen commented 4 years ago

@beckyrother Grommet has examples of a few different ways to do (complex) date inputs. The underlying general component is the MaskedInput and seems highly customizable:

https://storybook.grommet.io/?path=/story/maskedinput--date --- as numbers but could probably be string month names and numbers
https://storybook.grommet.io/?path=/story/maskedinput--date-range --- dates as a range
https://storybook.grommet.io/?path=/story/maskedinput--date-time-drop --- dates with a calendar UI. Likely not great UI for our use case since typically we're dealing with historical dates?
https://storybook.grommet.io/?path=/story/maskedinput--filtered --- filtered selection/suggestion all in one input

eatyourgreens commented 4 years ago

OWD used a calendar picker, seeded from either the date of the page in the subject/group metadata or the last date you entered (local storage) so that you weren't always constantly paging back to 1914. That has its own problems, though: it's hard to find a date picker that works well with keyboard control or screen readers (keyboard control is probably a priority for transcribers.) Fuzzy dates can be hard to enter with a calendar picker. They usually require you to pick a specific day, or range of days, because they're aimed at things like ticket booking sites.

beckyrother commented 4 years ago

Good point about the historical dates. Seems like the first option, validating the input, will be the most useful, as long as we are able to specify the formatting (DD/MM/YYYY vs MM/DD/YYYY).
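
As a sketch of what validating typed input against a configurable format could involve, here is a small parser written independently of Grommet. `DateFormat`, `ParsedDate` and `parseDate` are hypothetical names, and restricting input to slash-separated numbers with a four-digit year is an assumption made to keep the example short.

```typescript
type DateFormat = "DD/MM/YYYY" | "MM/DD/YYYY";

interface ParsedDate {
  day: number;
  month: number; // 1-12
  year: number;
}

// Parse a typed date according to the configured format.
// Returns null when the text is malformed or names a day that does not exist.
function parseDate(text: string, format: DateFormat): ParsedDate | null {
  const match = /^(\d{1,2})\/(\d{1,2})\/(\d{4})$/.exec(text.trim());
  if (match === null) return null;
  const first = Number(match[1]);
  const second = Number(match[2]);
  const year = Number(match[3]);
  const day = format === "DD/MM/YYYY" ? first : second;
  const month = format === "DD/MM/YYYY" ? second : first;
  // Round-trip through Date: an impossible day such as 30 February rolls over
  // into the next month, so the components no longer match.
  // (Years below 100 are also rejected, because Date.UTC maps them to 19xx.)
  const check = new Date(Date.UTC(year, month - 1, day));
  const valid =
    check.getUTCFullYear() === year &&
    check.getUTCMonth() === month - 1 &&
    check.getUTCDate() === day;
  return valid ? { day: day, month: month, year: year } : null;
}

console.log(parseDate("03/04/1851", "DD/MM/YYYY")); // day 3, month 4
console.log(parseDate("03/04/1851", "MM/DD/YYYY")); // day 4, month 3
console.log(parseDate("30/02/2012", "DD/MM/YYYY")); // null
```

The same text yields different dates under the two formats, which is why the format would need to be set per project rather than guessed from the input.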

snblickhan commented 4 years ago

Note: I've added the full list of necessary fields for RMG. @mrniaboc let me know if you disagree with any of my interpretations of the field types/requirements.

mrniaboc commented 4 years ago

From an Engaging Crowds meeting with the Royal Museums Greenwich it's clear that a date picker function is also desirable. They have 25 date entries to be transcribed on each page and they have worries about people entering in different formats and how that will cause issues for data aggregation. It's clear a date picker would be super useful in this case, and I think many tabular data projects will find it useful too.

srallen commented 4 years ago

@mrniaboc would @beckyrother's suggestion in her last comment meet the requested functionality they're looking for? I think we would want to avoid a calendar UI for a date picker for a few reasons, notably from the usage experience that @eatyourgreens describes OWD as having.

eatyourgreens commented 4 years ago

Sorry for the confusion: the calendar picker worked really well on OWD, where you were recording specific days, but it is hard to find one that's keyboard accessible and can handle fuzzy dates like 'March 1851'. OWD used the JQuery UI date picker, which is very good but unfortunately also very out-of-date now.

srallen commented 4 years ago

Right, I think the keyboard accessibility and fuzzy dates are enough for us to look at a different UI. I think the masked input UI provided by Grommet that I linked to fits the various use cases.

ETA: I think if we can have a single UX for date transcription, we should go for that rather than support a calendar and a text input UI for transcribing dates.

mrniaboc commented 4 years ago

The RMG project data appears to always have day, month, and year (though year is recorded in a separate location), so I think these date picker solutions will work for it.

eatyourgreens commented 4 years ago

The grommet storybook allows invalid dates, so I guess validation would be on us. Their masking function expects month to be entered first, too, which won't work for English dates. If we're having to write our own functions to generate day numbers, we might be better off using existing calendar code. I'm not sure.

Screenshot of the Grommet masked input for a date that doesn't exist: 30th February 2012.
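
If we did end up writing our own functions to generate day numbers, a minimal version might look like the sketch below. `daysInMonth` and `dayNumbers` are hypothetical helpers; they assume the Gregorian calendar and a month given as 1-12.

```typescript
// Number of days in a month (month is 1-12), using the Gregorian leap-year rule.
function daysInMonth(year: number, month: number): number {
  const isLeap = (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
  const lengths = [31, isLeap ? 29 : 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];
  return lengths[month - 1];
}

// Day numbers to offer once a month and year are known: [1, 2, ..., n].
function dayNumbers(year: number, month: number): number[] {
  const days: number[] = [];
  const last = daysInMonth(year, month);
  for (let day = 1; day <= last; day += 1) {
    days.push(day);
  }
  return days;
}

console.log(daysInMonth(2012, 2));       // 29
console.log(dayNumbers(2012, 2).length); // 29, so 30 February 2012 is never offered
console.log(daysInMonth(1900, 2));       // 28
```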

srallen commented 4 years ago

The stories are examples of how the MaskedInput can be used, so yes, we would have to write a bit of our own code to meet our use case. A quick prototype could be done in a codesandbox. I'm sure the Grommet team would appreciate feedback on their story examples?

srallen commented 4 years ago

I know we're leaning away from a calendar UI, however, Grommet is soon to be releasing a separate DateInput component. It's available for preview on their storybook: https://storybook.grommet.io/?path=/story/dateinput--form

lcjohnso commented 4 years ago

I've compiled some stats on current Dropdown Task usage.

Projects: 42 total projects (requirement: active + launch approved) use dropdown tasks, but 31 are members of Notes from Nature or NestQuestGo organizations, leaving 11 non-organization projects.

Workflows and Tasks: There are 1896 individual Dropdown tasks across 474 unique workflows. Here, 1534 of 1896 tasks are single dropdown tasks -- the rest are under construction or multiple linked dropdowns (28 w/ 2, 225 w/ 3).

[Figure: DropdownStats_ndropdowns]

Details for Single Dropdown Tasks: median number of options = 34; distribution is broad extending up to ~1000 options (!!!), with peaks at 13, 32, and ~200.

[Figures: DropdownStats_options, DropdownStats_options_zoom]

eatyourgreens commented 4 years ago

At the extreme upper end, African American Civil War Soldiers. Their workflow is a 600k download, which takes 17s for me when I visit their page in Chrome. That's the workflow where the dropdown tasks are so large they can break dev tools if you try to expand the objects and inspect them.

shaunanoordin commented 4 years ago

Here's my update from the ongoing Engaging the Crowds work.

TL;DR: I'm working on a new Monorepo simple-Dropdown Task which is trying to respect the old PFE Dropdown's Task and Annotation data models. I'm also trying to figure out what the dropdown UI should look like.

Overview

Context + Scope:

Issues I'm noting, but not addressing in my work:

Dev Notes

Part 1: what I'm up to right now

Part B: next on my plate

Part III: thoughts on moving forward

Fun fact: Google Forms allows their analogous radio-button Single Answer tasks (called "multiple choice questions") to have an "Other: type in whatever" option, but not their Dropdown tasks; whereas we're the opposite. This makes me overthink my understanding of "dropdown" in the general context and in the Zooniverse context.

Part Fish: current Task and Annotation data models

Dropdown task data structure (simple single dropdown per task/non-cascading selection), example:

"T0":{
   "help":"",
   "next":"T1",
   "type":"dropdown",  /* simple dropdown with 3 options, and the ability to type in whatever */
   "selects":[
      {
         "id":"070b610fbf5d9",
         "title":"Alignment",
         "options":{
            "*":[
               {
                  "label":"Lawful Good",
                  "value":"4beef18e9baa"
               },
               {
                  "label":"True Neutral",
                  "value":"db939237aa00f"
               },
               {
                  "label":"Chaotic Evil",
                  "value":"fedae0dd4f96d"
               }
            ]
         },
         "required":false,
         "allowCreate":true
      }
   ],
   "instruction":"Dropdown with an 'OTHER' choice - type whatever!"
},
"T1":{
   "help":"",
   "type":"dropdown",  /* simple dropdown with 4 options, and NO free-answer choice */
   "selects":[
      {
         "id":"dbd3d991fcd9e",
         "title":"Colour",
         "options":{
            "*":[
               {
                  "label":"Red",
                  "value":"7fda9846fd624"
               },
               {
                  "label":"Yellow",
                  "value":"a2c9fbc881d7a"
               },
               {
                  "label":"Green",
                  "value":"baf1ae42d766f"
               },
               {
                  "label":"Blue",
                  "value":"6d13daea5706a"
               }
            ]
         },
         "required":true,
         "allowCreate":false
      }
   ],
   "instruction":"Dropdown with limited choices"
}

Dropdown annotation (classification) data structure, example:

"annotations":[
    {
       "task":"T0",  /* user typed in a free text answer, so we get option: false. */
       "value":[
          {
             "value":"THIS IS A FREE ANSWER",
             "option":false
          }
       ]
    },
    {
       "task":"T1",  /* user chose an answer from the dropdown, so we get option: true */
       "value":[
          {
             "value":"baf1ae42d766f",
             "option":true
          }
       ]
    }
 ]
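
To show how the two data models above relate, here is a hedged TypeScript sketch that types the task and annotation shapes and turns an answer back into readable text. The interface names and `answerText` are hypothetical; the field names and sample values come from the examples above, and each annotation's `value` array is assumed to hold one entry per select.

```typescript
interface DropdownOption {
  label: string;
  value: string;
}

interface DropdownSelect {
  id: string;
  title: string;
  options: Record<string, DropdownOption[]>; // "*" holds the options of a simple dropdown
  required: boolean;
  allowCreate: boolean; // true when free text is allowed
}

interface DropdownAnswer {
  value: string;
  option: boolean;
}

// Turn one answer back into readable text: the option label for a menu choice,
// the typed text for a free answer, or null if the hash is unknown.
function answerText(select: DropdownSelect, answer: DropdownAnswer): string | null {
  if (!answer.option) return answer.value;
  const options = select.options["*"] || [];
  for (const option of options) {
    if (option.value === answer.value) return option.label;
  }
  return null;
}

const colour: DropdownSelect = {
  id: "dbd3d991fcd9e",
  title: "Colour",
  options: {
    "*": [
      { label: "Red", value: "7fda9846fd624" },
      { label: "Green", value: "baf1ae42d766f" },
    ],
  },
  required: true,
  allowCreate: false,
};

console.log(answerText(colour, { value: "baf1ae42d766f", option: true })); // "Green"
console.log(answerText(colour, { value: "THIS IS A FREE ANSWER", option: false })); // "THIS IS A FREE ANSWER"
```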

srallen commented 4 years ago

I agree with distinguishing this as a separate task type, however, as noted:

the simple dropdown, after all, is just a modified Single Answer task (which we already have a very good pattern for) in a more compact UI, with the added wrinkle of the "Other: type in whatever you want" option.

I'm not even sure if we should allow free text entry with the simple type. With the new workflow steps feature, we can advise projects to just have a text task immediately after the simple dropdown task to capture any custom free text the project may want to know about. If we go this route, the annotation model can change to simply be the string value of the dropdown selection rather than a machine unique id, and we can drop the boolean value for option, which represents whether the string value was a free text entry or not.

That being said, if anyone has any information suggesting that dropping the free text entry from the dropdown task is a bad idea, please do share it (@snblickhan or @lcjohnso?). I'm not sure if we have a sense for how often this is used, but again it seems to me that this is another example of the task trying to do too many functions, which complicates its modeling and later downstream analysis. I think you're correct in describing the simple dropdown as a different skin on a single choice task, which is why I recommend we have a minimum and maximum number of options requirement for it. Too few, and the project owner should just use a single choice task. Too many, and we may redirect them to either the survey task or perhaps whatever form the async long list version takes.

eatyourgreens commented 4 years ago

Does anyone know why the single answer question passes the answer index as the annotation value, but picking a value from a dropdown passes a generated hash? It seems to me like both could pass the index of the chosen answer/menu option, but I'm wondering if there was some technical reason to prefer the hashes.

srallen commented 4 years ago

@mcbouslog shared previously why hashes were used: https://github.com/zooniverse/front-end-monorepo/issues/1679#issuecomment-646844464

The tl;dr answer is cascading dropdowns: array indexes aren't unique when you have a series of dropdowns. The simplified dropdown isn't cascading, so this isn't necessary. We could consider indexes as the annotation value for the simple dropdown.
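
For a non-cascading dropdown, mapping an existing hash value to an index is a simple lookup. The sketch below assumes the options array order is stable; `optionIndex` is a hypothetical helper, and the sample options are the Colour task from earlier in this thread.

```typescript
interface DropdownOption {
  label: string;
  value: string;
}

// Position of the chosen option in a simple (non-cascading) dropdown, or -1 if absent.
function optionIndex(options: DropdownOption[], hash: string): number {
  for (let i = 0; i < options.length; i += 1) {
    if (options[i].value === hash) return i;
  }
  return -1;
}

const colourOptions: DropdownOption[] = [
  { label: "Red", value: "7fda9846fd624" },
  { label: "Yellow", value: "a2c9fbc881d7a" },
  { label: "Green", value: "baf1ae42d766f" },
  { label: "Blue", value: "6d13daea5706a" },
];

console.log(optionIndex(colourOptions, "baf1ae42d766f")); // 2
```

One trade-off to keep in mind: an index only identifies an option while the list is unchanged, whereas a hash survives reordering.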