Watts-Lab / task-mapping

An ongoing map of the tasks in the team performance literature.
http://taskmap.seas.upenn.edu/

🗺️ 🐘 Creating a Public-Facing Website (taskmap.seas.upenn.edu) #382

Open xehu opened 1 year ago

xehu commented 1 year ago

Update for Discussion on 6/8/2023

We currently have a working version of the Task Map website fully linked up to our vanity URL (and not just published on Shiny): https://taskmap.seas.upenn.edu/

This website provides basic information about the Task Map and gives users a simple way to download and interact with the data.

Requirements for Next Steps (Spec for Full Community Contributor Pipeline)

I created a document to "spec" out the next steps of the website, here: https://docs.google.com/document/d/1zH4jPVjZsd3ay4JgWZntv_3kUCk9DMR7FvZp2xbIvHo/edit?usp=sharing

The major thing missing from the current version is an easy method of adding to the Task Map. After writing out and thinking through the requirements, I believe we will need to build at least 4 distinct pieces:

  1. Task (Row) Submission Form --- for people interested in submitting new tasks/rows. There are, in theory, two ways to submit new rows (which are analogous for columns); "METHOD 1" is to submit a new raw row of numbers, representing a task with complete ratings; "METHOD 2" is to pose a new task, and have us handle all the ratings. Ideally, the system would allow the user to choose between these methods, and provide the appropriate interface for each one. However, there are a few caveats...
    • 1a. It is not clear whether users should be able to append a raw row (METHOD 1). 23 of the current 24 columns are measured via a very specific instrument --- asking raters in our panel to rate each task on certain questions, and then averaging the results. If someone else simply adds their own data, there is no guarantee (in fact, we can be quite sure of the opposite) that the community contributor used the same "instrument" to measure those columns. This compromises the integrity of the Task Map.
    • Ideally, then, users should use METHOD 2, in which they submit a writeup of the Stimulus Complex/Goal Directives of new tasks (e.g., into a Qualtrics form), and we do the ratings for them.
    • However, this requires handling different rating mechanisms and recombining columns --- see 3c, below.
    • 1b. (Optional) For METHOD 2, we should have automated validation of written tasks (what if people submit junk, and then we 'rate' it?)
  2. Automated Task Uploading Pipeline --- once users submit a new task writeup via METHOD 2, the task needs to be added to our official lists of tasks and saved in the repository; currently, I have some scripting that pulls the data from the Qualtrics form and then reformats it, but I run the code and do all the checks manually.
  3. Automated Task Rating Pipeline --- where we detect that new tasks have been submitted and ping the panel. Currently, this entire process is semi-automated, but every step (from adding the tasks online, to notifying the workers, paying them, etc.) is triggered manually.
    • 3a. This rating pipeline also needs to be substantially updated to handle two use cases: one for adding a new task, and one for adding a new column. Depending on what is being added, the pipeline will ask raters (i) one question at a time for all tasks (when 1 new column is added); or (ii) 24+ questions at a time for a single new task (when 1 new task is being added).
    • 3b. It also needs to handle edge cases for manually-entered dimensions. For example, suppose we start with 102 tasks and 24 dimensions. A user submits a new column via METHOD 1; they append an already-rated column via some black-box method (e.g., expert rating). Then we have 102 tasks and 25 dimensions. When a second user submits a new task, the 25th dimension will not have a valid value for the 103rd task (because it was computed manually by a user). We will need to either store that as an 'NA' or reproduce the method by which the new column was determined.
    • 3c. It needs to handle different rating mechanisms. What if some columns need to go through MTurk, some through ChatGPT, and some through other means? Currently, it's "easy" to rate a new task because all of the column ratings were obtained the same way. But if we are going to support different instruments in the future, we will have to separately measure all the different types of columns, and then recombine in the end.
    • 3d. It needs to merge the new ratings with the existing task map (all this is currently performed manually).
  4. Feature (Column) Submission Form --- for people interested in submitting new task features. As with tasks, there are two ways to submit new columns: "METHOD 1" is to submit a raw column of numbers; and "METHOD 2" is to pose a question, and have us handle the ratings. The system should branch to give users a different "flow" depending on whether they want to submit via METHOD 1 or METHOD 2.
    • 4a. METHOD 1 should be a form with automation that does the right type checking (are the values numeric, and is there a valid rating for every task?), asks the user for the relevant information (what does this column measure? how can we reproduce this instrument?), and then appends the column to the task map (a rough sketch of this check appears after this list).
    • 4b. For METHOD 2, we need the user to submit the relevant questions for rating. Then, it should trigger the Automated Task Rating Pipeline (3a, case i): asking users 1 question at a time, for all tasks. Once rated, the new column would get appended to the Task Map.
    • 4c. Regardless of method, all reproducible information about task columns needs to be succinctly stored. Currently, we only store the questions used for rating-based columns, and we don't really support people appending columns that are calculated in some other way; when I appended one extra column --- for the economic games --- I effectively did so manually. The system will break down if we don't establish a consistent place for explaining how each column is calculated, what underlying data it captures, and other basic information.
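
To make the type checking in 4a more concrete, below is a minimal sketch in R. It assumes, purely for illustration, that the Task Map is stored as a CSV with a task_name column and that a contributor submits a two-column CSV of task names and values; the file names, column names, and function are hypothetical placeholders rather than the repository's actual layout.

```r
# Hypothetical sketch of the METHOD 1 column check described in 4a. File names
# and column names ("task_name", "value") are illustrative assumptions only.
validate_and_append_column <- function(map_path, submission_path, column_name) {
  task_map   <- read.csv(map_path, stringsAsFactors = FALSE)
  submission <- read.csv(submission_path, stringsAsFactors = FALSE)

  # Type check: every submitted rating must be numeric.
  if (!is.numeric(submission$value)) {
    stop("Submitted ratings must be numeric.")
  }

  # Coverage check: exactly one non-missing rating per existing task.
  missing_tasks <- setdiff(task_map$task_name, submission$task_name)
  if (length(missing_tasks) > 0) {
    stop("Missing ratings for: ", paste(missing_tasks, collapse = ", "))
  }
  if (anyDuplicated(submission$task_name) > 0 || anyNA(submission$value)) {
    stop("Each task needs exactly one non-missing rating.")
  }

  # Align the submission to the map's task order and append it as a new column.
  task_map[[column_name]] <- submission$value[match(task_map$task_name,
                                                    submission$task_name)]
  task_map
}

# Hypothetical usage:
# updated_map <- validate_and_append_column("task_map.csv", "new_column.csv",
#                                           "my_new_dimension")
```

Even with a check like this, the documentation requested in 4a (what the column measures, how the instrument can be reproduced) would still need to be collected and stored alongside the numeric values, per 4c.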

Short Term Solution -- Using GitHub Pull Requests

A simpler, short-term alternative that has been suggested is to create a Pull Request template on GitHub and ask potential contributors to simply open a pull request whenever they wish to modify the Task Map.

Thinking through what this would look like, I feel that a PR-based modification system would require the following:

  1. A new, clean repo. Our current repository includes both the "new" write-ups and the "old" write-ups (V1 of the Task Map); additionally, there is a bunch of code that is specific to the existing pipeline for creating and rating tasks (pulling write-ups from Qualtrics --> manual cleaning --> pushing to Glitch for rating --> assembling the Task Map). All of this code would only cause confusion for community contributors, because we would be asking them to use a very different system (Pull Requests) to add to the Map. My proposal is that we take the bare minimum elements of the Task Map repo --- the map itself, the documentation of the questions, and the clean, new write-ups --- and create a "public-facing" repository from just those pieces. The old write-ups are nice to archive for posterity, but ultimately not recommended for public release.
  2. Only accepting fully-rated rows / columns via PR. A minimally-viable version of the PR system would likely not include an automated rating system. Therefore, we'd only allow people to append full rows or columns to the table ("METHOD 1" from above) --- users would not be able to propose a question, then ask us to get the ratings from the panel and append them. Eventually, we could build out the automated rating pipeline, and allow users to choose whether to append a fully rated row/column or to have us handle the ratings. But at this point, we would have basically built the entire proposed pipeline above.
  3. Validation for new rows/columns --- since we would only accept full rows and columns, we would need rules for validating the data (a rough sketch of such an automated check follows this list). Columns are easier to validate --- a submitted column just needs to be numeric and contain one value for every existing task. Rows are harder; given point 1a (under "Requirements for Next Steps"), it's not clear whether people should be appending raw rows at all, since they don't have access to the original instrument (the panel).
  4. Documentation for new rows/columns --- we need to ensure that documentation for tasks and columns are consistent: see 2 and 4c, above.
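
As a rough illustration of point 3, here is a hedged sketch of a validation check that could run automatically on every pull request (e.g., in CI). It assumes, only for the sake of the example, that the map lives at data/task_map.csv with a task_name column followed by numeric dimension columns; the path and layout are placeholders, not the repo's actual structure.

```r
# Hypothetical PR check: verify that a modified task map file is still valid.
# The path and column layout are assumptions for this sketch.
check_task_map <- function(path = "data/task_map.csv") {
  task_map <- read.csv(path, stringsAsFactors = FALSE)
  errors <- character(0)

  # Every task needs a unique, non-empty name.
  if (anyDuplicated(task_map$task_name) > 0 ||
      any(is.na(task_map$task_name) | task_map$task_name == "")) {
    errors <- c(errors, "Task names must be unique and non-empty.")
  }

  # Every dimension column must be numeric. (Whether NAs are allowed depends
  # on the manually-entered-dimension edge case discussed in 3b above.)
  dimension_cols <- setdiff(names(task_map), "task_name")
  non_numeric <- dimension_cols[!sapply(task_map[dimension_cols], is.numeric)]
  if (length(non_numeric) > 0) {
    errors <- c(errors,
                paste("Non-numeric columns:", paste(non_numeric, collapse = ", ")))
  }

  if (length(errors) > 0) stop(paste(errors, collapse = "\n"))
  invisible(TRUE)
}
```

A CI job could then fail any PR for which check_task_map() raises an error, before a human reviewer ever looks at the documentation.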

This minimally-viable solution (using PRs) would cut out some of the engineering for the automated pipeline, but would still require a substantial amount of work to establish the general submission and validation process.

Discussion Questions

After writing out all these requirements, this seems like a substantial amount of engineering work, even for a "minimum," PR-based version. How do we propose moving forward with all these pieces?


Goals

We would like a public-facing artifact for the Task Mapping project that achieves the following goals:

Considerations

@duncanjwatts and @amaatouq, can you please add the appropriate funding sources for this project that we would need to add on any public-facing website?

Name

taskmap.seas.upenn.edu. @markwhiting has already reached out to CETS to get this process started.

Next Steps

@xehu to set up a Shiny app (using R) and point it to a GitHub page (I do have some experience with this, but @linneagandhi it would be helpful to see your example as well!)

linneagandhi commented 1 year ago

@xehu just tagging here our higher investment ideas: retool.com, observable.com.

xehu commented 1 year ago

Hey all, here is an update to the public-facing website!

I have created a working Shiny app (with a fun interactive version of the map) here: https://xehu.shinyapps.io/TaskMap/

Open issues / next steps:

markwhiting commented 1 year ago

Re Linking etc. --- I'd use GitHub Pages to build a site and embed the Shiny app. I think this won't work with the README-style hosting. You can find out more about the build options here: https://docs.github.com/en/pages/quickstart

Re Contribution etc. --- we should clean up the repo so that we can accept PRs and run tests to make sure that the new data complies with any data requirements we have. Then people can just clone and PR back. This way they can add tasks or questions, and we can ensure that nothing breaks our existing analyses or visualizations.

linneagandhi commented 1 year ago

Love the shiny app!! How long did that take to set up?

xehu commented 1 year ago

Thank you!! The app took about an evening's worth of work, so it actually wasn't too bad. I still need to figure out how to link it to our domain URL though, so it's not completely finished! And I had other features I wanted to try to add ... I had this idea of making an app that lets the user "select" tasks and fit a live machine learning model. Unfortunately I wasn't able to get that up and running!


shapeseas commented 1 year ago

@xehu some reference points that come to mind:

shapeseas commented 1 year ago

@xehu following up on yesterday's conversation and @linneagandhi 's thoughts and questions on creating "calls to action", I wonder if part of the ask for the advisory board group is to support a team in launching this public-facing Task Map as a public good / living archive / open-source project (however we want to frame it). Basically "here's the prototype, but to take it to the next level where we can accept contributions from other research teams we need X, Y, Z", where X is $, Y is potential collaborators, and Z is more money.

We could prepare a one-pager on the vision of the project, with details on the team needed to pull it off. I don't think anyone in that meeting will hand us money, but if the ask is clear, they may connect us with potential funders or collaborators.

duncanjwatts commented 1 year ago

Definitely something to raise in the discussion.