empiricaly / meteor-empirica-core

Core Meteor package for the experiment Empirica platform. This is where you should submit issues.
MIT License

Add "duplicate" field to spawn specified number of identical batches #198

Open hawkrobe opened 3 years ago

hawkrobe commented 3 years ago

Suppose we start a batch with a game count of 20, where each game requires 3 participants.

Three players trickle in, which would be sufficient to begin a game if they were all assigned to the same Lobby.

Instead, they are randomly assigned to different Lobbies, each ending up with only 1 of the 3 required players. At this rate, it will take too long before any particular Lobby accumulates enough players to start a game, and crowd-sourced participants will begin to drop out.

Expected Behavior

The player assignment logic should ensure that one lobby completely fills before players are assigned to empty lobbies.

Current Behavior

Players are assigned randomly to lobbies.

Possible Solution

I had assumed the initial lobby assignment logic in /api/players/methods.js would use a "rolling lobby" by default (i.e. starting games as soon as they can be started), but it looks like it only prioritizes games proportional to their size. It wasn't clear to me how to fix this using a custom Lobby.

Starting a game as soon as you have enough players available seems like a desirable default behavior for most Empirica users, and most multi-player designs. From reading a few other issues, I understand that there may be other use cases (e.g. lab experiments when you're able to get 20 people to simultaneously log on and want to make sure they're assigned completely randomly across groups, not in the order they join) but on crowd-sourcing platforms where the throughput is slower and more unpredictable, this isn't realistic. One solution allowing for more flexibility would be to create an additional LobbyConfig setting that specifies the group assignment logic (random vs. rolling vs. size-weighted)?
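To make the three options concrete, here is a minimal sketch of what such a setting could select between. Everything in it is hypothetical: `pickLobby`, `assign`, the strategy names, and the `{ id, needed, players }` lobby shape are assumptions for illustration, not Empirica's actual API or data model.

```javascript
// Hypothetical sketch: none of these names exist in Empirica's API.
// A lobby is modelled as { id, needed, players: [] }.

// Pick a lobby for a newly arrived player under one of three strategies.
function pickLobby(lobbies, strategy, rng = Math.random) {
  const open = lobbies.filter((l) => l.players.length < l.needed);
  if (open.length === 0) return null; // every lobby is full

  if (strategy === "rolling") {
    // Prefer the lobby closest to starting; ties go to the earliest lobby.
    return open.reduce((best, l) =>
      l.needed - l.players.length < best.needed - best.players.length ? l : best
    );
  }

  // "random" weights lobbies equally; "size-weighted" weights by lobby size.
  const weights = open.map((l) => (strategy === "size-weighted" ? l.needed : 1));
  const total = weights.reduce((a, b) => a + b, 0);
  let r = rng() * total;
  for (let i = 0; i < open.length; i++) {
    r -= weights[i];
    if (r < 0) return open[i];
  }
  return open[open.length - 1]; // guard against floating-point leftovers
}

// Place one player and return the lobby they landed in (or null).
function assign(lobbies, playerId, strategy, rng) {
  const lobby = pickLobby(lobbies, strategy, rng);
  if (lobby) lobby.players.push(playerId);
  return lobby;
}
```

Under the "rolling" branch, three players arriving into 20 empty 3-person lobbies all land in the first lobby, so a game can start immediately. Under the other two branches they will most likely be spread across different lobbies.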

Steps to Reproduce (for bugs)

  1. Clone any of the demos (e.g. random dot motion)
  2. Create a multi-player treatment with 20 games
  3. Add new player 3 times
  4. Look at assignment to lobbies
[Four screenshots attached]

Your Environment

I tried all three demos and two different browsers (Chrome, Firefox) using Empirica versions 0.14.0 and 1.11.1. I'm on Mac OS X 10.15.6.

amaatouq commented 3 years ago

First, I would suggest using "complete" randomization rather than "simple" when you create a batch. The "simple" approach tosses a coin to assign a new participant to treatment A or B. This can, by chance, generate an unequal number of games across treatments. The "complete" approach solves this problem by permuting the assignment sequence. So if you want three games in each treatment, the assignment takes AAA and BBB within a batch and permutes them into, e.g., AABABB.
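The difference between the two schemes can be sketched in a few lines. This is an illustration of the general technique only, not Empirica's implementation; the function names and the `rng` parameter are made up for the example.

```javascript
// Illustrative only; this is not Empirica's implementation.

// "Simple": an independent draw per slot, so treatment counts can be unequal.
function simpleAssignment(treatments, slots, rng = Math.random) {
  return Array.from({ length: slots }, () =>
    treatments[Math.floor(rng() * treatments.length)]
  );
}

// "Complete": build the exact multiset (e.g. AAA + BBB), then shuffle it
// with Fisher-Yates so every treatment appears exactly `perTreatment` times.
function completeAssignment(treatments, perTreatment, rng = Math.random) {
  const seq = treatments.flatMap((t) => Array(perTreatment).fill(t));
  for (let i = seq.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1));
    [seq[i], seq[j]] = [seq[j], seq[i]];
  }
  return seq;
}
```

Calling `completeAssignment(["A", "B"], 3)` always returns some ordering of three A's and three B's, whereas `simpleAssignment(["A", "B"], 6)` can return any mix, including six of the same treatment.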

[Screenshot]

Second, what you see is the desired behavior. Say you have 4 treatment conditions (A, B, C, D) in your experiment, and you want to start a batch with 4 games (i.e., one game per treatment). Suppose that each game requires 3 players to start. Then you want each participant to have 1/4 chance to be assigned to each game. Otherwise, the first 3 players will be assigned to the first game (i.e., there will be no randomization across treatments).

What you want to do is to bundle games with different treatments into batches. Say you want to have 5 games per treatment. One way to achieve this is to have 5 running batches, where each batch contains 4 games (1 of each treatment). This will ensure that the participants have a 1/4 chance of being assigned to each treatment. At the same time, participants will be assigned to the games in the first batch until all games have started before they are assigned to the games in the next batch, etc.
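As a rough sketch of that behavior, assume a batch is modelled as `{ games: [{ treatment, needed, players }] }` and that a game counts as closed once it has `needed` players. Both are simplifications for illustration (the real platform over-assigns players and tracks game start separately), and `assignToBatches` is a hypothetical name.

```javascript
// Hypothetical data model: a batch is
// { games: [{ treatment, needed, players: [] }] }.

// Assign a player uniformly at random among the open games of the
// earliest batch that still has one; later batches wait their turn.
function assignToBatches(batches, playerId, rng = Math.random) {
  for (const batch of batches) {
    const open = batch.games.filter((g) => g.players.length < g.needed);
    if (open.length === 0) continue; // batch is full, move on to the next one
    const game = open[Math.floor(rng() * open.length)];
    game.players.push(playerId);
    return game;
  }
  return null; // no open game in any batch
}
```

With 5 batches of 4 three-player games, the first 12 arrivals are randomized across the four treatments of batch 1, and only the 13th arrival is placed in batch 2.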

amaatouq commented 3 years ago

Here is an example of an ongoing experiment. Notice that all the games in batch 1 will start before participants are assigned to batch 2.

[Screenshot]

amaatouq commented 3 years ago

Also, most experiments on Empirica required > 20 MTurk participants to show up simultaneously. @JamesPHoughton and @joshua-a-becker have some interesting recruitment strategies to achieve this. Here is a post by @JamesPHoughton about this: https://forum.turkerview.com/threads/launching-multiplayer-games-poor-click-through.2419/

Here is a related issue on recruiting participants from MTurk: https://github.com/empiricaly/meteor-empirica-core/issues/105

hawkrobe commented 3 years ago

Thanks for your very helpful response!!

I think perhaps the specific use case to think about here is the scenario where I don't have different treatments at all, I just want to run 200 games with exactly the same settings (say, in the simplest case, these are dyadic games, so I recruit 400 turkers to be paired up). This is clearly not the most ambitious use of Empirica but I do think it's a common setting for researchers.

The UI currently suggests that to get 200 games, you enter 200 in the Game Count field (even using complete assignment):

[Screenshot]

It sounds like you're saying I instead need to create 200 Batches, each containing one game, in order to get the desired 'rolling' assignment behavior. But there's no way to automatically create large numbers of batches, so this would require clicking New Batch 200 times, no?

> What you see is the desired behavior. Say you have 4 treatment conditions (A, B, C, D) in your experiment, and you want to start a batch with 4 games (i.e., one game per treatment). Suppose that each game requires 3 players to start. Then you want each participant to have 1/4 chance to be assigned to each game. Otherwise, the first 3 players will be assigned to the first game (i.e., there will be no randomization across treatments).

Again, I totally understand this is the desired behavior for certain kinds of settings where you want to run a small number of very large games. I'm worried about the desired behavior where you want to run a large number of very small games, or only one unique treatment, where we don't have the same randomization concerns and instead have concerns about the recruitment 'flow'. Does that make sense?

hawkrobe commented 3 years ago

> What you want to do is to bundle games with different treatments into batches. Say you want to have 5 games per treatment. One way to achieve this is to have 5 running batches, where each batch contains 4 games (1 of each treatment). This will ensure that the participants have a 1/4 chance of being assigned to each treatment. At the same time, participants will be assigned to the games in the first batch until all games have started before they are assigned to the games in the next batch, etc.

I think this idea suggests a best-of-both-worlds approach where the semantics of Game Count are a little clearer in the UI (i.e. it simply corresponds to how many times you want a particular treatment to be hit and you don't need to manually spawn 200 batches) but on the backend, you parse this up into batches where participants get randomly assigned to treatments and the 2nd game in a given treatment doesn't start until the 1st game in all treatments is filled. This gives the desired behavior in both settings, no? (i.e. if you have 200 games using only one treatment, it'll start the 2nd game immediately after the 1st fills; if you have 2 games using 4 unevenly sized treatments, it'll do the nice permutation behavior you described but then loop back around after all 4 treatments are filled.)

amaatouq commented 3 years ago

I see your point. And yes, I am suggesting (for your particular case) to run 200 batches, each with 1 game... or perhaps 20 batches, each with 10 games. The latter will reduce the number of times you need to duplicate the batch, and perhaps it won't take too long for games to start. An alternative is to write a script (in any language you want) that inserts 200 batches into the database directly.

I agree that this setup is not ideal when you want to run a large number of games with one unique treatment. But this is a weird use case, I think; if there is no random assignment, then there is no experiment, no? (unless it is within-subject)

Of course, there is no reason for Empirica not to support this. Perhaps Empirica can be used to collect observational data, individual-level labeling tasks, purely within-subject designs, etc. One easy solution is to add a field when you click "duplicate" that asks the user to input the number of duplicates to create. In that case, you'll just click "duplicate," enter "200" and you should be all set.
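The logic behind such a field could look roughly like the sketch below. It is purely illustrative: `duplicateBatch` is a hypothetical helper, and the batch fields used here (`games`, `index`) are assumptions, not Empirica's database schema.

```javascript
// Hypothetical helper; field names are assumptions, not Empirica's schema.

// Return `copies` independent copies of a batch configuration.
function duplicateBatch(batch, copies) {
  if (!Number.isInteger(copies) || copies < 1) {
    throw new RangeError("copies must be a positive integer");
  }
  return Array.from({ length: copies }, (_, i) => ({
    ...batch,
    // Copy each game object so the duplicates do not share mutable state.
    games: batch.games.map((g) => ({ ...g })),
    index: i + 1, // 1-based position of this copy
  }));
}
```

For the single-treatment case discussed above, `duplicateBatch({ games: [{ treatment: "dyad" }] }, 200)` would yield 200 one-game batch configurations from a single action.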

@npaton, what do you think?

amaatouq commented 3 years ago

> I think this idea suggests a best-of-both-worlds approach where the semantics of Game Count are a little clearer in the UI (i.e. it simply corresponds to how many times you want a particular treatment to be hit and you don't need to manually spawn 200 batches) but on the backend, you parse this up into batches where participants get randomly assigned to treatments and the 2nd game in a given treatment doesn't start until the 1st game in all treatments is filled. This gives the desired behavior in both settings, no? (i.e. if you have 200 games using only one treatment, it'll start the 2nd game immediately after the 1st fills; if you have 2 games using 4 unevenly sized treatments, it'll do the nice permutation behavior you described but then loop back around after all 4 treatments are filled.)

This is an interesting idea. I think we still need to bundle things in batches in the admin UI. That is, it is common that a user wants to have different treatment frequencies in different batches. For instance, if a game of a particular treatment failed (or didn't start in batch 1), then you want to oversample this treatment in batch 2.

Very soon, we will release a custom assignment API. It will look something like this:

Empirica.assign(player,batches){}

Where you'll be able to assign a player to any game within an active batch.

That said, we still need to provide the typical user with some commonly used defaults in the admin UI (e.g., simple, complete, blocked, rolling, etc.).

BTW, @hawkrobe, we are pretty active on Slack: https://join.slack.com/t/empirica-ly/shared_invite/zt-5y98b811-1ATG7hv5tnkZy~NU7IoGqg if you'd like to join us. These types of discussions are very useful right now as we are planning a complete rewrite of Empirica's core and we need to think of many of these use cases.

hawkrobe commented 3 years ago

Thanks for this thoughtful discussion! I like the 'duplicate batch' idea a lot as a quick & easy fix. In principle, that would give a nice refinement of the concepts surrounding the nesting of the randomization design. You have an 'inner loop' specifying simultaneous recruitment for some specified number of games/treatments within a batch and an 'outer loop' that repeats that structure sequentially some number of times. The Empirica.assign() seems like a great addition for more flexibility in the longer term!

> I think; if there is no random assignment, then there is no experiment, no? (unless it is within-subject)

Yep, within-subject 😄 most of the studies we do are like this (or, sometimes within-network) for statistical power; participants are randomly assigned to conditions within each network, but every network has the same 'hyper-parameters' at the level of treatments.

fwiw, I think the inconvenience of manually spawning 200 batches is pretty significant even if there were 2 treatments (e.g. solo vs. dyadic) where you'd want to randomly assign the first few participants to the solo vs. dyad groups in a balanced way to let the dyad start, and then 'duplicate' this 'recruitment block' a bunch of times in the outer loop, rather than randomly assigning among the 200 dyadic games and 200 solo games all at the same time.

jcheong0428 commented 3 years ago

I second the need for a rolling assignment. Alternatively, a lobby configuration to designate the max # of players assigned to each game would also be helpful.

In use cases like @hawkrobe's, where you want to run 20 games each requiring 3 participants, it would be great to specify that each game is assigned a maximum of n players (max-player) in the pre-lobby state. If I set this variable to 3, I can make sure each game is assigned 3 players before the next games are filled (essentially rolling assignment). But this feature would also allow you to increase max-player to 5 or 6, accounting for potential drop-offs of 1-2 players. This way you can make sure only a certain number of people would end up in the "gameFull" state.

amaatouq commented 3 years ago

@jcheong0428 We over-assign players to games because some of these players might not finish the instructions, fail the quiz, leave their computers, etc. Once the game starts, the remaining over-assigned players are reassigned to other games of the same treatment in the same batch. If there aren’t other available games in the same batch, they’ll be reassigned to games of the same treatment in the next batch. And so on.

Players will see "gameFull" ONLY if there are no more games of the same treatment in any of the active batches.
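The reassignment rule described in this comment can be sketched as a search over the active batches. The data model (`{ games: [{ treatment, started }] }`), the `reassign` name, and the returned object shape are all assumptions made for this example. The sketch also assumes batches earlier than the player's current one have no unstarted games left, so the search begins at the current batch.

```javascript
// Hypothetical model: `batches` is the ordered list of active batches,
// each { games: [{ treatment, started }] }.

// Find the next unstarted game of the same treatment, looking first in the
// player's current batch and then in each later batch in order.
function reassign(batches, fromBatchIndex, treatment) {
  for (let b = fromBatchIndex; b < batches.length; b++) {
    const game = batches[b].games.find(
      (g) => g.treatment === treatment && !g.started
    );
    if (game) return { status: "assigned", batchIndex: b, game };
  }
  // No game of this treatment remains in any active batch.
  return { status: "gameFull" };
}
```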

joshua-a-becker commented 3 years ago

sorry i missed this thread previously—i'm not super hooked into GitHub.

re: rolling assignment.. this is i think something a lot of experimenters want, but it's not optimal. IMHO, empirica defaults should reflect best practice.

it's feasible to use the current version with crowdsourcing platforms and get large groups, one just has to employ a certain strategy for recruitment. i have gotten 1000's of ppl on empirica simultaneously via MTurk.

hawkrobe commented 3 years ago

to be clear @joshua-a-becker, I don't think this thread is about people having problems getting large groups, or wanting to deviate from best practices for assigning to multiple conditions.

it's about the current clunkiness/unintuitiveness of the UI for best practices in one of the most common special cases: manipulations within small networks, where you want to run a large number of 3-4 person networks with identical treatment settings. currently you have to spam lots and lots of batches with exactly one game (i.e. click the 'duplicate' button 1000 times).