Watts-Lab / commonsense-platform

Commonsense platform
https://commonsense.seas.upenn.edu

Evaluating if statement properties are reasonable design points #65

Closed markwhiting closed 11 months ago

markwhiting commented 1 year ago

Let's test:

  1. a single set of 15 statements for 50 people
  2. a single set of 15 statements at the design point, e.g., [0,1,0,0,0,0], for 50 people
  3. a single set of 15 statements at the design point, e.g., [1,0,0,1,1,1], for 50 people

Here, each position in a design point corresponds to one statement property, in this order: behavior, everyday, figure_of_speech, judgment, opinion, reasoning
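To make the vector notation concrete, here is a minimal sketch of converting between a design point such as [0,1,0,0,0,0] and a named object. The property order comes from the list above; the helper names `designPointToObject` and `objectToDesignPoint` are hypothetical and not part of the platform.

```javascript
// Property order as listed in this issue.
const PROPERTIES = [
  "behavior",
  "everyday",
  "figure_of_speech",
  "judgment",
  "opinion",
  "reasoning",
];

// Hypothetical helper: turn a 0/1 vector into an object keyed by property name.
function designPointToObject(point) {
  if (point.length !== PROPERTIES.length) {
    throw new Error(`expected ${PROPERTIES.length} values, got ${point.length}`);
  }
  return Object.fromEntries(PROPERTIES.map((name, i) => [name, point[i] === 1]));
}

// Hypothetical inverse: turn a named object back into a 0/1 vector.
function objectToDesignPoint(obj) {
  return PROPERTIES.map((name) => (obj[name] ? 1 : 0));
}
```

With this mapping, [0,1,0,0,0,0] reads as "everyday only", and [1,0,0,1,1,1] as "behavior, judgment, opinion, and reasoning".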

An experiment has a file that exports a manifest which has some properties that fully define* the experiment. The experiment manifest file could be identified with a URL property.

manifest = {
  treatments: [
    {statements: ... (list of statements, or callback),
     randomization: ... (none, fully random, callback),
     ...
    },
    {...},
    {...},
    ], // treatments could be a callback if we are exploring a large design space 
  assignment: [random, round robin (1, then 2, then 3, then 1,...), callback()]// how are people assigned to a treatment?
}

Randomly assign new visitors to one of these three options and record which one they received.
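As a rough sketch of the `assignment` options in the manifest, the function below supports "random" and "round robin" and remembers which treatment each visitor received. The name `makeAssigner`, the injectable `rng` parameter, and the in-memory `Map` are assumptions for illustration; a real deployment would presumably persist assignments in a database.

```javascript
// Hypothetical factory: returns a function that maps a visitor id to a treatment index.
function makeAssigner(strategy, treatmentCount, rng = Math.random) {
  let cursor = 0; // next treatment to hand out under round robin
  const assignments = new Map(); // visitorId -> treatment index ("take note")

  return function assign(visitorId) {
    // Returning visitors keep the treatment they were first given.
    if (assignments.has(visitorId)) return assignments.get(visitorId);

    let index;
    if (strategy === "round robin") {
      index = cursor;
      cursor = (cursor + 1) % treatmentCount;
    } else if (strategy === "random") {
      index = Math.floor(rng() * treatmentCount);
    } else {
      throw new Error(`unknown assignment strategy: ${strategy}`);
    }
    assignments.set(visitorId, index);
    return index;
  };
}
```

Round robin keeps group sizes balanced, which matters when each treatment needs a fixed number of people; pure random assignment only balances in expectation.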

Might be worth reading this to think about the manifest specification → https://docs.empirica.ly/overview/concepts (and perhaps other parts of the docs)

amirrr commented 1 year ago

Whenever a new user requests a statement, the manifest file is looked up to choose a method for serving statements to that user. Maybe each treatment has a unique id so the frontend can keep track of what to do next.

// callbacks: length is considered when using the callback?
getFromAllstatements()
getFromListOfStatements([....])
getFromCategories(object of categories)
module.exports = {
  treatments: [
    {
      id: 1,
      length: 15,
      statements: [5, 12, 14, 15, 20], // (list of statements, or callback)
      randomization: "none", // (none, fully random, callback)
      subjects: 50,
    },
    {
      id: 2,
      length: 15,
      statements: getAllStatements([5, 12, 14, 15, 20]),
      randomization: "random",
      subjects: 50,
    },
    {
      id: 3,
      length: 10,
      statements: getDesignSpace({
        // design space parameters
        behavior: false,
        everyday: true,
        figure_of_speech: false,
        judgment: false,
        opinion: false,
        reasoning: false,
      }),
      randomization: "none",
      subjects: 20,
    },
  ],

  // how are people assigned to a treatment?
  assignment: "random", // [random, round robin (1, then 2, then 3, then 1,...), callback()]

};
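One possible reading of how a treatment's `statements` field could be resolved, sketched under several assumptions: that `getDesignSpace` returns a callback which filters a statement pool, that statements carry boolean property fields, and that `length` caps the number served. The statement schema and the `resolveStatements` helper are hypothetical; the thread does not show the real ones.

```javascript
// Assumed behavior: build a callback that keeps statements matching every given property.
function getDesignSpace(params) {
  return (statements) =>
    statements.filter((s) =>
      Object.entries(params).every(([name, value]) => Boolean(s[name]) === value)
    );
}

// Hypothetical resolver: accepts either a list of statement ids or a callback.
function resolveStatements(treatment, allStatements) {
  const pool =
    typeof treatment.statements === "function"
      ? treatment.statements(allStatements)
      : allStatements.filter((s) => treatment.statements.includes(s.id));
  // `length` is treated as an upper bound on how many statements are served.
  return pool.slice(0, treatment.length);
}
```

Whether `length` should also apply to callbacks is the open question raised in the comment above; this sketch applies it uniformly.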
markwhiting commented 12 months ago

Some brainstorming on how to design treatments

// const treatments: [design_space_experiment, sampling_count_experiment] 

typeof treatments[0].length === "number"

const treatments = [
  {function: design_space_experiment, parameters: {length: 10}},
  {function: design_space_experiment, parameters: {length: 20}},
  {function: design_space_experiment, parameters: {length: 30}},
  sampling_count_experiment
]

design_space_experiment 
  select a random design point from [...]
  return generator that leverages that design point

// yield is only valid inside a generator function (function*), not an arrow function
function* generator_prototype(design_space_goal, user_id) {
  yield statements.filter(statement => statement.design_space_position == design_space_goal).pick()
}

const function1 = (user_id) => generator_prototype([1,2,3,4], user_id)
const function2 = (user_id) => generator_prototype([1,2,2,4], user_id)

// yields `length` random statements; the generator reports done once the loop ends
function* random_statement(length) {
  for (let i = 0; i < length; i++) yield statements.pick()
}

const fifteen_random_statements = random_statement(15)

const weighted_random_statement = () => ... 

const sampling_count_experiment = (min, max) => {
  return {
    treatment_index: 1,
    // random number of statements between min and max, inclusive
    treatment_generator: random_statement(min + Math.floor(Math.random() * (max - min + 1)))
  }
}

some generator → {statement: "2 + 2 = 4", interface_params: {language: "Korean", background_color: "red"}}

some generator → {statement: "2 + 2 = 5"}

What we might need for the current experiment:

// a single set of 15 statements for 50 people
const random = generator_prototype_for_15(...)

// a single set of 15 statements at the design point: e.g.,[0,1,0,0,0,0], for 50 people
const point1 = generator_prototype_for_15([0,1,0,0,0,0])

// a single set of 15 statements at the design point: e.g.,[1,0,0,1,1,1], for 50 people
const point2 = generator_prototype_for_15([1,0,0,1,1,1])

const treatments = [random, point1, point2]
const assignment = round_robin

generator_prototype = (user_id,n) => {return some_generator_function (n) => {return {statement:statement, user: user_id}}}
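The brainstormed generators could look roughly like this in runnable form. This is only a sketch: `pick`, `randomStatements`, and `designPointStatements` are hypothetical names, statements are assumed to carry a `design_space_position` array, and sampling is with replacement (the thread does not say whether repeats are allowed).

```javascript
// Hypothetical helper: uniform random element; `rng` can be replaced for testing.
function pick(items, rng = Math.random) {
  return items[Math.floor(rng() * items.length)];
}

// Yields `length` random statements, then finishes.
function* randomStatements(statements, length, rng = Math.random) {
  for (let i = 0; i < length; i++) {
    yield pick(statements, rng);
  }
}

// Yields up to `length` statements whose design point equals `designPoint`.
function* designPointStatements(statements, designPoint, length, rng = Math.random) {
  // Arrays are compared by content here; `==` on two arrays compares references.
  const target = JSON.stringify(designPoint);
  const pool = statements.filter(
    (s) => JSON.stringify(s.design_space_position) === target
  );
  for (let i = 0; i < length && pool.length > 0; i++) {
    yield pick(pool, rng);
  }
}
```

Each treatment is then just a generator instance per user, for example `designPointStatements(statements, [0,1,0,0,0,0], 15)`, and the server calls `next()` each time that user asks for a statement.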

Sample of generator use and deletion (deletion tracks a done flag per user, since calling next() to check for completion would advance the generator)

const express = require('express');
const app = express();

// Store for each user's generator and whether it has finished
const userGenerators = {};

// Simple generator function creator: yields counts 0 through 5
function* createGenerator(userId) {
  let count = 0;
  while (count <= 5) {
    yield `${userId}'s count is ${count++}`;
  }
}

// Route to create a generator for a user, or advance an existing one
app.get('/:userId', function (req, res) {
  const userId = req.params.userId;
  const entry = userGenerators[userId];
  if (entry) {
    const result = entry.generator.next();
    // Record completion here so cleanup never has to call next() itself
    entry.done = result.done;
    res.send(result);
  } else {
    userGenerators[userId] = { generator: createGenerator(userId), done: false };
    res.send(`Generator created for user ${userId}`);
  }
});

// Route to report and delete finished generators
app.get('/', function (req, res) {
  const finished = Object.keys(userGenerators).filter(userId => userGenerators[userId].done);
  finished.forEach(userId => delete userGenerators[userId]);
  res.send(finished);
});

app.listen(3000, function () {
  console.log('App listening on port 3000!');
});
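On why checking for completion needs care: calling `next()` on a generator always advances it, so a cleanup pass that probes `next().done` consumes a value from every unfinished generator. The sketch below demonstrates this with a small counter and shows one alternative, a wrapper that records `done` as a side effect of normal use. The names `counter` and `track` are hypothetical.

```javascript
// Small generator that yields 0, 1, 2 and then finishes.
function* counter() {
  let count = 0;
  while (count <= 2) {
    yield count++;
  }
}

const gen = counter();
const first = gen.next(); // normal use: value 0
const peek = gen.next(); // "just checking done" still consumes value 1
const afterPeek = gen.next(); // the user now sees value 2 and never saw 1

// Hypothetical wrapper: remembers whether the generator has finished,
// so cleanup code can read `state.done` without calling next().
function track(generator) {
  const state = { done: false };
  return {
    state,
    next() {
      const result = generator.next();
      state.done = Boolean(result.done);
      return result;
    },
  };
}
```

A cleanup route can then filter on `state.done` and delete finished entries without disturbing users who are still partway through their statements.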
markwhiting commented 11 months ago

We ran a pilot of this and show the results below:

[Image: untitled-10]

In short, it looks like we are getting reasonably strong differences with statement properties as design points, so it seems like this is a reasonable path forward.

Closing now and we can further consider details of how to set up the larger experiment with this.