Metaculus / metaculus

Implement range questions with discrete values #1233

Open malginin opened 1 day ago

malginin commented 1 day ago

For many range-type questions, the set of possible outcome values is discrete. Current examples: https://www.metaculus.com/questions/28845/how-many-of-the-q3-2024-top-16-will-be-in-top-16-in-q4-2024/ and https://www.metaculus.com/questions/29028/us-state-dept-global-arms-sales-approved-q4-2024/ (Here, the outcomes are integers, but often the resolution source contains numbers rounded to 1 or 2 decimal places, which also means the set of outcomes is discrete.)

Now that a prediction can have an unlimited number of components, more experienced forecasters increase their scores by building predictions out of dozens of narrow components centered on the possible discrete values. This takes a lot of time and disadvantages forecasters who are less familiar with the Metaculus interface and scoring system.

Desired behavior: For questions with discrete outcomes, use a probability mass function instead of a probability density function, and have each component represent a binomial distribution instead of a normal one.
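
To make the request concrete, here is a minimal sketch of what a forecast built from binomial components could look like. It is only an illustration: the function names, the `(weight, p)` component format, and the assumption that outcomes run from 0 to n are all hypothetical and not taken from the Metaculus codebase.

```python
from math import comb


def binomial_pmf(n: int, p: float) -> list[float]:
    """Probability of each outcome 0..n under Binomial(n, p)."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def mixture_pmf(n: int, components: list[tuple[float, float]]) -> list[float]:
    """Weighted mixture of binomial components.

    `components` is a list of (weight, p) pairs (a hypothetical format);
    weights are normalized so the result sums to 1.
    """
    total_weight = sum(weight for weight, _ in components)
    pmf = [0.0] * (n + 1)
    for weight, p in components:
        for k, mass in enumerate(binomial_pmf(n, p)):
            pmf[k] += (weight / total_weight) * mass
    return pmf


# Example: a question whose outcome is an integer from 0 to 16,
# forecast with two components of different weight.
forecast = mixture_pmf(16, [(0.7, 0.75), (0.3, 0.5)])
```

Each entry of `forecast` is the probability assigned to one discrete outcome, so no mass is spent on values that cannot occur. A question whose outcomes do not start at 0 or are not unit-spaced would need an extra mapping from index to outcome value.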

lsabor commented 15 hours ago

Interesting idea. I think an alternative here would be to add a setting on each Question that defines the number of bins to consider. For questions with only discrete outcomes (or let's just say fewer than 30 or so evenly spaced expected outcomes), the question writer would determine exactly how many "buckets" to have within the range.

To let you in on 2 secrets (not actually secrets, just little-known facts):

  1. The "pdf" is actually a "pmf" with 200 values (plus 2 additional values for outcomes below and above the bounds). For questions with a medium number of possible outcomes (too many for a Multiple Choice question), we could just lower that number.
  2. The API actually allows you to predict directly with a 201-point CDF, so you can push the boundaries even further when it comes to making an extremely precise forecast. This is definitely an advantage for those savvy enough to use the API, so we intend to make it a little easier to understand and eventually add more ways to forecast on continuous questions. We have just started supporting forecasts given as percentiles on continuous questions for the AI benchmarking tournament, but this is not yet documented.
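
As a rough sketch of how the two facts above fit together, the snippet below drops discrete outcome probabilities into 200 evenly spaced buckets and accumulates them into a 201-point CDF. The function names, the bucket-boundary convention, and the linear scaling between the bounds are assumptions made for illustration. The real API may impose further requirements on the CDF that are not handled here, and the request format itself is not shown.

```python
from itertools import accumulate

N_BUCKETS = 200  # 200 in-range pmf values are bounded by 201 CDF points


def bucket_masses(
    outcome_probs: dict[float, float],
    lower: float,
    upper: float,
    n_buckets: int = N_BUCKETS,
) -> list[float]:
    """Assign each discrete outcome's probability to the bucket containing it.

    Assumes evenly spaced buckets on a linear scale between `lower` and `upper`.
    """
    width = (upper - lower) / n_buckets
    masses = [0.0] * n_buckets
    for value, prob in outcome_probs.items():
        if not lower <= value <= upper:
            raise ValueError(f"outcome {value} is outside [{lower}, {upper}]")
        # Clamp so that an outcome equal to `upper` lands in the last bucket.
        index = min(int((value - lower) / width), n_buckets - 1)
        masses[index] += prob
    return masses


def cdf_from_masses(masses: list[float], below: float = 0.0) -> list[float]:
    """Running total of bucket masses, starting from the mass below the range."""
    return list(accumulate(masses, initial=below))


# Example: integer outcomes 0..16, all equally likely, on a range of [0, 16].
probs = {float(k): 1 / 17 for k in range(17)}
cdf = cdf_from_masses(bucket_masses(probs, lower=0.0, upper=16.0))
```

Lowering `n_buckets` to roughly the number of possible outcomes is one way to picture the per-question setting suggested above.
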

lsabor commented 12 hours ago

Assigning Sylvain to this just for visibility so we can discuss the possibility.