zenna / CausalDiscovery.jl


Design Document for CISC #6

Closed zenna closed 4 years ago

zenna commented 4 years ago

Design document for CISC

In short: This is a corpus of interactive programs designed to evaluate a system's ability to construct complex causal models. Instances in this corpus have complex functional and causal relationships.

Comparison to Existing Benchmarks

Overview

A set of n_train training problems and m_test testing problems.


Queries

The agent interacts with the world by performing imperatives. Imperatives are expressions in a language. An example of an imperative might be "press the red button".
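As a rough illustration of imperatives as expressions in a language, here is a minimal Python sketch. The names `Press`, `Seq`, and `World`, and the rule that the red button toggles a light, are all assumptions made up for this example; none of them come from the CISC design itself.

```python
from dataclasses import dataclass, field

# Hypothetical imperative language: each imperative is a small expression object.
@dataclass(frozen=True)
class Press:
    color: str  # which button to press

@dataclass(frozen=True)
class Seq:
    steps: tuple  # imperatives to perform in order

@dataclass
class World:
    # Assumed toy world: a light that toggles whenever the red button is pressed.
    light_on: bool = False
    log: list = field(default_factory=list)

    def perform(self, imperative):
        """Interpret an imperative and return the resulting observation."""
        if isinstance(imperative, Press):
            self.log.append(imperative.color)
            if imperative.color == "red":
                self.light_on = not self.light_on
        elif isinstance(imperative, Seq):
            for step in imperative.steps:
                self.perform(step)
        else:
            raise TypeError(f"unknown imperative: {imperative!r}")
        return {"light_on": self.light_on}

world = World()
obs1 = world.perform(Press("red"))
obs2 = world.perform(Seq((Press("blue"), Press("red"))))
```

Keeping imperatives as data rather than as direct method calls means the same expression can be logged, composed, or replayed against a different world.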

Open questions:

Experiments vs Actions

What are the important distinguishing factors between an experiment and the actions an experimenter can take? For instance, if we want to learn the workings of the blicket machine, an experiment is something like "put the red square on the box". Whether this corresponds to an action or not depends on the action space. Suppose the action space consists of moving a mouse cursor and clicking. Then there is a significant difference between the actions you can take and the experiment.
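To make the gap concrete, the sketch below treats the experiment "put the red square on the box" as a predicate over scene states, while the action space contains only cursor moves and clicks. The `Scene` layout, the coordinates, and the pick-up/drop behaviour of `click` are assumptions invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Scene:
    cursor: tuple = (0, 0)
    red_square: tuple = (2, 1)
    box: tuple = (5, 5)
    holding: bool = False

def move(scene, x, y):
    """Low-level action: move the cursor; a held object moves with it."""
    scene.cursor = (x, y)
    if scene.holding:
        scene.red_square = (x, y)

def click(scene):
    """Low-level action: pick up the square under the cursor, or drop it."""
    if scene.holding:
        scene.holding = False
    elif scene.cursor == scene.red_square:
        scene.holding = True

def red_square_on_box(scene):
    """The experiment, stated as a goal: a predicate over scene states."""
    return (not scene.holding) and scene.red_square == scene.box

scene = Scene()
# One of many action sequences that realises the experiment.
plan = [(move, (2, 1)), (click, ()), (move, (5, 5)), (click, ())]
for action, args in plan:
    action(scene, *args)
achieved = red_square_on_box(scene)
```

Here the experiment is a single predicate, whereas realising it takes four low-level actions, and many other action sequences would satisfy the same predicate.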

In this setting, an experiment seems more like a goal state. It's tempting to suggest it's a set of possible worlds, but then we might lose the distinction between an intervention/action and an observation. For instance, suppose we want to distinguish whether smoking -> cancer or cancer -> smoking. An experiment might be something like "force someone to smoke".
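The smoking example can be simulated in a few lines. The sketch below is illustrative only: the probabilities are made up, and they are chosen so that both causal directions induce the same observational joint distribution over (smoking, cancer). Only the intervention, passed through the assumed `do_smoking` argument, tells the two models apart.

```python
import random

MODELS = ("smoking->cancer", "cancer->smoking")

def sample(model, rng, do_smoking=None):
    """Draw one (smoking, cancer) pair; do_smoking forces smoking when not None."""
    if model == "smoking->cancer":
        smoking = (rng.random() < 0.3) if do_smoking is None else do_smoking
        cancer = rng.random() < (0.6 if smoking else 0.1)
    elif model == "cancer->smoking":
        cancer = rng.random() < 0.25
        natural = rng.random() < (0.72 if cancer else 0.16)
        # Forcing smoking overrides its mechanism but leaves cancer untouched.
        smoking = natural if do_smoking is None else do_smoking
    else:
        raise ValueError(model)
    return smoking, cancer

def cancer_rate(model, do_smoking, n=20000, seed=0):
    """Estimate P(cancer | do(smoking = do_smoking)) by simulation."""
    rng = random.Random(seed)
    return sum(sample(model, rng, do_smoking)[1] for _ in range(n)) / n

# Effect of forcing smoking on the cancer rate, under each hypothesis.
effects = {m: cancer_rate(m, True) - cancer_rate(m, False) for m in MODELS}
```

Under "smoking->cancer" the forced experiment shifts the cancer rate substantially, while under "cancer->smoking" it leaves it unchanged, even though passive observation alone could not separate the two.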

Language

Should there be a DSL for these models? You could imagine a DSL, for instance, that specifies the number of objects in the scene and some dynamics. If you think of a game engine, there are some things that are just given, like physics, objects and object groups, and rendering.
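As one hypothetical shape such a DSL could take, the sketch below embeds it in Python: a scene is declared by its number of objects, and dynamics are given as a list of rules applied on each step. Every name here (`Obj`, `scene`, `drift`, `bounce`, `run`) is an assumption for illustration and not a proposed CISC syntax.

```python
from dataclasses import dataclass

@dataclass
class Obj:
    name: str
    x: int
    vx: int = 0

def scene(n_objects, spacing=1):
    """Declare n_objects placed along a line; placement is 'given' by the engine."""
    return [Obj(f"obj{i}", x=i * spacing) for i in range(n_objects)]

# Dynamics are rules: each rule maps the list of objects to an updated list.
def drift(objs):
    return [Obj(o.name, o.x + o.vx, o.vx) for o in objs]

def bounce(width):
    def rule(objs):
        # Reverse the velocity of any object that has left [0, width).
        return [Obj(o.name, o.x, -o.vx) if not 0 <= o.x < width else o
                for o in objs]
    return rule

def run(objs, rules, steps):
    """Engine loop: apply every rule, in order, once per step."""
    for _ in range(steps):
        for rule in rules:
            objs = rule(objs)
    return objs

world = scene(n_objects=2)
world[0].vx = 1
final = run(world, [drift, bounce(width=3)], steps=4)
```

The split mirrors the game-engine analogy: `run` and `scene` play the role of what the engine provides, and only the rule list is specific to a given problem instance.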

It could be something as simple as

Evaluation

There are a few approaches to consider:

Structural Similarity: In Bayesian networks this is some notion of graph similarity. It is tenuous there and even more tenuous in the case of programs, since there are many syntactically different programs that induce the same causal model.

Predictive Accuracy: How well the induced model predicts data

Counterfactual Accuracy: How well the induced model predicts counterfactual / interventional distributions
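The three criteria can be contrasted on a toy case. The sketch below is an assumption-laden illustration, not a proposed metric: models are Python functions of two binary noise variables with made-up probabilities, and distributions are compared by total variation distance. `rewritten_model` is syntactically different from `true_model` but induces the same causal model; `reversed_model` matches the observational data yet flips the causal direction.

```python
from itertools import product

# Assumed noise: a ~ Bernoulli(0.5), b ~ Bernoulli(0.1), independent.
NOISE = ([(0, 0.5), (1, 0.5)], [(0, 0.9), (1, 0.1)])

def true_model(u, do_x=None):
    a, b = u
    x = a if do_x is None else do_x
    return x, x ^ b            # y copies x, flipped when b == 1

def rewritten_model(u, do_x=None):
    a, b = u
    x = a if do_x is None else do_x
    return x, (x + b) % 2      # different syntax, same mechanism

def reversed_model(u, do_x=None):
    a, b = u
    y = a                      # y is now the root cause
    x = (y ^ b) if do_x is None else do_x
    return x, y

def distribution(model, do_x=None):
    """Exact distribution over (x, y), enumerating the finite noise space."""
    dist = {}
    for (a, pa), (b, pb) in product(*NOISE):
        outcome = model((a, b), do_x)
        dist[outcome] = dist.get(outcome, 0.0) + pa * pb
    return dist

def tv_distance(p, q):
    """Total variation distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def errors(candidate):
    """Return (predictive error, interventional error under do(x=1))."""
    predictive = tv_distance(distribution(true_model), distribution(candidate))
    interventional = tv_distance(distribution(true_model, do_x=1),
                                 distribution(candidate, do_x=1))
    return predictive, interventional

rewritten_errors = errors(rewritten_model)
reversed_errors = errors(reversed_model)
```

In this toy case a purely syntactic comparison would penalise `rewritten_model` even though it is causally identical, while predictive accuracy alone would accept `reversed_model`; only the interventional comparison separates them.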