py-why / dodiscover

[Experimental] Global causal discovery algorithms
https://www.pywhy.org/dodiscover/
MIT License

[RFC] Tracking usage of causal context through causal algorithms #112

Open adam2392 opened 1 year ago

adam2392 commented 1 year ago

Is your feature request related to a problem? Please describe. Per many of PyWhy's Discord discussions, we are interested in tracking causal assumptions over the longer term across the causal inference pipeline: causal discovery, causal identification, causal estimation, and counterfactual analysis.

There are currently LOADS of "causal assumptions" that we could add. If we're not careful, we end up with feature creep and really clunky Context and ContextBuilder objects in dodiscover that have untested edge cases. One way to prevent this, and to track causal assumptions, is to be more transparent about how assumptions are used within the Context class: give each algorithm that is passed "data" and a "Context" a flag it can set dynamically, which records whenever a specific piece of the "Context" is accessed. This way we can track usage of causal assumptions, and alongside the graph result we also return a context result. For example:

context = make_context(...)
# perhaps a pretty print of causal assumptions
print(context.causal_assumptions())

learner = PC(FisherZCITest)
learner.fit(data, context)
graph = learner.graph_

# this is the context to interpret the learned graph
result_context = learner.result_context_

# BY DEFAULT, we do not carry over causal assumptions that were not used
# So this would be a pretty print of all the causal assumptions used
# If we did not leverage context metadata in the PC algorithm, then this will be pretty empty
print(result_context.causal_assumptions())

Describe the solution you'd like We need to add a result_context_ to all causal discovery algorithms.

We also need to add a way of tracking calls to Context (possibly with a metaclass). We essentially want a class decorator that allows us to track when each property is accessed. So something like:

@decorate_context
class Context:
    # we want a private flag that lets us determine when to "start tracking usage of context",
    # because someone might build a context and inspect it in their Jupyter notebook,
    # and we don't want to track that
    _track_flag: bool = False


class PC:
    ...

    def fit(self, data, context):
        # in the PC algorithm, when context object properties are accessed, we turn on `_track_flag`
        context._track_flag = True
        init_graph = context.init_graph
        fixed_edges = context.fixed_edges
        self.learn_skeleton(init_graph, fixed_edges, ...)

Now, when the PC algorithm finishes running, we can inspect result_context_ and see that init_graph and fixed_edges were the "causal assumptions" used in constructing the learned graph.
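To make the tracking mechanism concrete, here is a rough sketch of what decorate_context could look like when applied to a minimal Context; the _accessed set and the exact hook are illustrative, not a settled design:

def decorate_context(cls):
    # wrap attribute access so that, while an algorithm has switched tracking on,
    # every public attribute that is read gets recorded on the instance
    original_getattribute = cls.__getattribute__

    def tracked_getattribute(self, name):
        value = original_getattribute(self, name)
        if not name.startswith("_") and original_getattribute(self, "_track_flag"):
            original_getattribute(self, "_accessed").add(name)
        return value

    cls.__getattribute__ = tracked_getattribute
    return cls


@decorate_context
class Context:
    _track_flag: bool = False

    def __init__(self, init_graph=None, fixed_edges=None):
        self._accessed = set()
        self.init_graph = init_graph
        self.fixed_edges = fixed_edges

After fit sets _track_flag = True and reads init_graph and fixed_edges, context._accessed would be {"init_graph", "fixed_edges"}, which is exactly what result_context_ needs to report back to the user.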

Having all causal assumptions "bottlenecked" through a Context-like object may also help us "choose" the optimal learners and/or CI tests. E.g. if we add Context.linearity = True, meaning we are learning a linear SCM graph, and Context.noise == <anything non-Gaussian>, we might be able to offer the LiNGAM learner as a default.
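As a toy illustration of that kind of dispatch (Context.linearity and Context.noise are hypothetical fields here, and the returned names are just labels):

def suggest_learner(context):
    # purely illustrative: dispatch on hypothetical Context attributes
    if getattr(context, "linearity", False) and getattr(context, "noise", "gaussian") != "gaussian":
        return "LiNGAM"  # linear SCM with non-Gaussian noise
    return "PC"  # otherwise fall back to a constraint-based learner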

The con is that this Context-like object becomes something of a melting pot of different causal assumptions, many of which may never be used by a given learner (e.g. the PC algorithm will probably not make use of the noise assumption).

Things to be cautious about This does not circumvent the problem of "how to use" the causal assumptions within each algorithm; it simply tracks whether or not the algorithm uses them. For example, there is no known complete algorithm for learning a valid Markov equivalence class under general background knowledge of fixed edges.

Describe alternatives you've considered Alternatively, each learner class lists the set of causal assumptions it uses, and those are fed into a "result context-like" object. These don't necessarily need any structure to them, and it is then more transparent to a developer which causal assumptions are used in any learner class. This also has the advantage that everything is explicitly spelled out and the Context object isn't overloaded with a bunch of properties that may never be used by other learners. A rough sketch of this alternative is below.
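For instance, a learner could simply declare the Context pieces it consumes (the names here are illustrative, not a proposed API):

class PC:
    # explicit, developer-facing list of the Context pieces this learner reads
    used_assumptions = ("init_graph", "fixed_edges")

    def fit(self, data, context):
        ...
        # the "result context-like" object is then just the declared subset
        self.result_context_ = {
            name: getattr(context, name) for name in self.used_assumptions
        }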

Additional context I am currently adding the Psi-FCI algorithm to dodiscover and find the Context object very clunky to use, so I am playing around w/ small design changes. That got me thinking about a larger overall design change that also tackles some of the problems we've been discussing.

xref other issues: #104 #62 #70

I believe this can be generalized to dowhy identification and estimation procedures, but am not as familiar w/ the internals there. cc: @amit-sharma @robertness @emrekiciman @bloebp @petergtz @jaron-lee @darthtrevino for comments.

jaron-lee commented 1 year ago

+1 for the tracking idea, that seems very useful. I think having one place to put all the causal assumptions is reasonable and would help an analyst keep track of what they are assuming in a particular problem.

Another con regarding the "melting pot" of assumptions is that it would be quite difficult to know in general if all of the causal assumptions specified are even compatible or consistent with one another. I guess this would fall on the analyst to ensure that they have a valid assumption set.

emrekiciman commented 1 year ago

Tracking usage is a nice idea, @adam2392 . I like it.

Does it make sense to make the Context class less structured and give a common interface to every assumption? I'm not sure how to explain it without code or pseudocode... excuse my pseudocode syntax. What do you think of:

BaseAssumption {
   AssertedOrConjectured // some assumptions might be asserted up front by the human, others might be conjectured by an analysis procedure and require attention and testing later.
   IsUsed;  // was used in the current analysis
   TestResults;  // we might have results of 1 or more tests (against data, other sources of experimental knowledge, etc.)
   SensitivityAnalysisResults // we might have run sensitivity analyses.
}

and then every assumption (about fixed edges, linearity, monotonicity, additivity, etc.) subclasses BaseAssumption

LinearityAssumption(BaseAssumption) {
    IsLinear
}

Context {
   DictionaryOfAllAssumptions

   ...
}

Then the Context class becomes a dictionary or bag of these assumptions (or a wrapper around a dictionary to help with bookkeeping). As we add new algorithms that make new assumptions, we can co-locate the code for the algorithm, its particular assumptions, tests of those assumptions, etc. together, without having to modify the Context class and all of the code that uses it.
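In Python this bag-of-assumptions idea might look roughly like the following (the dataclasses and field names are illustrative; the check_consistency hook is a placeholder for the kind of compatibility checking @jaron-lee raised):

from dataclasses import dataclass, field

@dataclass
class BaseAssumption:
    asserted: bool = True   # asserted up front by the human vs. conjectured by an analysis procedure
    is_used: bool = False   # was used in the current analysis
    test_results: list = field(default_factory=list)         # results of tests against data, experimental knowledge, ...
    sensitivity_results: list = field(default_factory=list)  # results of sensitivity analyses

    def check_consistency(self, context):
        # per-assumption hook to inspect the rest of the context for contradictions;
        # empty for most assumptions to start with
        return True

@dataclass
class LinearityAssumption(BaseAssumption):
    is_linear: bool = True

@dataclass
class Context:
    # a bag/dictionary of assumptions keyed by name; new algorithms can register
    # their own assumption subclasses without touching this class
    assumptions: dict = field(default_factory=dict)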

@jaron-lee good point about consistency. Maybe we can add a consistency test function to each assumption that inspects the rest of the context for contradictions. Most assumptions would probably start off with an empty test, but if we see people misusing them, we will have a place to add code to catch those common errors, at least.

Also, if assumptions aren't used by an algorithm, we might be able to use some of them as validation checks afterwards. For example, if the user asserts a set of fixed-edge constraints and a particular CD algorithm ignores them, then we can have a refutation that looks at the fixed edges, checks that the CD algorithm actually found them, and warns if that assumption/assertion is being violated.
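A post-hoc refutation for the fixed-edge case could be as simple as the following sketch (assuming a networkx-style graph with has_edge; the function name is made up):

import warnings

def refute_fixed_edges(graph, fixed_edges):
    # warn if an asserted fixed edge is missing from the learned graph,
    # e.g. because the CD algorithm ignored the constraint
    missing = [edge for edge in fixed_edges if not graph.has_edge(*edge)]
    if missing:
        warnings.warn(f"Learned graph violates fixed-edge assertions: {missing}")
    return missing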

adam2392 commented 1 year ago

Another con regarding the "melting pot" of assumptions is that it would be quite difficult to know in general if all of the causal assumptions specified are even compatible or consistent with one another. I guess this would fall on the analyst to ensure that they have a valid assumption set.

Agreed. I suppose clearly updating the documentation, adding consistency tests between common assumptions, and providing a user-friendly API (i.e. being able to pretty-print all the causal assumptions for easy inspection) is as good as we can get.

@jaron-lee good point about consistency. Maybe we can add a consistency test function to each assumption that inspects the rest of the context for contradictions. Most assumptions would probably start off with an empty test, but if we see people misusing them, we will have a place to add code to catch those common errors, at least.

True, I guess there is no free lunch here; we have to lose out on something. I think transparency and clear documentation are key if we go this route. Perhaps we can discuss this on Monday if the others get a chance to see this.

Also, if assumptions aren't used by an algorithm, we might be able to use some of them as validation checks afterwards. For example, if the user asserts a set of fixed-edge constraints and a particular CD algorithm ignores them, then we can have a refutation that looks at the fixed edges, checks that the CD algorithm actually found them, and warns if that assumption/assertion is being violated.

+1