matheusfacure / python-causality-handbook

Causal Inference for the Brave and True. A light-hearted yet rigorous approach to learning about impact estimation and causality.
https://matheusfacure.github.io/python-causality-handbook/landing-page.html
MIT License

Chapter 19 #234

Closed · brunaw · closed 2 years ago

brunaw commented 2 years ago

In Chapter 19 - Evaluating Causal Models, you propose a 'random model' that samples from a Unif(0, 1) distribution, but that doesn't match the scale of the problem (how can you compare something in [0, 1] with something on a completely different scale?). It would probably be clearer to sample from Unif(min(sales), max(sales)), e.g.:

import numpy as np

np.random.seed(123)
prices_rnd_pred = prices_rnd.assign(**{
    "elast_m_pred": predict_elast(m1, prices_rnd),     ## elasticity model
    "pred_m_pred": m1.predict(prices_rnd[X]),          ## predictive model
    "rand_m_pred": np.random.uniform(low=min(prices_rnd["sales"]),
                                     high=max(prices_rnd["sales"]),
                                     size=prices_rnd.shape[0]),   ## random model
})

because that would be on the same scale as the data. Does that make sense?

Jayzhaowj commented 2 years ago

Following up on this issue: is there a typo in the book? Should it be {"pred_m_pred": m2.predict(prices_rnd[X])} instead of m1.predict(prices_rnd[X])? Thank you.

matheusfacure commented 2 years ago

Hi, @brunaw. The evaluation here is invariant to scale. It orders the observations by the score and builds the evaluation on that ordering, not on the raw score values.
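
To illustrate the point, here is a minimal sketch (not the book's code; the sample size and rescaling numbers are made up): any positive monotone rescaling of the random scores, such as mapping Unif(0, 1) draws onto the sales range, leaves the ordering of the observations unchanged, so an evaluation built on that ordering comes out identical.

import numpy as np

np.random.seed(123)
n = 1000
rand_01 = np.random.uniform(0, 1, size=n)   # random scores on [0, 1]
rand_rescaled = 50 + 150 * rand_01          # same draws mapped onto a hypothetical sales range

# Both score vectors induce exactly the same ordering of the observations...
assert (np.argsort(rand_01) == np.argsort(rand_rescaled)).all()

# ...so any curve computed over the units ranked by score (e.g. the cumulative
# elasticity curve used in the chapter) is the same for both random models.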

matheusfacure commented 2 years ago

@Jayzhaowj , you are correct. It was a typo. Thanks!