rgiordan / zaminfluence

Tools in R for computing and using Z-estimator approximate influence functions.
Apache License 2.0

Add output example interpretation #27

Open akarlinsky opened 2 years ago

akarlinsky commented 2 years ago

The arXiv paper shows the results of the zaminfluence analysis for several papers, usually reporting the number and share of observations dropped in a table.

The example in the README just shows some plots, which I'm not really sure how to interpret. Adding an "annotated output" for the example via table and/or explaining what the plots show would be really great IMO.

rgiordan commented 2 years ago

This is a great suggestion. I'll work on putting this together when I get a chance.

maswiebe commented 2 years ago

+1, I'm having trouble understanding the output in the README example.

rgiordan commented 2 years ago

I added another example file, examples/interpreting_output.R, in https://github.com/rgiordan/zaminfluence/pull/37. I'd be interested to hear whether it's helpful.

Perhaps a standalone function to produce tables similar to those in the paper would still be helpful.

maswiebe commented 2 years ago

Nice, this was helpful! A standalone function would also be helpful, since otherwise I would be writing one myself.

Note a small typo on line 70: 'for exmample'.

Also, the intuition for "large residuals and large |x1|" is that an observation is highly influential when it combines a large residual with high leverage in x1, right? Might be helpful to make that explicit.
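To make the residual-times-leverage intuition concrete, here is a minimal sketch for plain OLS using only base R. It is textbook regression algebra on simulated data, not necessarily how zaminfluence computes influence scores internally; every name in it (`df`, `infl`, `top`, and so on) is made up for illustration.

```r
# Simulated data; all names here are illustrative, not part of zaminfluence.
set.seed(42)
n <- 200
df <- data.frame(x1 = rnorm(n))
df$y <- 0.5 * df$x1 + rnorm(n)

fit <- lm(y ~ x1, data = df)
X <- model.matrix(fit)   # n x 2 design matrix: intercept and x1
e <- residuals(fit)      # OLS residuals

# First-order approximation to the change in the coefficients from dropping
# observation i: row i of infl is -(X'X)^{-1} x_i e_i.
XtX_inv <- solve(crossprod(X))
infl <- -(X * e) %*% XtX_inv
colnames(infl) <- colnames(X)

# For the x1 coefficient this is proportional to (x1_i - mean(x1)) * e_i,
# so it is large only when both the residual and |x1| are large.
top <- which.max(abs(infl[, "x1"]))

# Compare the approximation with an actual refit that drops that observation.
refit <- lm(y ~ x1, data = df[-top, ])
actual_change <- unname(coef(refit)["x1"] - coef(fit)["x1"])
approx_change <- unname(infl[top, "x1"])

# The exact leave-one-out change divides the approximation by (1 - h_i),
# where h_i is the leverage (hat value) of the dropped observation.
exact_change <- approx_change / (1 - unname(hatvalues(fit)[top]))
```

Printing `approx_change`, `exact_change`, and `actual_change` side by side shows how close the linear approximation is for the single most influential point in this toy data set.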

# For example, you can graph the reruns and predictions versus one another like so:
ggplot(summary_df) +
  geom_point(aes(x=prediction, y=rerun, color=param_name, shape=metric)) +
  geom_abline(aes(slope=1, intercept=0))

This graph isn't very clear, because the scale is so different for x1 and x2.
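One possible fix, sketched below, is to give each parameter its own panel with independent axes via `facet_wrap(..., scales = "free")`. The `summary_df` here is a mock stand-in with the same column names used in the snippet above; its values and the metric labels (`metric_a`, `metric_b`) are invented so that the example runs on its own.

```r
library(ggplot2)

# Mock stand-in for summary_df: x2 is on a much larger scale than x1.
# Values and metric labels are hypothetical, chosen only for illustration.
set.seed(1)
summary_df <- data.frame(
  param_name = rep(c("x1", "x2"), each = 6),
  metric = rep(c("metric_a", "metric_b"), times = 6),
  prediction = c(rnorm(6, mean = 0, sd = 1), rnorm(6, mean = 0, sd = 100))
)
# Pretend the reruns land close to, but not exactly on, the predictions.
summary_df$rerun <- summary_df$prediction * 1.05

# One panel per parameter, each with its own x and y scales, so the
# small-scale parameter is no longer squashed by the large-scale one.
p <- ggplot(summary_df) +
  geom_point(aes(x = prediction, y = rerun, shape = metric)) +
  geom_abline(slope = 1, intercept = 0) +
  facet_wrap(~ param_name, scales = "free")

print(p)
```

Because the panels already separate the parameters, the `color = param_name` aesthetic is dropped here; the 45-degree reference line is still drawn in every panel.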