Status: Open. akarlinsky opened this issue 2 years ago.
This is a great suggestion. I'll work on putting this together when I get a chance.
+1, I'm having trouble understanding the output in the readme example.
I added another example file, examples/interpreting_output.R, in https://github.com/rgiordan/zaminfluence/pull/37. I'd be interested to hear if it's helpful.
Perhaps a standalone function to produce tables similar to those in the paper would still be helpful.
Nice, this was helpful! A standalone function would also be helpful, since otherwise I would end up writing one myself.
Note a small typo on line 70: 'for exmample'.
Also, the intuition behind "large residuals and large |x1|" is that an observation is highly influential when it has both a large residual and high leverage in x1, right? It might be helpful to make that explicit.
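To make that intuition concrete, here is a minimal base-R sketch on simulated data. It does not call zaminfluence at all; the data frame `df` and every variable name below are made up for illustration. For plain OLS, the exact change in a coefficient from dropping observation i is the corresponding element of (X'X)^{-1} x_i e_i / (1 - h_i), where e_i is the residual and h_i the leverage, so it is large only when the residual and the regressor term are both large.

```r
# Simulated data (an assumption for illustration, not the README example).
set.seed(1)
n <- 100
df <- data.frame(x1 = rnorm(n))
df$y <- 0.5 * df$x1 + rnorm(n)
fit <- lm(y ~ x1, data = df)

# Brute force: refit without observation i and record the x1 coefficient.
loo <- sapply(seq_len(n), function(i) {
  unname(coef(lm(y ~ x1, data = df[-i, ]))["x1"])
})
# Full-sample estimate minus leave-one-out estimate.
exact <- unname(coef(fit)["x1"]) - loo

# Closed form: x1 row of (X'X)^{-1} X', times residual, over (1 - leverage).
X <- model.matrix(fit)
e <- residuals(fit)
h <- hatvalues(fit)
proj <- solve(crossprod(X)) %*% t(X)
closed_form <- unname(proj["x1", ] * e / (1 - h))

# The two vectors should agree up to floating-point error.
all.equal(exact, closed_form)
```

With an intercept in the model, the x1 row of `proj` is proportional to x1 minus its mean, so "large |x1|" here really means x1 far from its sample mean.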
# For example, you can graph the reruns and predictions versus one another like so:
ggplot(summary_df) +
  geom_point(aes(x = prediction, y = rerun, color = param_name, shape = metric)) +
  geom_abline(aes(slope = 1, intercept = 0))
This graph isn't very clear, because the scale is so different for x1 and x2.
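One possible fix, sketched below, is to give each parameter its own panel with free axis scales. The `summary_df` here is a hypothetical stand-in with made-up numbers and made-up `metric` labels; only the column names are taken from the snippet above, and the real data frame produced by the example file may look different.

```r
library(ggplot2)

# Hypothetical stand-in for summary_df; all values are invented.
summary_df <- data.frame(
  param_name = rep(c("x1", "x2"), each = 4),
  metric     = rep(c("metric_a", "metric_b"), times = 4),
  prediction = c(0.10, 0.12, 0.09, 0.11, 10.0, 12.0, 9.0, 11.0),
  rerun      = c(0.11, 0.12, 0.08, 0.12, 10.5, 11.8, 9.2, 11.1)
)

# One panel per parameter, each with its own x and y range,
# so the x1 points are not squashed by the much larger x2 values.
p <- ggplot(summary_df) +
  geom_point(aes(x = prediction, y = rerun, shape = metric)) +
  geom_abline(slope = 1, intercept = 0) +
  facet_wrap(~ param_name, scales = "free")
p
```

The color aesthetic is dropped because the panels already separate the parameters. The 45-degree line is drawn in every panel, so agreement between prediction and rerun can still be read off per parameter.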
The arXiv paper shows the results of the zaminfluence analysis for several papers, usually as a table reporting the number and share of observations dropped.
The example in the README just shows some plots, which I'm not really sure how to interpret. Adding an "annotated output" for the example via table and/or explaining what the plots show would be really great IMO.