vincentarelbundock commented 1 year ago

Step 1: `build_plot()` function

[x] Function accepts two arguments: model and condition
[ ] Use get_modeldata() to extract the original data used to fit the model. Call it modeldata.
[x] Use assert to make sure that the condition argument conforms to the required types, and return a helpful message if it doesn't:
  - [x] List of strings of length 1 to 3 where each string is a column name in modeldata.columns
  - [ ] Dict of length between 1 and 3 where each key is a column name in modeldata.columns
[x] If condition is a list of strings, use utils/get_variable_type() to determine the type of each variable:
  - Numeric:
    - If the first variable in condition is numeric, take 100 equally spaced points between the min and max.
    - If the 2nd or 3rd variable in condition is numeric, take Tukey's five-number summary: https://en.wikipedia.org/wiki/Five-number_summary
  - Boolean, Character:
    - Take all unique values (ex: modeldata["variablename"].unique()). If there are more than 10 unique values, use assert to raise an informative error saying that this is not supported.
[x] If condition is a dict, take the values supplied explicitly by the user instead of computing our own summaries.
[x] Plug the values you just extracted for our 1, 2, or 3 variables into the datagrid() function to create a data frame.
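
To make the rules in Step 1 concrete, here is a minimal sketch of the value-selection logic. It is not the package's implementation: `condition_values()` and `is_numeric()` are hypothetical names, a plain dict of lists stands in for the model data frame so the example has no dependencies, and `is_numeric()` is a crude placeholder for utils/get_variable_type(). The quartile convention (`method="inclusive"`) is also an assumption; Tukey's hinges can differ slightly on small samples.

```python
from statistics import quantiles


def is_numeric(values):
    """Crude stand-in for utils/get_variable_type() (hypothetical helper)."""
    return all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )


def condition_values(modeldata, condition):
    """Return {variable name: values to plug into the grid} for 1 to 3 variables.

    `modeldata` is a dict mapping column names to lists of values; it stands
    in for the real data frame in this sketch.
    """
    assert isinstance(condition, (list, dict)), "`condition` must be a list or a dict."
    assert 1 <= len(condition) <= 3, "`condition` must have between 1 and 3 elements."
    for name in condition:
        assert name in modeldata, f"`{name}` is not a column of the model data."

    # A dict carries values chosen explicitly by the user: use them as given.
    if isinstance(condition, dict):
        return dict(condition)

    out = {}
    for position, name in enumerate(condition):
        column = modeldata[name]
        if is_numeric(column):
            low, high = min(column), max(column)
            if position == 0:
                # First variable: 100 equally spaced points from min to max.
                out[name] = [low + (high - low) * k / 99 for k in range(100)]
            else:
                # 2nd or 3rd variable: min, Q1, median, Q3, max.
                q1, median, q3 = quantiles(column, n=4, method="inclusive")
                out[name] = [low, q1, median, q3, high]
        else:
            # Boolean or character: every unique value, capped at 10.
            unique = sorted(set(column))
            assert len(unique) <= 10, (
                f"`{name}` has more than 10 unique values, which is not supported."
            )
            out[name] = unique
    return out
```

The returned dict has one entry per variable, which is the shape one would then pass on to datagrid() to build the data frame.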

Step 2: Pass `build_plot()` output to `predictions()`

[ ] Pass the model and the data frame you created to the newdata argument in the predictions function. In principle, this should give you a nice data frame of predictions.

Step 3: Pass the result to `seaborn` or `matplotlib`

[ ] Discuss with me which makes sense. Is there another package beyond those two?
[ ] Use the data frame of predictions to plot the results
  - [ ] The estimate column of the data frame from Step 2 is the Y-axis
  - [ ] Variable 1 in condition determines the variable on the X-axis
  - [ ] Variable 2 splits the predictions into lines of different colors
  - [ ] Variable 3 splits the plot into different facets (multiple small plots)
[ ] Numeric x-axis: Draw lines and ribbons
[ ] Character or boolean x-axis: Draw dots and ranges
[ ] Add a function called find_response() to the utils.py file. This extracts the name of the dependent variable as a string. I think that in statsmodels this is stored in model.model.endog_names or some similar attribute (endog is the dependent variable; exog holds the regressors).
[ ] Add nice labels to the y and x-axes using the dependent variable name and the X-axis variable
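
The mapping described in Step 3 can be sketched independently of the plotting library, which is still undecided. `plot_spec()` below is a hypothetical helper, not part of the package: it only records which variable plays which role and which geometry to draw, and leaves the actual drawing to whichever package is chosen. The `"numeric"` type label is an assumption about what utils/get_variable_type() returns.

```python
def plot_spec(condition, x_type, response):
    """Map 1 to 3 condition variables to plot roles and pick the geometry.

    condition: list of variable names, or dict keyed by variable name.
    x_type:    type of the first variable ("numeric" is an assumed label).
    response:  name of the dependent variable, e.g. from find_response().
    """
    variables = list(condition)  # works for both a list and a dict
    assert 1 <= len(variables) <= 3, "`condition` must name between 1 and 3 variables."

    # Variable 1 -> x-axis, variable 2 -> color, variable 3 -> facet.
    roles = dict(zip(("x", "color", "facet"), variables))

    # Numeric x-axis: lines and ribbons. Character or boolean: dots and ranges.
    if x_type == "numeric":
        geoms = ("line", "ribbon")
    else:
        geoms = ("point", "range")

    return {
        "y": "estimate",
        **roles,
        "geoms": geoms,
        "xlabel": variables[0],
        "ylabel": response,
    }
```

Keeping this decision in one small function means the same specification could be handed to either seaborn or matplotlib once that choice is made.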

Step 4: `newdata` and `by` arguments

This is an alternative to condition. These arguments cannot be used at the same time, and we need to use assert to raise informative errors if the user tries to do it.

Read the marginaleffects for R documentation and vignette and code to figure this out. Should be pretty straightforward, but ask me if you can't figure it out after 30 minutes of work.
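
The mutual-exclusion check in Step 4 could look roughly like this. It is a sketch only: `check_exclusive()` is a hypothetical name, and treating `None` as "argument not supplied" is an assumption about how the defaults will be set.

```python
def check_exclusive(condition=None, newdata=None, by=None):
    """Raise an informative error when `condition` is combined with `newdata` or `by`."""
    clashing = [
        name for name, value in (("newdata", newdata), ("by", by)) if value is not None
    ]
    assert condition is None or not clashing, (
        "The `condition` argument cannot be used together with "
        + " or ".join(f"`{name}`" for name in clashing)
        + ". Please supply one or the other."
    )
```

Calling `check_exclusive(condition=["x"], by="group")` raises an `AssertionError` whose message names the offending argument, while supplying only `condition`, or only `newdata` and `by`, passes silently.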

Step 5: Add the other arguments.

This is mainly a question of passing additional arguments to the predictions() call we used in Step 2.

Step 6: Repeat for `plot_slopes()` and `plot_comparisons()`.

The challenge here will be "Don't repeat yourself": how can we make as much of the code as possible reusable for all three types of plots?

The best way to approach this is to do it for plot_predictions(). Then we can refactor the code to make it work for all three.
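
One possible shape for that refactor, offered as a sketch rather than a decided design: keep the pipeline (grid, estimates, drawing) in one shared function and pass the estimator in as an argument. `plot_common()` and `make_plot_function()` are hypothetical names, and the sketch assumes the estimators for all three plot types accept the same `newdata` argument as the predictions() call in Step 2.

```python
def plot_common(estimator, model, condition, build_grid, draw, **kwargs):
    """Shared pipeline for all plot types.

    estimator:  function computing the estimates (predictions, slopes, ...).
    build_grid: function building the data frame from `condition` (Step 1).
    draw:       function turning the estimates into a plot (Step 3).
    kwargs:     extra arguments forwarded to the estimator (Step 5).
    """
    grid = build_grid(model, condition)
    estimates = estimator(model, newdata=grid, **kwargs)
    return draw(estimates, condition)


def make_plot_function(estimator, build_grid, draw):
    """Return a plotting function bound to one estimator."""

    def plot(model, condition, **kwargs):
        return plot_common(estimator, model, condition, build_grid, draw, **kwargs)

    return plot


# Hypothetical wiring once the pieces exist:
# plot_predictions = make_plot_function(predictions, build_plot, draw)
```

With this layout, only the estimator differs between the three public functions, so fixes to grid construction or drawing apply to all of them at once.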

vincentarelbundock commented 1 year ago

@LamAdr Do you need any clarification?

LamAdr commented 1 year ago

Hi @vincentarelbundock, I have a first attempt at step 1. I am not sure of the workflow. Should I push my local branch? I seem to lack the permission for that.

vincentarelbundock commented 1 year ago

Very nice!

Normally, you need to fork the repo (a fork is an independent copy of the repo under your own account), push your local branch to your own fork, and then open a "Pull Request". The GitHub docs are pretty good, I think:

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/about-forks

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request-from-a-fork

ChatGPT will almost surely give you terminal commands that are very close to what you need for this. These are super common operations...

vincentarelbundock / pymarginaleffects

Plot #24
