Closed GStechschulte closed 1 year ago
A demo is shown below:
import bambi as bmb
from bambi.plots import slopes, plot_slopes
data = bmb.load_data("mtcars")
model = bmb.Model("mpg ~ hp * wt + drat", data=data, family="gaussian")
idata = model.fit(draws=1000, chains=2)
Unit level slopes
slopes(
model,
idata,
wrt="hp",
conditional=None
).head(10)
term | estimate_type | value | drat | wt | estimate | lower_3.0% | upper_97.0% |
---|---|---|---|---|---|---|---|
hp | dydx | (110.0, 110.0001) | 3.90 | 2.620 | -0.046077 | -0.061826 | -0.027424 |
hp | dydx | (110.0, 110.0001) | 3.90 | 2.875 | -0.039438 | -0.055346 | -0.024399 |
hp | dydx | (93.0, 93.0001) | 3.85 | 2.320 | -0.053887 | -0.074023 | -0.033815 |
hp | dydx | (110.0, 110.0001) | 3.08 | 3.215 | -0.030586 | -0.045647 | -0.016320 |
hp | dydx | (175.0, 175.0001) | 3.15 | 3.440 | -0.024728 | -0.040884 | -0.009271 |
hp | dydx | (105.0, 105.0001) | 2.76 | 3.460 | -0.024207 | -0.040335 | -0.008568 |
hp | dydx | (245.0, 245.0001) | 3.21 | 3.570 | -0.021343 | -0.039777 | -0.006217 |
hp | dydx | (62.0, 62.0001) | 3.69 | 3.190 | -0.031237 | -0.046218 | -0.016832 |
hp | dydx | (95.0, 95.0001) | 3.92 | 3.150 | -0.032278 | -0.046864 | -0.017400 |
hp | dydx | (123.0, 123.0001) | 3.92 | 3.440 | -0.024728 | -0.040884 | -0.009271 |
Average slopes
slopes(
model,
idata,
wrt="hp",
conditional=None,
average_by=True
)
term | estimate_type | estimate | lower_3.0% | upper_97.0% |
---|---|---|---|---|
hp | dydx | -0.030527 | -0.051032 | -0.01007 |
Average by group slopes (and plot)
fig, ax = plot_slopes(
model,
idata,
wrt="hp",
conditional=None,
average_by="wt"
)
fig.set_size_inches(7, 3)
Condition on weight and plot elasticity
fig, ax = plot_slopes(
model,
idata,
wrt="hp",
conditional="wt",
slope="eyex"
)
fig.set_size_inches(7, 3)
User provided values
fig, ax = plot_slopes(
model,
idata,
wrt="hp",
conditional={"wt": [2, 3, 5], "drat": [3.5, 4, 4.5]},
)
fig.set_size_inches(7, 3)
Providing a quick review. I couldn't find big issues with the code, it's already almost done. Still, I'm requesting some changes (stylistic and minor updates). As always, fantastic work. Thanks a lot!
Edit It would be good to explain what is the
value
column in the returned data frame
Thanks a lot for the review. :) Much appreciated. In the latest commits I realised I forgot to explain what the value column is. I will add this as an inline comment and also explain it in the docs.
To Do:
slopes
only supports eyex
): eydx
and dyex
The latest commits added the following functionality:
eydx
, dyex
)wrt
and added tests for plot_slopes
. Below, I provide an example of added functionality (using the model and data from above):
User provided multiple values with wrt
. The value
column represents the values of the wrt
arg. used to compute the derivative. If the user passes multiple values, a small amount $\epsilon$ eps
is added to each value and then divided by that eps
to obtain the "instantaneous rate of change":
slopes(
model,
idata,
wrt={"hp": [150, 200, 250]},
conditional=["wt"]
)
term | estimate_type | value | wt | drat | estimate | lower_3.0% | upper_97.0% |
---|---|---|---|---|---|---|---|
hp | dydx | (150.0, 150.0001) | 1.513000 | 3.596563 | -0.074025 | -0.104541 | -0.041595 |
hp | dydx | (200.0, 200.0001) | 1.513000 | 3.596563 | -0.072006 | -0.102318 | -0.041779 |
hp | dydx | (250.0, 250.0001) | 1.513000 | 3.596563 | -0.069988 | -0.099388 | -0.040974 |
eydx
: unit increase in $x$ (wrt
) is associated with a % change in $y$:
plot_slopes(
model,
idata,
wrt={"hp": 150},
conditional=["wt"],
slope="eydx"
)
dyex
: % change in $x$ (wrt
) is associated with a unit increase in $y$:
fig, ax = plot_slopes(
model,
idata,
wrt={"hp": 150},
conditional=["wt"],
slope="dyex"
)
Merging #699 (541621b) into main (cbbf955) will increase coverage by
0.70%
. The diff coverage is90.00%
.
@@ Coverage Diff @@
## main #699 +/- ##
==========================================
+ Coverage 88.87% 89.58% +0.70%
==========================================
Files 43 43
Lines 3362 3523 +161
==========================================
+ Hits 2988 3156 +168
+ Misses 374 367 -7
Files Changed | Coverage Δ | |
---|---|---|
bambi/plots/plotting.py | 84.96% <87.01%> (-3.82%) |
:arrow_down: |
bambi/plots/effects.py | 89.24% <87.89%> (+11.08%) |
:arrow_up: |
bambi/plots/utils.py | 86.76% <93.22%> (+5.91%) |
:arrow_up: |
bambi/plots/__init__.py | 100.00% <100.00%> (ø) |
|
bambi/plots/create_data.py | 100.00% <100.00%> (+24.39%) |
:arrow_up: |
... and 1 file with indirect coverage changes
:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more
Hi @GStechschulte! I'm requesting some changes. In terms of code, very minor things. But in general, I'm asking to add many docstrings.
Thanks for the review! In regard to docstrings, I feel I must explain myself. The reason for me not adding docstrings to non-public modules was because I was following the PEP 8 section on Documentation Strings:
Write docstrings for all public modules, functions, classes, and methods. Docstrings are not necessary for non-public methods, but you should have a comment that describes what the method does. This comment should appear after the "def" line.
Nonetheless, I recognise that there are docstrings for non-public modules in the Bambi codebase, and that they help in explainability. I will add them, it is no problem 😄
Hi @GStechschulte! I'm requesting some changes. In terms of code, very minor things. But in general, I'm asking to add many docstrings.
Thanks for the review! In regard to docstrings, I feel I must explain myself. The reason for me not adding docstrings to non-public modules was because I was following the PEP 8 section on Documentation Strings:
Write docstrings for all public modules, functions, classes, and methods. Docstrings are not necessary for non-public methods, but you should have a comment that describes what the method does. This comment should appear after the "def" line.
Nonetheless, I recognise that there are docstrings for non-public modules in the Bambi codebase, and that they help in explainability. I will add them, it is no problem 😄
I agree it's not necessary to add docstrings to everything, especially when it starts to become redundant. However, some internal documentation is also nice to help others understand how things work. Thanks for the flexibility in adding the requested docstrings :)
This draft PR introduces
slopes
andplot_slopes
. Slopes can be defined as the "partial derivative of the regression equation with respect to (wrt) a regressor of interest" (definition taken from marginaleffects).slopes
andplot_slopes
allow the user to view a summary dataframe and or plot the slope of the regressor of interest (wrt). This PR supports the following quantities of interest:Additionally, this PR also supports the interpretation of slopes as an elasticity (percent change in $x$ is associated with a percentage change in $y$). If the regressor of interest is categorical, computing an elasticity is not possible. For example, there is no % change in penguin species. Furthermore, if
wrt
is categorical when callingslopes
, thencomparisons
is called to compute the difference in contrast means.slopes
andcomparisons
are closely related. Thus, I have refactored the code such that these two functions can use the same code when building the summary dataframe and plotting. Lastly, I simplified the code to build the summary data frames.