Open OriolAbril opened 3 years ago
Hi! I'd like to try working on this.
That would be great @jessicakzhang! I have added a couple of suggestions based on a quick look over the notebook; I'll review more carefully once you submit a PR.
Let us know if you have any doubts while working on this.
Hi @jessicakzhang , are you still working on this issue? @OriolAbril would it be okay if I were to submit a PR for this issue, considering the fact that this issue has already been assigned?
Hi, yes, as it has been more than two weeks with no activity, as indicated in the contributing guide, I'll assign the issue to you so you can submit a PR.
thanks a lot!
@OriolAbril had a doubt regarding updating np.exp(np.mean()) to np.mean(np.exp()) in cells 15 and 19. as far as I understood, az.summary returns certain statistics of which mean is one, and we are computing np.exp() for this summary dataframe. should this be changed to compute np.mean() of az.summary? I am unclear as to exactly where we should be applying np.exp() here.
also, while updating az.summary to use args rather than manually subsetting the dataframe, should we show all the default stats (mean, sd, hdi_3%, hdi_97%) for the variables or only a subset of mean, hdi_3%, hdi_97% as is being shown currently?
I used mean as a placeholder; in the notebook, summary is what acts as mean (as well as acting as hdi). Exponentiating should come first, then calling summary on the exponentiated data, not the other way around.
I think the default stats are good enough and simple; there is no need to overly complicate the notebook only to exclude sd from summary.
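To illustrate why the order matters, here is a minimal sketch using only the standard library. The draws are simulated, hypothetical values standing in for posterior samples of a log-scale coefficient; they are not taken from the notebook. Because exp is convex, the mean of the exponentiated draws is larger than the exponential of the mean whenever the draws vary.

```python
import math
import random
import statistics

# Hypothetical stand-in for posterior draws of a log-scale coefficient
random.seed(0)
draws = [random.gauss(1.0, 0.5) for _ in range(10_000)]

# Pattern to avoid: summarize first, exponentiate afterwards
exp_of_mean = math.exp(statistics.fmean(draws))

# Intended pattern: exponentiate every draw, then summarize
mean_of_exp = statistics.fmean([math.exp(d) for d in draws])

print(f"exp(mean(x)) = {exp_of_mean:.3f}")
print(f"mean(exp(x)) = {mean_of_exp:.3f}")
```

The same reasoning applies to np.exp(np.mean()) versus np.mean(np.exp()), and to any other summary statistic computed before versus after the transformation.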
oh okay, thanks for clarifying.
so when I do the exponentiation on inf_fish, which is an InferenceData object, I should first convert it to a data frame and exponentiate that data frame, and only then create a summary for it. have I understood this correctly? but az.summary takes an InferenceData object, so what would be the best way to exponentiate inf_fish here?
You should exponentiate the posterior samples, which are a group in inferencedata, in the form of an xarray dataset. It should look something like: az.summary(np.exp(idata.posterior), ...)
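A slightly fuller sketch of that call, assuming the pre-1.0 ArviZ API (az.from_dict with a posterior keyword). The idata object and the variable name b are hypothetical stand-ins built from simulated draws, not the notebook's inf_fish.

```python
import arviz as az
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior: 4 chains x 500 draws of a log-scale coefficient "b"
idata = az.from_dict(posterior={"b": rng.normal(1.0, 0.5, size=(4, 500))})

# Exponentiate the posterior draws first, then summarize them
summary_exp = az.summary(np.exp(idata.posterior), kind="stats")

# For comparison only: summarizing first and exponentiating the table afterwards
exp_of_summary = np.exp(az.summary(idata, kind="stats"))

print(summary_exp)
print(exp_of_summary)
```

np.exp works directly on the xarray Dataset in idata.posterior, so no conversion to a data frame is needed before calling az.summary.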
yes, that works, thanks a lot!
in cell 19, I think there's a typing error. the code in cells 15 and 19 is identical and calls a summary for the same variable inf_fish, which was generated using the manual model. cell 19 should be displaying the summary for the model results created with glm.from_formula and which are stored in the variable inf_fish_alt. can you confirm this?
also, should the data in the markdown cell after cell 15 be altered to match the new mean and hdi that we see in the summary? the values are only very slightly off
in cell 19, I think there's a typing error. the code in cells 15 and 19 is identical and calls a summary for the same variable inf_fish, which was generated using the manual model. cell 19 should be displaying the summary for the model results created with glm.from_formula and which are stored in the variable inf_fish_alt. can you confirm this?
yes, it is definitely a typo.
also, should the data in the markdown cell after cell 15 be altered to match the new mean and hdi that we see in the summary?
I would update it to avoid confusing readers
Needs to be updated to use bambi instead of glm module
will be working on it @OriolAbril
I'm about to update this to v4
File: https://github.com/pymc-devs/pymc-examples/blob/main/examples/generalized_linear_models/GLM-poisson-regression.ipynb Reviewers:
Known changes needed
Changes listed in this section should all be done at some point in order to get this notebook to a "Best Practices" state. However, these are probably not enough! Make sure to thoroughly review the notebook and search for other updates.
General updates
np.exp(np.mean()) is used instead of np.mean(np.exp()).
ArviZ related
Use kind="stats" or customize summary, examples of both at: https://arviz-devs.github.io/arviz/api/generated/arviz.summary.html
Notes
Exotic dependencies
None
Computing requirements
Models sample in less than a minute