pymc-devs / pymc

Bayesian Modeling and Probabilistic Programming in Python
https://docs.pymc.io/
Other
8.69k stars 2.01k forks source link

DOC: Typo in Introductory Overview of PyMC? #6860

Closed brendan-m-murphy closed 1 year ago

brendan-m-murphy commented 1 year ago

Issue with current documentation:

This line towards the end of the "Model Specification" section in the overview tutorial is a bit confusing:

If we wanted to use the slice sampling algorithm to sigma instead of NUTS (which was assigned automatically), we could have specified this as the step argument for sample.

Is this suppose to say "to sample" instead of "to sigma"?

Or maybe, is the example meant to show slice sampling for sigma only and NUTS for the other parameters? The example just uses step = pm.Slice(), so nothing special is happening to sigma here.

Idea or request for content:

Could someone who knows what was intended to be said clarify this sentence?

Also, it could be nice to see an example where Slice is used on sigma and NUTS is used on the other parameters. But if the example is left as is, maybe remove the reference sigma in the quoted sentence.

welcome[bot] commented 1 year ago

Welcome Banner :tada: Welcome to PyMC! :tada: We're really excited to have your input into the project! :sparkling_heart:
If you haven't done so already, please make sure you check out our Contributing Guidelines and Code of Conduct.

fonnesbeck commented 1 year ago

That's a typo, yes. Feel free to submit a PR if you have the time.

brendan-m-murphy commented 1 year ago

Sure, I should be able to get to it soon.

There are a few mismatches between the text and the code for the first case study: InverseGamma(1, 1) in the text but the parameters in the code are (1, 0.1), and for the local shrinkage prior the text says HalfStudentT_5, but the distribution in the code as 2 degrees of freedom.

Should I make the text match the code? This blog post has similar code and uses 5 degrees of freedom for the second HalfStudentT and (1, 1) for the inverse gamma: https://austinrochford.com/posts/2021-05-29-horseshoe-pymc3.html

Also sigma is mentioned after the formula for \beta_i but it doesn't appear in the formula. I would move this to after the definition of the prior for \tau.

thompsonjjet23 commented 1 year ago

@brendan-m-murphy I think this PR should fix the typo?

@fonnesbeck what is the review process for contributing to open source projects? I'm new here but want to be useful if I can :)

larryshamalama commented 1 year ago

Closed by #6925. Thanks @thompsonjjet23 for your help :)

brendan-m-murphy commented 1 year ago

Sorry I didn't see this was closed until today, but the PR didn't fix the typos regarding the horseshoe prior example.

thompsonjjet23 commented 1 year ago

ah my apologies @brendan-m-murphy ! i went back and looked in the pymc_overview.ipynb but I couldn't find anything when I searched for "horseshoe"

mind being a bit more specific about where the typo is that is still there?

i'm happy to fix it i just am not sure where to find it -- I opened this PR to fix the typo in the spelling of horseshoe , is that what you mean?

https://github.com/pymc-devs/pymc/pull/6928

brendan-m-murphy commented 1 year ago

Hi @thompsonjjet23, no worries, thanks for working on this.

The model in case study 1 is referred to as a "hierarchical regularized horseshoe" (this is the first place I've seen it). In the paragraph before the model specification section of case study 1, it mentions a Half-Student T distribution with subscript 5 and an inverse gamma distribution with parameters (1, 1). But in the model in the following section, 2 is used instead of 5 in the Half-Student T distribution, and the parameters (1, 0.1) are used in the inverse Gamma distribution.

The the blog post in the link in my comment above used 5 instead of 2, and (1, 1) instead of (1, 0.1). If we want to follow what is done in that blog post, then we should change the model parameters, or if we want to keep the model parameters, we should have "HalfStudentT_2" and "InverseGamma(1, 0.1)" in the text.

This is the part of the model specification for case study 1 that doesn't match the text at the end of the "The Model" section:

    # Local shrinkage prior
    lam = pm.HalfStudentT("lam", 2, dims="predictors")
    c2 = pm.InverseGamma("c2", 1, 0.1)
thompsonjjet23 commented 1 year ago

Ah I think I see! Thanks for clarifying @brendan-m-murphy

Is this code change in line with what you're describing?

@larryshamalama let me know if instead I should be re-opening the old pull request (instead of opening a new one like I did here)

brendan-m-murphy commented 1 year ago

@thompsonjjet23 yes that looks good, thanks!