pymc-devs / pymc-examples

Examples of PyMC models, including a library of Jupyter notebooks.
https://www.pymc.io/projects/examples/en/latest/
MIT License

Unable to Replicate the sampling speed given in the example #466

Open soumyasahu opened 1 year ago

soumyasahu commented 1 year ago

Hi,

I am very new to PyMC. I have mostly used rstan for the Bayesian computations in my research; it is popular mainly for its implementation of the No-U-Turn Sampler (NUTS), a widely used extension of HMC. We are currently developing a model that involves Dirichlet process mixture (DPM) priors. Implementing a DPM is tricky in Stan because it cannot sample discrete parameters. The problem can be solved by marginalizing the discrete parameters out of the likelihood, which requires evaluating the likelihood for every value those parameters can take. This makes the sampler very slow.
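To make the cost of marginalization concrete, here is a minimal sketch for a finite (truncated) normal mixture, using only the standard library. The function names, the truncation to K = 2 components, and the data values are all hypothetical and chosen for illustration; this is not Stan or PyMC code.

```python
import math

def log_normal_pdf(x, mu, sigma):
    """Log density of Normal(mu, sigma) evaluated at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def marginal_loglik(xs, weights, mus, sigmas):
    """Sum over observations of log sum_k w_k * Normal(x | mu_k, sigma_k).

    The discrete component label of each observation is summed out,
    so every observation costs K density evaluations.
    """
    total = 0.0
    for x in xs:
        total += logsumexp([
            math.log(w) + log_normal_pdf(x, mu, s)
            for w, mu, s in zip(weights, mus, sigmas)
        ])
    return total

# Hypothetical data and parameters (illustration only).
xs = [-1.2, 0.3, 4.1]
loglik = marginal_loglik(xs, weights=[0.6, 0.4], mus=[0.0, 4.0], sigmas=[1.0, 1.0])
print(loglik)
```

With N observations and K components, each likelihood evaluation costs N × K density evaluations, which is where the slowdown comes from as K grows.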

Recently, I came across pymc4 and found that it can sample discrete parameters even when NUTS is used. To understand how to use it, I looked at the following page: https://www.pymc.io/blog/v4_announcement.html

I tried to replicate the examples on a freshly installed Anaconda 3, using the Jupyter notebook interface. Unfortunately, drawing 8000 samples with pymc3 takes 3 hours, whereas the webpage shows it running in 23 seconds. The same thing happens with pymc4.
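For scale, the gap between the two timings can be worked out directly from the numbers above. The split of the 8000 draws into 4 chains of 1000 tuning and 1000 posterior draws is an assumption about the sampler settings, not something stated in this thread.

```python
# Numbers taken from the report above.
total_draws = 8000
reported_seconds = 23            # runtime shown on the webpage
observed_seconds = 3 * 60 * 60   # roughly 3 hours on the laptop

# Assumed sampler settings that would give 8000 draws in total.
chains, tune, draws = 4, 1000, 1000
assert chains * (tune + draws) == total_draws

reported_rate = total_draws / reported_seconds
observed_rate = total_draws / observed_seconds
slowdown = observed_seconds / reported_seconds

print(f"reported: {reported_rate:.0f} draws/s")
print(f"observed: {observed_rate:.2f} draws/s")
print(f"slowdown: about {slowdown:.0f}x")
```

A gap of several hundred times is far larger than a hardware difference would normally explain, which is consistent with a problem in the installation.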

Specs for my laptop: Windows 11, 8 cores, 16 GB RAM.

Please let me know what I should do to replicate the example correctly.

OriolAbril commented 1 year ago

Which example are you trying to replicate? Maybe it is one of the jax sampling ones? Also, do you see any warning when importing pymc? I'd guess there is some kind of installation issue but I don't really know
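One possible installation problem is a missing C++ compiler: Theano and Aesara warn at import time when g++ is not found and fall back to much slower Python implementations. A rough way to collect the relevant facts before importing anything is sketched below, using only the standard library. The list of distribution names and the choice of `g++` as the compiler to look for are assumptions, and `environment_report` is a hypothetical helper, not part of PyMC.

```python
import importlib.metadata
import shutil

def installed_version(package):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        return None

def environment_report(packages=("pymc", "pymc3", "aesara", "theano-pymc"),
                       compiler="g++"):
    """Collect package versions and the compiler path into a plain dict."""
    report = {name: installed_version(name) for name in packages}
    report[compiler] = shutil.which(compiler)  # None if not on PATH
    return report

for key, value in environment_report().items():
    print(f"{key:12s} {value if value is not None else 'NOT FOUND'}")
```

If the compiler line prints NOT FOUND, that would be worth mentioning together with any warnings shown on import.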

soumyasahu commented 1 year ago

Hi @OriolAbril,

You are correct, I have faced some installation issues. First, let me clarify the code I tried:

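# `pm` is the module to build with: pymc3 (imported as pm3) or pymc (imported as pm).
# `coords`, `county_idx`, and `data` are not defined here; they come from the
# data-loading code of the example being replicated.
def build_model(pm):
    with pm.Model(coords=coords) as hierarchical_model: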
        # Intercepts, non-centered
        mu_a = pm.Normal("mu_a", mu=0.0, sigma=10)
        sigma_a = pm.HalfNormal("sigma_a", 1.0)
        a = pm.Normal("a", dims="county") * sigma_a + mu_a

        # Slopes, non-centered
        mu_b = pm.Normal("mu_b", mu=0.0, sigma=2.)
        sigma_b = pm.HalfNormal("sigma_b", 1.0)
        b = pm.Normal("b", dims="county") * sigma_b + mu_b

        eps = pm.HalfNormal("eps", 1.5)

        radon_est = a[county_idx] + b[county_idx] * data.floor.values

        radon_like = pm.Normal(
            "radon_like", mu=radon_est, sigma=eps, observed=data.log_radon, 
            dims="obs_id"
        )

    return hierarchical_model

I then ran the following:

model_pymc3 = build_model(pm3)

%%time
with model_pymc3:
    idata_pymc3 = pm3.sample(target_accept=0.9, return_inferencedata=True)

and

model_pymc4 = build_model(pm)

%%time
with model_pymc4:
    idata_pymc4 = pm.sample(target_accept=0.9)

Now, regarding the installation problem: to install pymc4, I ran conda create -c conda-forge -n pymc_env "pymc>=4". This started the installation, and the following file was created: 4.txt

The file looks interactive, but I don't know of any way to interact with it, and the installation stayed stuck indefinitely.

I then installed pymc4 with the command pip install "pymc>=4", which seemed to work fine, as it did not show any error message.

To install pymc3, I used pip install pymc3, as mentioned in the installation guide. It gave me the following error, although importing the package did not show any problem: error_pymc3

It would be very helpful if you could tell me whether the packages are installed incorrectly, or guide me in solving the slow sampling issue.

ricardoV94 commented 1 year ago

@soumyasahu please open an issue in our discourse at https://discourse.pymc.io/

We try to keep GitHub only for bugs and development issues.