BayesianModelingandComputationInPython / BookCode_Edition1

https://www.bayesiancomputationbook.com
GNU General Public License v2.0

Issue on page /notebooks/chp_01.html #191

Open I-Mohiuddin opened 2 years ago

I-Mohiuddin commented 2 years ago

Code is broken for listing 1.6 (predictive_distributions):

pred_dists = (pm.sample_prior_predictive(1000, model)["y_obs"],
              pm.sample_posterior_predictive(idata, 1000, model)["y_obs"])

Neither the first nor the second part of the tuple functions properly.

For the first part of the tuple, the error I received was:

AttributeError: 'InferenceData' object has no attribute 'y_obs'

This was simple enough to correct, and it worked after I changed the line to `pm.sample_prior_predictive(1000, model).prior_predictive["y_obs"]`.

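In context, that fix looks something like this (a quick sketch from my Colab run; `prior_pred` and `y_obs_prior` are just names for illustration):

# In PyMC v4 the call returns an InferenceData object, so the draws live in
# the prior_predictive group rather than in a plain dict keyed by variable name.
prior_pred = pm.sample_prior_predictive(1000, model)
y_obs_prior = prior_pred.prior_predictive["y_obs"]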
For the second part of the tuple there were several errors. The initial one was:

IncorrectArgumentsError: Should not specify both keep_size and samples arguments. See the docstring of the samples argument for more details.

I looked online and then added the argument keep_size=False to my code: `pm.sample_posterior_predictive(idata, 1000, model, keep_size=False)["y_obs"]`.

However, this now generates the following error:

ValueError: conflicting sizes for dimension 'chain': length 1 on the data but length 2 on coordinate 'chain'

I'm sure there is a simple fix, and I don't think it will stop me from continuing to use this book, but these errors definitely break the immersion in the learning process. Hopefully this can be resolved quickly.

canyon289 commented 2 years ago

Apologies for the issue. Are you using the code versions from the environment file?

canyon289 commented 2 years ago

To be specific, the package versions?

I-Mohiuddin commented 2 years ago

I am just using Google Colab to work along with the book, so I recognize that there might be package issues with PyMC 4. With that being said, I think the code should be compatible for users who are not using the exact environment specified.

pymc version: 4.1.4, arviz version: 0.12.1

After googling around, I did find a way to get to more or less the same result with these package versions. I also had to change the plotting code a bit so I could check that it worked properly.

Here is the code that I used:

thinned_idata = idata.sel(draw=slice(None, None, 2))  # I was sampling from two chains

with model:
    idata.extend(pm.sample_posterior_predictive(thinned_idata))

    pred_dists = (pm.sample_prior_predictive().prior_predictive["y_obs"].squeeze(),
                  pm.sample_posterior_predictive(thinned_idata).posterior_predictive["y_obs"].values.reshape(1, 1000, 20).squeeze())

fig, axes = plt.subplots(4, 1, figsize=(9, 9))

for idx, n_d, dist in zip((1, 3), ("Prior", "Posterior"), pred_dists):
    az.plot_dist(dist.sum(axis=1), hist_kwargs={"color": "0.5", "bins": range(0, 22)},
                 ax=axes[idx])
    axes[idx].set_title(f"{n_d} predictive distribution", fontweight='bold')
    axes[idx].set_xlim(-1, 21)
    axes[idx].set_ylim(0, 0.15)
    axes[idx].set_xlabel("number of success")

# Had to change this line: PyMC now recommends draw() rather than .random() to sample from a distribution
az.plot_dist(pm.draw(θ, 1000), plot_kwargs={"color": "0.5"},
             fill_kwargs={'alpha': 1}, ax=axes[0])
axes[0].set_title("Prior distribution", fontweight='bold')
axes[0].set_xlim(0, 1)
axes[0].set_ylim(0, 4)
axes[0].tick_params(axis='both', pad=7)
axes[0].set_xlabel("θ")

az.plot_dist(idata.posterior["θ"], plot_kwargs={"color": "0.5"},
             fill_kwargs={'alpha': 1}, ax=axes[2])
axes[2].set_title("Posterior distribution", fontweight='bold')
axes[2].set_xlim(0, 1)
axes[2].set_ylim(0, 5)
axes[2].tick_params(axis='both', pad=7)
axes[2].set_xlabel("θ")

Here is a link to the image https://ibb.co/30WG20g

There is likely a way to make this code more concise. I find working with InferenceData objects incredibly frustrating, as I do not use xarray frequently and the documentation is not the most helpful.

If there is a better way to format this code so that it is not so verbose, please let me know!

canyon289 commented 2 years ago

Ah, that is why, then. Unfortunately there is no guarantee that the code will work with any version; even if we didn't change anything in PyMC, the other libraries could change their APIs.

xarray is a bit dense, but we're working on making it easier, as are they.
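For what it's worth, one way to avoid the manual reshape is to stack the chain and draw dimensions with xarray. A rough sketch against the code above (not tested with the pinned book versions; `post_pred` and `y_obs_post` are just illustrative names):

with model:
    post_pred = pm.sample_posterior_predictive(thinned_idata)

y_obs_post = (
    post_pred.posterior_predictive["y_obs"]
    .stack(sample=("chain", "draw"))  # collapse chain/draw into one "sample" dim
    .transpose("sample", ...)         # samples first, observations second
    .values                           # plain NumPy array, e.g. shape (n_samples, 20)
)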

I-Mohiuddin commented 2 years ago

Thanks, I appreciate the responses. Are there any plans at the moment for updating the code in the book to reflect the changes in PyMC 4?

If not, do you know of any other recommended resources for working with PyMC 4 and Aesara tensors (other than the PyMC docs)?

Thanks!

dsantiago commented 1 year ago

I agree we can merge this update for now, as the notebook really is broken right now. I would also suggest, if possible, testing all the notebooks in PyMC 4 and slowly making the conversion, so that newcomers (like me) can at least run the book examples with no problem.

canyon289 commented 1 year ago

All the notebooks will work with the versions pinned here.

https://github.com/BayesianModelingandComputationInPython/BookCode_Edition1/blob/main/environment.yml
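If you want to double-check what is actually installed against those pins, a quick sanity check like this works (not part of the book code; the import name depends on whether the environment uses pymc3 or the newer pymc package):

# Print the installed versions to compare against the pins in environment.yml.
import importlib

for name in ("pymc3", "pymc", "arviz"):
    try:
        module = importlib.import_module(name)
        print(f"{name}: {module.__version__}")
    except ImportError:
        print(f"{name}: not installed")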

Constantly upgrading the textbook to new versions would be infeasible; to that point, PyMC v4 isn't even the latest version anymore. We have released PyMC v5 as well as multiple versions of ArviZ. We wouldn't be able to keep up with the speed at which we develop the libraries.

The concepts, however, are timeless. Once you learn them, you're good for life!

Coming back to the code: at some point we will do a large refresh, timing TBD ;)

dsantiago commented 1 year ago

Ohh, I see... I never thought it was that fast-paced... Anyway, thanks for taking the time to explain it; this will help future people with the same problem.

canyon289 commented 1 year ago

Of course! We're here to help people learn. Colab is also going to get updated soon, so while convenient, it's not a stable code environment over the long run.

Were you able to create an environment with miniconda using the instructions in the README?

dsantiago commented 1 year ago

Yeah, I already had a conda env for studying Bayesian modeling but, of course, I installed everything at the latest versions... Doing the downgrade for now...