arviz-devs / arviz

Exploratory analysis of Bayesian models with Python
https://python.arviz.org
Apache License 2.0
1.56k stars 388 forks

Fix cmdstanpy converter and CI #2278

Closed OriolAbril closed 8 months ago

OriolAbril commented 10 months ago

Description

Removes all use of the metadata object, now only the attribute presence is checked. That is because the attribute is public, but the class returned is part of the private API, so we should not rely on its information or attributes.
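A minimal sketch of the "check attribute presence only" idea described above. The `FakeNewFit` and `FakeOldFit` classes and the `supports_current_api` helper are hypothetical stand-ins for illustration, not cmdstanpy or ArviZ code:

```python
class FakeNewFit:
    """Hypothetical stand-in for a fit object that exposes `metadata`."""

    # The attribute is public, but the object behind it is treated as private,
    # so nothing below reads its contents.
    metadata = object()


class FakeOldFit:
    """Hypothetical stand-in for a fit object without `metadata`."""


def supports_current_api(fit):
    # Only the presence of the attribute is checked; its value is never inspected.
    return hasattr(fit, "metadata")


new_style = supports_current_api(FakeNewFit())
old_style = supports_current_api(FakeOldFit())
```

The point of the design is that the check keeps working even if the private class behind `metadata` changes its attributes.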

Fixes #2276, related to https://github.com/stan-dev/cmdstanpy/issues/693.

This might break integration with older cmdstanpy versions. I'll try to check and add an informative "not supported" error. Older ArviZ should be used with older cmdstanpy.

I have also updated the test infrastructure a bit to make it easier to regenerate the csv files we have in saved_models folder. We still use those to avoid installing and running stan for inference, but if the folder is missing/deleted, then the files are regenerated and updated automatically before running the tests.
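The regenerate-if-missing behaviour could look roughly like the sketch below. The helper name `ensure_saved_models`, the `fake_regenerate` callback and the file name are assumptions made for illustration; the real test infrastructure would run Stan instead of writing a placeholder file:

```python
import tempfile
from pathlib import Path


def ensure_saved_models(folder, regenerate):
    """Regenerate fixture files only when the folder is missing.

    Returns True if the files were regenerated, False if the folder was reused.
    """
    folder = Path(folder)
    if folder.is_dir():
        return False
    folder.mkdir(parents=True)
    regenerate(folder)
    return True


def fake_regenerate(folder):
    # Hypothetical placeholder for the step that would run inference
    # and write the csv output files.
    (folder / "example_output.csv").write_text("lp__,mu\n-1.0,0.1\n")


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "saved_models"
    first_run = ensure_saved_models(target, fake_regenerate)   # folder missing
    second_run = ensure_saved_models(target, fake_regenerate)  # folder present
    csv_written = (target / "example_output.csv").exists()
```

In a pytest setup this check would typically sit in a session-scoped fixture so it runs once before the tests.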

:books: Documentation preview :books:: https://arviz--2278.org.readthedocs.build/en/2278/

ahartikainen commented 10 months ago

I would not break the older versions.

Let's add similar checks to the ones we have for other older cmdstanpy stuff.

OriolAbril commented 10 months ago

My priority right now is fixing CI so @tomicapretto and the sprint participants can submit PRs and not see everything red. And I don't have a lot of time available so I can't check all cmdstanpy versions to make sure. I do think I have not broken anything though, and all tests do seem to pass now.

codecov[bot] commented 10 months ago

Codecov Report

Merging #2278 (38cd390) into main (d163000) will increase coverage by 0.46%. The diff coverage is 85.00%.

@@            Coverage Diff             @@
##             main    #2278      +/-   ##
==========================================
+ Coverage   87.87%   88.33%   +0.46%     
==========================================
  Files         122      122              
  Lines       12459    12442      -17     
==========================================
+ Hits        10948    10991      +43     
+ Misses       1511     1451      -60     
Files Changed Coverage Δ
arviz/data/io_cmdstanpy.py 77.18% <85.00%> (+16.54%) :arrow_up:

... and 1 file with indirect coverage changes

ahartikainen commented 10 months ago

I will switch off cmdstanpy tests and let's fix this correctly.

OriolAbril commented 10 months ago

sounds good, thanks! it might also be a good idea to start testing this manually on multiple cmdstanpy releases and see how it goes in case it works already

WardBrian commented 10 months ago

I believe this should work as far back as 0.9.77, which added stan_variables and method_variables.

Note that the stan_variables dictionary isn't cached on the cmdstan side, so these calls here could lead to more computation than you'd expect -- it may be worth storing the keys versus multiple calls, especially since it seems like you mostly care about the names
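A small sketch of the suggestion above: call `stan_variables()` once and keep the names, rather than calling it each time a name is needed. `CountingFit` is a hypothetical stand-in that only counts calls; it is not the cmdstanpy class:

```python
class CountingFit:
    """Hypothetical stand-in that records how often stan_variables() is called."""

    def __init__(self):
        self.calls = 0

    def stan_variables(self):
        # In the real object this rebuilds the dictionary on every call.
        self.calls += 1
        return {"mu": [0.1, 0.2], "theta": [1.0, 2.0]}


fit = CountingFit()

# Single call; only the keys are kept.
names = tuple(fit.stan_variables())

# Later lookups reuse the stored names and trigger no further calls.
has_mu = "mu" in names
n_vars = len(names)
```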

lzachmann commented 8 months ago

The changes in this PR work for me. Curious if there's a plan for when this might be folded into a new version of arviz? Having this work out of the box would greatly simplify my cmdstanpy workflow.

OriolAbril commented 8 months ago

I have done some tests locally (expand details tag to see all the results). I did catch an error to trigger the old logic, but there are some versions between 0.9.67 and 0.9.77 that would require a third code path (so we'd have 3 converters in one). I don't think this is worth it maintenance-wise, nor do I think it is worth holding this PR for it.
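For illustration, a try-the-new-path-then-fall-back dispatch might be sketched as follows. Which exception the PR actually catches is not shown in this thread, so `AttributeError` is an assumption, and `NewStyleFit`, `LegacyFit`, `column_names`, `convert_new` and `convert_legacy` are all hypothetical names:

```python
class NewStyleFit:
    """Hypothetical fit exposing the two accessors named in this thread."""

    def stan_variables(self):
        return {"mu": [0.1, 0.2]}

    def method_variables(self):
        return {"lp__": [-1.0, -1.1]}


class LegacyFit:
    """Hypothetical older fit without those accessors."""

    column_names = ("lp__", "mu")


def convert_new(fit):
    return {
        "posterior": sorted(fit.stan_variables()),
        "sample_stats": sorted(fit.method_variables()),
    }


def convert_legacy(fit):
    # Assumed convention: names ending in "__" are sampler statistics.
    stats = [name for name in fit.column_names if name.endswith("__")]
    post = [name for name in fit.column_names if not name.endswith("__")]
    return {"posterior": post, "sample_stats": stats}


def convert(fit):
    try:
        return convert_new(fit)
    except AttributeError:
        # Assumed trigger for the old logic.
        return convert_legacy(fit)


new_result = convert(NewStyleFit())
old_result = convert(LegacyFit())
```

A two-branch dispatch like this only covers objects that match one of the two shapes; anything in between needs its own branch, which is the maintenance cost discussed above.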

The converter is currently broken, so unless someone can take over and fix the converter for all cases, we have to choose between supporting the latest cmdstanpy or the 0.9.77-0.9.67 window. I would like to make a new release next week. If we have a better fix by then I will use that (or wait for that other PR before releasing); otherwise I will merge this PR so we can release the fix for the latest cmdstanpy.

Moreover, given that out of these 3 converters (already happens with our 2nd one) we'd only test the newest one, it is quite possible that we might break them without realizing.

```
❯ python -c "import cmdstanpy; print(cmdstanpy.__version__)"
1.2.0
❯ pytest arviz/tests/external_tests/test_data_cmdstanpy.py
==================== test session starts ====================
platform linux -- Python 3.10.12, pytest-7.4.0, pluggy-1.3.0
rootdir: /home/oriol/Documents/repos_oss/arviz
configfile: pytest.ini
plugins: anyio-4.0.0
collected 3 items

arviz/tests/external_tests/test_data_cmdstanpy.py ...    [3/3]

==================== slowest 20 durations ====================
0.13s call     arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_inference_data
0.12s call     arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_inference_data_warmup
0.09s setup    arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_sampler_stats
0.03s call     arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_sampler_stats

(5 durations < 0.005s hidden. Use -vv to show these durations.)
==================== 3 passed in 0.42s ====================
```

```
❯ python -c "import cmdstanpy; print(cmdstanpy.__version__)"
1.1.0
❯ pytest arviz/tests/external_tests/test_data_cmdstanpy.py
==================== test session starts ====================
platform linux -- Python 3.10.12, pytest-7.4.0, pluggy-1.3.0
rootdir: /home/oriol/Documents/repos_oss/arviz
configfile: pytest.ini
plugins: anyio-4.0.0
collected 3 items

arviz/tests/external_tests/test_data_cmdstanpy.py ...    [3/3]

==================== slowest 20 durations ====================
0.17s call     arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_inference_data
0.15s call     arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_inference_data_warmup
0.10s setup    arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_sampler_stats
0.03s call     arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_sampler_stats

(5 durations < 0.005s hidden. Use -vv to show these durations.)
==================== 3 passed in 0.55s ====================
```

```
❯ python -c "import cmdstanpy; print(cmdstanpy.__version__)"
1.0.8
❯ pytest arviz/tests/external_tests/test_data_cmdstanpy.py
==================== test session starts ====================
platform linux -- Python 3.10.12, pytest-7.4.0, pluggy-1.3.0
rootdir: /home/oriol/Documents/repos_oss/arviz
configfile: pytest.ini
plugins: anyio-4.0.0
collected 3 items

arviz/tests/external_tests/test_data_cmdstanpy.py ...    [3/3]

==================== slowest 20 durations ====================
0.16s call     arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_inference_data
0.15s call     arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_inference_data_warmup
0.09s setup    arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_sampler_stats
0.04s call     arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_sampler_stats

(5 durations < 0.005s hidden. Use -vv to show these durations.)
==================== 3 passed in 0.45s ====================
```

```
❯ python -c "import cmdstanpy; print(cmdstanpy.__version__)"
1.0.0
❯ pytest arviz/tests/external_tests/test_data_cmdstanpy.py
==================== test session starts ====================
platform linux -- Python 3.10.12, pytest-7.4.0, pluggy-1.3.0
rootdir: /home/oriol/Documents/repos_oss/arviz
configfile: pytest.ini
plugins: anyio-4.0.0
collected 3 items

arviz/tests/external_tests/test_data_cmdstanpy.py ...    [3/3]

==================== slowest 20 durations ====================
0.16s call     arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_inference_data
0.15s call     arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_inference_data_warmup
0.06s setup    arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_sampler_stats
0.03s call     arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_sampler_stats

(5 durations < 0.005s hidden. Use -vv to show these durations.)
==================== 3 passed in 0.41s ====================
```

```
❯ python -c "import cmdstanpy; print(cmdstanpy.__version__)"
0.9.77
deleting tmpfiles dir: /tmp/tmpi6vlbm1v
done
❯ pytest arviz/tests/external_tests/test_data_cmdstanpy.py
==================== test session starts ====================
platform linux -- Python 3.10.12, pytest-7.4.0, pluggy-1.3.0
rootdir: /home/oriol/Documents/repos_oss/arviz
configfile: pytest.ini
plugins: anyio-4.0.0
collected 3 items

arviz/tests/external_tests/test_data_cmdstanpy.py ...    [3/3]

==================== slowest 20 durations ====================
0.15s call     arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_inference_data
0.14s call     arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_inference_data_warmup
0.04s setup    arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_sampler_stats
0.03s call     arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_sampler_stats

(5 durations < 0.005s hidden. Use -vv to show these durations.)
==================== 3 passed in 0.38s ====================
```

```
❯ python -c "import cmdstanpy; print(cmdstanpy.__version__)"
0.9.75
deleting tmpfiles dir: /tmp/tmpqik5kt4k
done
❯ pytest arviz/tests/external_tests/test_data_cmdstanpy.py
==================== test session starts ====================
platform linux -- Python 3.10.12, pytest-7.4.0, pluggy-1.3.0
rootdir: /home/oriol/Documents/repos_oss/arviz
configfile: pytest.ini
plugins: anyio-4.0.0
collected 3 items

arviz/tests/external_tests/test_data_cmdstanpy.py FFF    [3/3]

==================== FAILURES ====================
[...]
E AttributeError: 'CmdStanMCMC' object has no attribute 'method_variables'. Did you mean: 'stan_variables'?
```

```
❯ python -c "import cmdstanpy; print(cmdstanpy.__version__)"
0.9.68
deleting tmpfiles dir: /tmp/tmpe8a83eng
done
❯ pytest arviz/tests/external_tests/test_data_cmdstanpy.py
==================== test session starts ====================
platform linux -- Python 3.10.12, pytest-7.4.0, pluggy-1.3.0
rootdir: /home/oriol/Documents/repos_oss/arviz
configfile: pytest.ini
plugins: anyio-4.0.0
collected 3 items

arviz/tests/external_tests/test_data_cmdstanpy.py FFF    [3/3]

==================== FAILURES ====================
[...]
E AttributeError: 'CmdStanMCMC' object has no attribute 'method_variables'. Did you mean: 'stan_variables'?
```

```
❯ python -c "import cmdstanpy; print(cmdstanpy.__version__)"
0.9.67
deleting tmpfiles dir: /tmp/tmpkir40uv9
done
❯ pytest arviz/tests/external_tests/test_data_cmdstanpy.py
==================== test session starts ====================
platform linux -- Python 3.10.12, pytest-7.4.0, pluggy-1.3.0
rootdir: /home/oriol/Documents/repos_oss/arviz
configfile: pytest.ini
plugins: anyio-4.0.0
collected 3 items

arviz/tests/external_tests/test_data_cmdstanpy.py ...    [3/3]

==================== slowest 20 durations ====================
0.13s call     arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_inference_data
0.12s call     arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_inference_data_warmup
0.04s setup    arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_sampler_stats
0.03s call     arviz/tests/external_tests/test_data_cmdstanpy.py::TestDataCmdStanPy::test_sampler_stats

(5 durations < 0.005s hidden. Use -vv to show these durations.)
==================== 3 passed in 0.33s ====================
deleting tmpfiles dir: /tmp/tmpm6fr7m64
done
```

ahartikainen commented 8 months ago

Ok, then I would say that we just go forward and support only the latest API.

Is there something we are going to lose compared to older transformer?

WardBrian commented 8 months ago

If someone requires a CmdStanPy from over two years ago, they are probably also going to be using an older ArviZ, to be fair.

ahartikainen commented 8 months ago

Yeah, and updating cmdstanpy should be easy and we always have support for cmdstan.

What we probably need to figure out is the tuple parameters (we probably want to unpack them, and we also need a dtype extraction function).

WardBrian commented 8 months ago

I updated the stanio package to support numpy custom dtypes for tuples, so cmdstanpy can as well if we just accept extra kwargs in a few places.
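As a rough illustration of what unpacking a tuple parameter stored with a numpy custom (structured) dtype could involve. The field names `f0`/`f1`, the `pair` variable and the `unpack_tuple` helper are assumptions for this sketch; they are not taken from stanio, cmdstanpy or ArviZ:

```python
import numpy as np

# Hypothetical draws of a tuple(real, int) parameter, one record per draw.
tuple_dtype = np.dtype([("f0", np.float64), ("f1", np.int64)])
draws = np.array([(0.5, 1), (1.5, 2), (2.5, 3)], dtype=tuple_dtype)


def unpack_tuple(name, values):
    """Split a structured array into one plain array per tuple element."""
    return {f"{name}.{field}": values[field] for field in values.dtype.names}


unpacked = unpack_tuple("pair", draws)

# A dtype extraction step can read the per-field dtypes from the structured dtype.
field_dtypes = {field: tuple_dtype[field] for field in tuple_dtype.names}
```

Each unpacked entry is a regular homogeneous array, which is the kind of data a labelled array container expects.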

ahartikainen commented 8 months ago

I will close this for now. But I like the changes in tests, maybe we could add them?