Split convert observed data

michaelosthege commented 1 month ago

Description

convert_observed_data is a messy do-it-all conversion function that was recently refactored in #7299.

This PR takes the next step and splits it such that generator data is treated separately.

Related Issue

Makes it easier to work on #7277.

Checklist

[x] Checked that the pre-commit linting/style checks pass
[x] Included tests that prove the fix is effective or that the new feature works
[x] Added necessary documentation (docstrings and/or example notebooks)
[x] If you are a pro: each commit corresponds to a relevant logical change

Type of change

[ ] New feature / enhancement
[x] Bug fix
[ ] Documentation
[x] Maintenance
[x] Other (please specify):
- Generator data is no longer supported with pm.Data.
- Generator data of integer type is no longer converted to floatX, but only to intX.

📚 Documentation preview 📚: https://pymc--7334.org.readthedocs.build/en/7334/

ricardoV94 commented 1 month ago

I'm still of the opinion we should drop generators altogether. I don't find any documentation on how to use it, and I can't really think of all the implications this Op means inside PyTensor, which is supposed to be side-effects free unless told otherwise (it's not being told so here). It's also only applicable in the default backend / and even here only if you have models without Deterministics (or any workflow where separate functions are compiled from the same underlying graph).

If this is useful in VI, the machinery that creates and call functions can use regular Python code to work with generators (such as setting a shared variable to the next value in every iteration of the loop). I don't see why we need this magic at the PyTensor level.

michaelosthege commented 1 month ago

I'll add a depreaction warning to the conversion of generator data, asking people to get in touch if this is relevant.

michaelosthege commented 1 month ago

[x] Rebased
[x] Added deprecation warning
[x] Hopefully fixed the tests by defaulting to floatX conversion when user-provided data does not carry dtype information.

codecov[bot] commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 92.45%. Comparing base (a197b19) to head (d9bd20c). Report is 2 commits behind head on main.

Additional details and impacted files

[![Impacted file tree graph](https://app.codecov.io/gh/pymc-devs/pymc/pull/7334/graphs/tree.svg?width=650&height=150&src=pr&token=JFuXtOJ4Cb&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pymc-devs)](https://app.codecov.io/gh/pymc-devs/pymc/pull/7334?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pymc-devs) ```diff @@ Coverage Diff @@ ## main #7334 +/- ## ========================================== - Coverage 92.47% 92.45% -0.03% ========================================== Files 102 102 Lines 17188 17180 -8 ========================================== - Hits 15894 15883 -11 - Misses 1294 1297 +3 ``` | [Files](https://app.codecov.io/gh/pymc-devs/pymc/pull/7334?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pymc-devs) | Coverage Δ | | |---|---|---| | [pymc/data.py](https://app.codecov.io/gh/pymc-devs/pymc/pull/7334?src=pr&el=tree&filepath=pymc%2Fdata.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pymc-devs#diff-cHltYy9kYXRhLnB5) | `89.44% <100.00%> (+0.13%)` | :arrow_up: | | [pymc/pytensorf.py](https://app.codecov.io/gh/pymc-devs/pymc/pull/7334?src=pr&el=tree&filepath=pymc%2Fpytensorf.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pymc-devs#diff-cHltYy9weXRlbnNvcmYucHk=) | `90.38% <100.00%> (-0.53%)` | :arrow_down: | ... and [20 files with indirect coverage changes](https://app.codecov.io/gh/pymc-devs/pymc/pull/7334/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pymc-devs)

pymc-devs / pymc