Open tacaswell opened 2 years ago
x[:, None] # fails with ValueError
This goes through FloatingArray.__getitem__, which raises because 2D FloatingArrays are not supported (much to my consternation). (Actually, on master this returns a 2D object-dtype ndarray, which I could imagine being worse.)
z[:, None]
This goes through ndarray.__getitem__, which works just how you would expect.
IIRC the plan is to enforce the deprecation in 2.0 so these would both raise.
I think we need to revisit this and make sure we are not both grumbling about the other and putting in dueling workarounds ;)
Do you need series[:, None] to work, or will this deprecation being enforced be OK on your end?
We currently have a bunch of complicated warning contexts / try ... excepts to handle this; if both raised we would be perfectly happy! The issue is that we were not catching ValueError, so a spurious error made it out to our user. I would want either, in the FloatingArray case, the series __getitem__ to catch the ValueError and re-raise an IndexingError, or to have assurances that when the deprecation goes through both raise ValueError.
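To make the failure mode concrete, here is a minimal sketch of the kind of probe being described: try 2D indexing and treat any of several exception types as "not supported". The helper name `supports_2d_indexing` and the stand-in classes are hypothetical and exist only for illustration; this is not Matplotlib's actual code, and the exact set of exceptions to catch is an assumption.

```python
def supports_2d_indexing(obj):
    """Return True if obj[:, None] succeeds and yields a 2D result.

    Hypothetical helper for illustration only.
    """
    try:
        result = obj[:, None]
    except (ValueError, IndexError, TypeError, AssertionError):
        # Any of these is treated as "2D indexing is not supported".
        # Leaving ValueError out of this tuple is what would let a
        # spurious error escape to the user.
        return False
    return getattr(result, "ndim", None) == 2


class RaisesValueError:
    """Stand-in for a container whose 2D indexing raises ValueError."""

    def __getitem__(self, key):
        raise ValueError("multi-dimensional indexing is not supported")


class Fake2DResult:
    """Stand-in for a 2D result object."""

    ndim = 2


class Accepts2DIndexing:
    """Stand-in for a container that handles obj[:, None]."""

    def __getitem__(self, key):
        return Fake2DResult()
```

With a single agreed-upon exception type, the `except` clause could shrink to one entry, which is the simplification being asked for here.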
The context we are using this in is https://github.com/matplotlib/matplotlib/blob/8d7a2b9d2a38f01ee0d6802dd4f9e98aec812322/lib/matplotlib/cbook/__init__.py#L1301-L1343 . We are using the failure of series[:, None] to identify that we have been passed a pandas object, and we make no further use of the result.
We are also open to better suggestions about how to handle this (without importing pandas). I guess we would do some if 'pandas' in sys.modules logic, but that seems a bit messy.
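For reference, the sys.modules approach might look roughly like the sketch below. The helper name `is_pandas_object` is hypothetical, and the choice of classes to test against (Series, Index, DataFrame) is an assumption about which pandas objects matter here.

```python
import sys


def is_pandas_object(obj):
    """Check for a pandas object without importing pandas ourselves.

    Hypothetical helper for illustration only.
    """
    pd = sys.modules.get("pandas")
    if pd is None:
        # pandas was never imported in this process, so obj cannot
        # be an instance of a pandas class.
        return False
    return isinstance(obj, (pd.Series, pd.Index, pd.DataFrame))
```

This never triggers an import of pandas; it only inspects the module if the user's code has already loaded it.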
https://github.com/pandas-dev/pandas/pull/30588 is probably also relevant.
if both raised we would be perfectly happy [...] or to have assurances that when the deprecation goes through both raise ValueError
Both will raise in 2.0. I'm not sure if we have pinned down exactly what type of exception will be raised, but ValueError seems like a reasonable option.
We are also open to better suggestions about how to handle this (without importing pandas). I guess we would do some if 'pandas' in sys.modules logic, but that seems a bit messy.
With the appropriate caveats about relying on implementation details: most (non-scalar) pandas classes have a "_typ: str" attribute that is used to bootstrap isinstance checks without circular imports.
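A duck-typing check built on that attribute could be sketched as follows. The helper name `looks_like_pandas` and the `FakeSeries` stand-in are hypothetical, and the specific `_typ` strings listed are my understanding of pandas' private values, not a documented or complete list, so treat them as assumptions that may change between versions.

```python
# Assumed values of the private ``_typ`` attribute; pandas does not
# document these, and Index subclasses use additional values.
ASSUMED_PANDAS_TYP_VALUES = ("series", "dataframe", "index")


def looks_like_pandas(obj):
    """Guess whether obj is a pandas container via its ``_typ`` attribute.

    Hypothetical helper for illustration only; relies on a pandas
    implementation detail.
    """
    return getattr(obj, "_typ", None) in ASSUMED_PANDAS_TYP_VALUES


class FakeSeries:
    """Stand-in that mimics the ``_typ`` attribute of a Series."""

    _typ = "series"
```

Like the sys.modules idea, this avoids importing pandas, at the cost of depending on a private attribute.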
To add a new dimension to the existing data frame, we could go through .values. Indexing the Series directly can fail to add the dimension; everything depends on the Series dtype we use. The Float dtype has this issue, the Integer dtype does not.
Pandas version checks
[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pandas.
[ ] I have confirmed this bug exists on the master branch of pandas.
Reproducible Example
Issue Description
Per discussion around https://github.com/pandas-dev/pandas/issues/35527 / https://github.com/matplotlib/matplotlib/issues/18158 and related links, there have been a bunch of issues about multi-dimensional indexing into Series and objects that have inconsistent .ndim and .shape. In a departure from numpy, it was my understanding that the implicit broadcasting to higher dimensions was going to be dropped by pandas at some point in the future (although it seems to still warn for builtin types). However, for the new missing-data types, trying to do this 2D slicing raises a ValueError, which was reported as a bug to Matplotlib: https://github.com/matplotlib/matplotlib/issues/22125
Given
I think we need to revisit this and make sure we are not both grumbling about the other and putting in dueling workarounds ;)
Expected Behavior
Behavior of the container to be independent of the contained type.
Installed Versions