Closed edvos-sw closed 2 years ago
Is there a reason you need to read from a file you just overwrote? Why not just use the df that's already in memory since it should be identical? Asking because code is built around real-use cases.
The code was just for reproduction purposes. It happens when I read any parquet file.
On Mon, 2 May 2022 at 23:44, RyuuOujiXS @.***> wrote:
Is there a reason you need to read from a file you just overwrote? Why not just use the df that's already in memory since it should be identical? Asking because code is built around real-use cases.
I have the same issue: appending a column to the index works fine in a normal run, but fails in debug mode. I'm using Spyder 5.3.0 on Windows with pandas 1.4.2.
I've created some dummy code that shows the problem:
import pandas as pd
df = pd.DataFrame({"a": [1, 2, 3], "b": [100, 200, 300], "c": ["a", "b", "c"]})
df.set_index("a", inplace=True)
df.set_index("b", append=True, inplace=True)
print(df)
print(df.index)
Running this without debugging returns ✔️ :
       c
a b
1 100  a
2 200  b
3 300  c
MultiIndex([(1, 100),
            (2, 200),
            (3, 300)],
           names=['a', 'b'])
Running this with debugging in Spyder returns ❌ :
Traceback (most recent call last):
File "C:\Users\username\Miniconda3\envs\some-env\lib\site-packages\spyder_kernels\customize\spyderpdb.py", line 776, in run
super(SpyderPdb, self).run(cmd, globals, locals)
File "C:\Users\username\Miniconda3\envs\some-env\lib\bdb.py", line 597, in run
exec(cmd, globals, locals)
File "c:\users\username\path\temp.py", line 6, in <module>
df.set_index("b", append=True, inplace=True)
File "C:\Users\username\Miniconda3\envs\some-env\lib\site-packages\pandas\util\_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "C:\Users\username\Miniconda3\envs\some-env\lib\site-packages\pandas\core\frame.py", line 5560, in set_index
index._cleanup()
File "C:\Users\username\Miniconda3\envs\some-env\lib\site-packages\pandas\core\indexes\base.py", line 843, in _cleanup
self._engine.clear_mapping()
File "pandas\_libs\properties.pyx", line 37, in pandas._libs.properties.CachedProperty.__get__
File "C:\Users\username\Miniconda3\envs\some-env\lib\site-packages\pandas\core\indexes\multi.py", line 1097, in _engine
return MultiIndexUIntEngine(self.levels, self.codes, offsets)
File "pandas\_libs\index.pyx", line 635, in pandas._libs.index.BaseMultiIndexCodesEngine.__init__
File "C:\Users\username\Miniconda3\envs\some-env\lib\site-packages\pandas\core\indexes\multi.py", line 136, in _codes_to_ints
codes <<= self.offsets
AttributeError: 'MultiIndex' object has no attribute 'offsets'
I thought I would work around it with:
# df.set_index("b", append=True, inplace=True)
df = df.reset_index().set_index(["a", "b"])
But the same issue persists.
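For reference, outside the debugger the two forms above should build the same index, so the workaround changes how the MultiIndex is constructed but not what it contains. A minimal sketch to confirm this, reusing the dummy frame from this comment (the variable names `appended` and `rebuilt` are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [100, 200, 300], "c": ["a", "b", "c"]})

# Original approach: set "a" as the index, then append "b" as a second level.
appended = df.set_index("a").set_index("b", append=True)

# Attempted workaround: move "a" back to a column, then set both levels at once.
rebuilt = df.set_index("a").reset_index().set_index(["a", "b"])

# Compare the resulting indexes and the full frames.
same_index = appended.index.equals(rebuilt.index)
same_frame = appended.equals(rebuilt)
print(same_index, same_frame)
```

Since both paths end in a MultiIndex, it is not surprising that the workaround hits the same error under the debugger.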
Yeah, it seems to be an issue with debugging in the latest version of Spyder. Maybe it is Spyder and not pandas.
Have you reported this to Spyder? Debugging that code with pdb works fine for me.
As it only appears to happen with spyder
I agree it's probably their issue. However, the error does appear to come from the pandas codebase, so perhaps it's good to have it here as well?
Yep, seems like a problem with pandas, Spyder and Python 3.10. @MarcoGorelli, what version of Python did you use?
3.8
can you try with python 3.10?
The Spyder issue was closed as:
... was able to reproduce it in terminal IPython, I think this is not a Spyder problem but a Pandas one.
The Spyder issue has an environment specification that reproduces this issue. Is there anything else I can provide to help resolve this issue?
Same problem here with read_feather from pandas
can you try with python 3.10?
Thanks - yup, can reproduce with Python3.10!
To reproduce:
1. create a file `t.py` with:
import pandas as pd
df = pd.DataFrame({"a": [1, 2, 3], "b": [100, 200, 300], "c": ["a", "b", "c"]})
df.set_index("a", inplace=True)
import ipdb; ipdb.set_trace()
df.set_index("b", append=True, inplace=True)
2. make sure you have `ipdb` installed
3. run `python t.py`, and at the breakpoint, press `n`
we get
(venv310) marcogorelli@OVMG025 tmp % python t.py
> /Users/marcogorelli/tmp/t.py(7)<module>()
      6 import ipdb; ipdb.set_trace()
----> 7 df.set_index("b", append=True, inplace=True)
      8
ipdb> n
AttributeError: 'MultiIndex' object has no attribute 'offsets'
---
Note: this only happens with `ipdb`, not with `pdb` - so perhaps the issue is there?
So I cannot debug in Spyder when working with pandas in an environment? This is a major problem, considering I'm working on a big project and have to check some functions independently.
yep, that's the problem
So I cannot debug in Spyder when working with pandas in an environment?
Well, that's a bit of a broad statement... As stated in this comment:
As a workaround please use Python <=3.9
So, if you simply specify python=3.9 in your conda environment then this issue should not occur.
This looks related to #41935. Is this still an issue in 1.4.3?
I have recreated the environment linked to earlier, updated pandas to 1.4.3 and updated spyder-kernels to 2.3.2.
Then I tested the code snippet I posted earlier. This now works as expected.
I also installed ipdb and tested MarcoGorelli's example. This now also works as expected.
So it seems this issue is resolved, thanks!
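For anyone wanting to check an environment against what was reported in this thread, here is a rough sketch. It encodes only the observations above (reproduced on Python 3.10 with pandas 1.4.2, not on 3.8, workaround of Python <=3.9, no longer reproduced after updating to pandas 1.4.3). The helper names `parse_version` and `likely_affected` are made up for this example, and since spyder-kernels was updated at the same time, the pandas cut-off is an assumption rather than a confirmed fix version.

```python
import sys


def parse_version(text):
    """Turn a version string like '1.4.2' into (1, 4, 2).

    Parsing stops at the first part that is not purely numeric,
    so '1.5.0rc1' becomes (1, 5).
    """
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def likely_affected(python_version, pandas_version):
    """Heuristic based only on the reports in this thread (hypothetical helper).

    python_version: a tuple such as sys.version_info or (3, 10)
    pandas_version: a string such as pandas.__version__
    """
    on_new_python = tuple(python_version[:2]) >= (3, 10)
    on_old_pandas = parse_version(pandas_version) < (1, 4, 3)
    return on_new_python and on_old_pandas


# Example with the versions mentioned in the thread.
print(likely_affected((3, 10), "1.4.2"))
print(likely_affected((3, 8), "1.4.2"))
print(likely_affected((3, 10), "1.4.3"))
```

In a live session the call would be something like `likely_affected(sys.version_info, pd.__version__)` after `import pandas as pd`.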
environment.yml file
Thanks for checking @ba-tno, closing then!
Pandas version checks
[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pandas.
[X] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
IMPORTANT: This only happens when debugging, on the line pd.read_parquet('test.parquet'). I am using Spyder on Anaconda. I can provide dependencies if necessary.
Expected Behavior
Read parquet file
Installed Versions