Closed by jangorecki 4 years ago
The issue is still valid; pasting the full error:
Traceback (most recent call last):
  File "./pandas/groupby-pandas.py", line 290, in <module>
    ans = x.groupby(['id1','id2','id3','id4','id5','id6']).agg({'v3':'sum', 'v1':'count'})
  File "/home/jan/git/db-benchmark/pandas/py-pandas/lib/python3.6/site-packages/pandas/core/groupby/generic.py", line 928, in aggregate
    result, how = self._aggregate(func, *args, **kwargs)
  File "/home/jan/git/db-benchmark/pandas/py-pandas/lib/python3.6/site-packages/pandas/core/base.py", line 419, in _aggregate
    result = _agg(arg, _agg_1dim)
  File "/home/jan/git/db-benchmark/pandas/py-pandas/lib/python3.6/site-packages/pandas/core/base.py", line 386, in _agg
    result[fname] = func(fname, agg_how)
  File "/home/jan/git/db-benchmark/pandas/py-pandas/lib/python3.6/site-packages/pandas/core/base.py", line 370, in _agg_1dim
    return colg.aggregate(how)
  File "/home/jan/git/db-benchmark/pandas/py-pandas/lib/python3.6/site-packages/pandas/core/groupby/generic.py", line 247, in aggregate
    return getattr(self, func)(*args, **kwargs)
  File "/home/jan/git/db-benchmark/pandas/py-pandas/lib/python3.6/site-packages/pandas/core/groupby/groupby.py", line 1371, in f
    return self._cython_agg_general(alias, alt=npfunc, **kwargs)
  File "/home/jan/git/db-benchmark/pandas/py-pandas/lib/python3.6/site-packages/pandas/core/groupby/groupby.py", line 909, in _cython_agg_general
    return self._wrap_aggregated_output(output)
  File "/home/jan/git/db-benchmark/pandas/py-pandas/lib/python3.6/site-packages/pandas/core/groupby/generic.py", line 386, in _wrap_aggregated_output
    return self._reindex_output(result)._convert(datetime=True)
  File "/home/jan/git/db-benchmark/pandas/py-pandas/lib/python3.6/site-packages/pandas/core/groupby/groupby.py", line 2483, in _reindex_output
    levels_list, names=self.grouper.names
  File "/home/jan/git/db-benchmark/pandas/py-pandas/lib/python3.6/site-packages/pandas/core/indexes/multi.py", line 552, in from_product
    codes = cartesian_product(codes)
  File "/home/jan/git/db-benchmark/pandas/py-pandas/lib/python3.6/site-packages/pandas/core/reshape/util.py", line 58, in cartesian_product
    for i, x in enumerate(X)
  File "/home/jan/git/db-benchmark/pandas/py-pandas/lib/python3.6/site-packages/pandas/core/reshape/util.py", line 58, in <listcomp>
    for i, x in enumerate(X)
  File "/home/jan/git/db-benchmark/pandas/py-pandas/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 445, in repeat
    return _wrapfunc(a, 'repeat', repeats, axis=axis)
  File "/home/jan/git/db-benchmark/pandas/py-pandas/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 51, in _wrapfunc
    return getattr(obj, method)(*args, **kwds)
MemoryError
This is not a recently introduced issue; it was only spotted now. q10 used to work on pandas 0.24.2 but now hits a MemoryError. It affects both the 1e7 and 1e8 data sizes. Reported upstream in https://github.com/pandas-dev/pandas/issues/32918
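For context on why the last frames fail: the traceback ends in `_reindex_output` -> `MultiIndex.from_product` -> `cartesian_product` -> `numpy.repeat`, so the result is being expanded to every combination of the six grouping levels rather than only the combinations present in the data. A rough back-of-the-envelope sketch of how fast that grows is below. The cardinalities, the helper names `cartesian_rows` and `codes_bytes`, and the 8-byte code width are all assumptions for illustration; none of them are taken from the benchmark or from pandas internals.

```python
from math import prod


def cartesian_rows(level_sizes):
    """Number of rows in the full Cartesian product of the grouping levels."""
    return prod(level_sizes)


def codes_bytes(level_sizes, itemsize=8):
    """Rough size of one integer code array per grouping key over the product.

    itemsize=8 is an assumed code width, not a value read from pandas.
    """
    return cartesian_rows(level_sizes) * len(level_sizes) * itemsize


# Hypothetical cardinalities for id1..id6 (NOT the benchmark's real values).
level_sizes = [100, 100, 1000, 100, 100, 1000]

rows = cartesian_rows(level_sizes)
print(f"rows in full product: {rows:.3e}")
print(f"approx. bytes for index codes alone: {codes_bytes(level_sizes):.3e}")
```

Even modest per-column cardinalities multiply into a product far larger than the input, which would be consistent with the failure appearing at both 1e7 and 1e8 rows. If the id columns are categorical (an assumption, the comment does not say), passing `observed=True` to `groupby` restricts the output to combinations that actually occur and may be worth trying.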