I am using cymetric to analyze the SQLite output of a fairly large Cyclus simulation (the database is about 400 MB). When I use some of the metrics, like 'Materials' and any that depend on it such as 'TransactionQuantity', I get a MemoryError: it cannot allocate memory for an array of the specified size. The amounts it reports failing to allocate range from 571 MiB to 3.91 GiB. I changed my system settings to allow memory overcommit, but that just causes the kernel to die instead of returning a MemoryError.
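For context, I am opening the database and evaluating the metric roughly like this (the filename is a placeholder; this follows the usage from the cymetric tutorial, as far as I understand it):

    import cymetric as cym

    # 'my_simulation.sqlite' is a placeholder for my actual output file
    db = cym.dbopen('my_simulation.sqlite')
    evaler = cym.Evaluator(db=db, write=False)
    materials = evaler.eval('Materials')  # this is the call that fails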
I am running 64-bit Python 3 on a 64-bit Ubuntu 18.04 system with 32 GB of memory.
The error seems to stem from the pd.merge or set_index operation in the 'Materials' metric:
---------------------------------------------------------------------------
MemoryError Traceback (most recent call last)
<ipython-input-4-91d85b0e404a> in <module>()
----> 1 evaler.eval('Materials')
/home/amandabachmann/.local/lib/python3.6/site-packages/cymetric/evaluator.py in eval(self, metric, conds)
58 frame = self.eval(dep, conds=conds)
59 frames.append(frame)
---> 60 raw = m(frames=frames, conds=conds, known_tables=self.known_tables)
61 if raw is None:
62 return raw
/home/amandabachmann/.local/lib/python3.6/site-packages/cymetric/metrics.py in __call__(self, frames, conds, known_tables, *args, **kwargs)
75 if self.name in known_tables:
76 return self.db.query(self.name, conds=conds)
---> 77 return f(*frames)
78
79 Cls.__name__ = str(name)
/home/amandabachmann/.local/lib/python3.6/site-packages/cymetric/metrics.py in materials(rsrcs, comps)
118 x = pd.merge(rsrcs, comps, on=['SimId', 'QualId'], how='inner')
119 x = x.set_index(['SimId', 'QualId', 'ResourceId', 'ObjId', 'TimeCreated',
--> 120 'NucId', 'Units'])
121 y = x['Quantity'] * x['MassFrac']
122 y.name = 'Mass'
/home/amandabachmann/anaconda3/envs/cyclus-env/lib/python3.6/site-packages/pandas/core/frame.py in set_index(self, keys, drop, append, inplace, verify_integrity)
4607
4608 # clear up memory usage
-> 4609 index._cleanup()
4610
4611 frame.index = index
/home/amandabachmann/anaconda3/envs/cyclus-env/lib/python3.6/site-packages/pandas/core/indexes/base.py in _cleanup(self)
546
547 def _cleanup(self):
--> 548 self._engine.clear_mapping()
549
550 @cache_readonly
pandas/_libs/properties.pyx in pandas._libs.properties.CachedProperty.__get__()
/home/amandabachmann/anaconda3/envs/cyclus-env/lib/python3.6/site-packages/pandas/core/indexes/multi.py in _engine(self)
1000 if lev_bits[0] > 64:
1001 # The levels would overflow a 64 bit uint - use Python integers:
-> 1002 return MultiIndexPyIntEngine(self.levels, self.codes, offsets)
1003 return MultiIndexUIntEngine(self.levels, self.codes, offsets)
1004
pandas/_libs/index.pyx in pandas._libs.index.BaseMultiIndexCodesEngine.__init__()
MemoryError: Unable to allocate 3.91 GiB for an array with shape (74887355, 7) and data type int64
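One workaround I have been experimenting with is to skip the metric's set_index call and compute the per-nuclide masses from the dependency frames directly (a sketch; I am assuming from metrics.py that 'Resources' and 'Compositions' are the two frames the 'Materials' metric merges):

    import pandas as pd

    # Evaluate only the dependency frames of 'Materials' (assumed to be
    # 'Resources' and 'Compositions', per the merge in metrics.py).
    rsrcs = evaler.eval('Resources')
    comps = evaler.eval('Compositions')

    # Same inner merge as the metric, but keep flat columns instead of
    # building the 7-level MultiIndex whose engine triggers the
    # (74887355, 7) int64 allocation shown in the traceback.
    x = pd.merge(rsrcs, comps, on=['SimId', 'QualId'], how='inner')
    x['Mass'] = x['Quantity'] * x['MassFrac']

This still has to hold the merged frame in memory, so it may not fit either; it just avoids the MultiIndex engine allocation shown above. Is there a better way to make the built-in metrics work on a database of this size?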