ikmckenz / target-pred-py

A simple machine learning model for small-molecule target prediction in Python.
GNU General Public License v3.0

Something wrong with memory #34

Open Zzz233 opened 1 year ago

Zzz233 commented 1 year ago

When I run python build_features.py, I get this error:

Building training features
Traceback (most recent call last):
  File "D:\ZZZ_dev\target-pred-py\src\features\build_features.py", line 113, in <module>
    Features().save_training_features(overwrite=True, parallel=5)
  File "D:\ZZZ_dev\target-pred-py\src\features\build_features.py", line 59, in save_training_features
    X, y, y_transform = self.build_training_features(parallel=parallel)
  File "D:\ZZZ_dev\target-pred-py\src\features\build_features.py", line 39, in build_training_features
    pool.map(self.get_numpy_fingerprint_from_smiles_series,
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\multiprocessing\pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\multiprocessing\pool.py", line 771, in get
    raise self._value
multiprocessing.pool.MaybeEncodingError: Error sending result: '[336054    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...
336055    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...
336056    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...
336057    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...
336058    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...
...
448067    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...
448068    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...
448069    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...
448070    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...
448071    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...
Name: canonical_smiles, Length: 112018, dtype: object]'. Reason: 'MemoryError()'

Process finished with exit code 1

ikmckenz commented 1 year ago

How much RAM does your system have? Unfortunately, most if not all of the operations in this repository are currently quite memory-heavy and will fail if you don't have enough system RAM. I would welcome PRs that refactor the code to process the data in chunks and reduce the memory requirements.
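
For anyone considering such a PR, here is a minimal sketch of what chunked processing could look like. It is not the repository's code: toy_fingerprint is a hypothetical stand-in so the example runs without RDKit, and N_BITS, fingerprint_chunk, iter_chunks, and build_features are names invented for illustration. The idea is that each worker returns a small uint8 block per chunk rather than one large Series of Python float lists, so each result sent back through the pool stays small and the final matrix uses one byte per bit.

```python
from multiprocessing import Pool

import numpy as np

N_BITS = 2048  # assumed fingerprint length; the repository's value may differ


def toy_fingerprint(smiles):
    """Hypothetical stand-in for the real fingerprint function (no RDKit needed)."""
    bits = np.zeros(N_BITS, dtype=np.uint8)
    for a, b in zip(smiles, smiles[1:]):
        # Deterministic toy hash of each character pair into a bit position.
        bits[(ord(a) * 31 + ord(b)) % N_BITS] = 1
    return bits


def fingerprint_chunk(chunk):
    """Return one compact (len(chunk), N_BITS) uint8 block for a list of SMILES."""
    return np.stack([toy_fingerprint(s) for s in chunk])


def iter_chunks(items, chunk_size):
    """Yield consecutive slices of at most chunk_size items."""
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


def _fill(out, blocks):
    """Copy each block into the preallocated matrix, in order."""
    row = 0
    for block in blocks:
        out[row:row + len(block)] = block
        row += len(block)


def build_features(smiles, chunk_size=10_000, parallel=1):
    """Build the feature matrix chunk by chunk instead of all at once."""
    out = np.zeros((len(smiles), N_BITS), dtype=np.uint8)
    chunks = iter_chunks(smiles, chunk_size)
    if parallel > 1:
        with Pool(parallel) as pool:
            # imap yields results in input order, one small block at a time.
            _fill(out, pool.imap(fingerprint_chunk, chunks))
    else:
        _fill(out, map(fingerprint_chunk, chunks))
    return out
```

Calling build_features(["CCO", "c1ccccc1"], chunk_size=1) returns a 2 x 2048 uint8 array. On Windows, a call with parallel greater than 1 should sit under an if __name__ == "__main__": guard, because worker processes are spawned by re-importing the script. If even the final matrix does not fit in RAM, the preallocated array could be replaced with a numpy.memmap backed by a file on disk.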