statisticalbiotechnology / triqler

Source and example code for triqler (TRansparent Identification-Quantification-linked Error Rates)
Apache License 2.0

Memory issue with protein PEP #15

Closed stharan closed 1 year ago

stharan commented 3 years ago

Hi,

I'm trying to process a dataset and am running into what seem to be memory issues while computing the protein PEPs. I've attached my input file and the Python output below. I've already tried increasing my paging file size (all the way to 256 GB), but it hasn't helped.

Best,

Tharan

HYE124_LFQ_4dFF_denoise_w_niceClusteringSegmented_forTriqler.txt

C:\ProgramData\Anaconda3\lib\site-packages\numpy\_distributor_init.py:32: UserWarning: loaded more than 1 DLL from .libs:
C:\ProgramData\Anaconda3\lib\site-packages\numpy\.libs\libopenblas.TXA6YQSD3GCQQC22GEQ54J2UDCXDXHWN.gfortran-win_amd64.dll
C:\ProgramData\Anaconda3\lib\site-packages\numpy\.libs\libopenblas.WCDJNK7YVMPZQ2ME2ZZHJJRJ3JIKNDB7.gfortran-win_amd64.dll
  stacklevel=1)
Triqler version 0.6.1
Copyright (c) 2018-2020 Matthew The. All rights reserved.
Written by Matthew The (matthew.the@scilifelab.se) in the School of Engineering Sciences in Chemistry, Biotechnology and Health at the Royal Institute of Technology in Stockholm.
Issued command: triqler.py HYE124_LFQ_4dFF_denoise_w_niceClusteringSegmented_forTriqler.txt
Parsing triqler input file
Reading row 0
Calculating identification PEPs
Identified 175155 PSMs at 1% FDR
Selecting best feature per run and spectrum
featureGroupIdx: 0
Dividing intensities by 10 for increased readability
Calculating peptide-level identification PEPs
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Warning: IRLS did not converge with maxIter = 50
Identified 46420 peptides at 1% FDR
Writing peptide quant rows to file: HYE124_LFQ_4dFF_denoise_w_niceClusteringSegmented_forTriqler.txt.pqr.tsv
Calculating protein-level identification PEPs
Identified 6694 proteins at 1% FDR
Fitting hyperparameters
params["muDetect"], params["sigmaDetect"] = 2.015933, 0.246916
params["muXIC"], params["sigmaXIC"] = 2.857317, 0.503958
params["muProtein"], params["sigmaProtein"] = -0.001369, 0.080588
params["muFeatureDiff"], params["sigmaFeatureDiff"] = -0.006158, 0.055469
params["shapeInGroupStdevs"], params["scaleInGroupStdevs"] = 1.004936, 0.020232
Minimum advisable --fold_change_eval: 0.67
Calculating protein posteriors
Process SpawnPoolWorker-4:
Process SpawnPoolWorker-18:
Process SpawnPoolWorker-34:
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\process.py", line 297, in _bootstrap
    self.run()
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\pool.py", line 110, in worker
    task = get()
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\queues.py", line 354, in get
    return _ForkingPickler.loads(res)
  File "C:\ProgramData\Anaconda3\lib\site-packages\triqler\pgm.py", line 8, in <module>
    from scipy.stats import f_oneway, gamma
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\__init__.py", line 379, in <module>
    from .stats import *
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\stats.py", line 182, in <module>
    from . import distributions
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\distributions.py", line 10, in <module>
    from ._distn_infrastructure import (entropy, rv_discrete, rv_continuous,
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\process.py", line 297, in _bootstrap
    self.run()
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 24, in <module>
    from scipy import optimize
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\__init__.py", line 391, in <module>
    from ._minimize import *
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\_minimize.py", line 35, in <module>
    from .cobyla import _minimize_cobyla
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\cobyla.py", line 17, in <module>
    from scipy.optimize import _cobyla
ImportError: DLL load failed: The paging file is too small for this operation to complete.
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\process.py", line 297, in _bootstrap
    self.run()
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\pool.py", line 110, in worker
    task = get()
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\queues.py", line 354, in get
    return _ForkingPickler.loads(res)
  File "C:\ProgramData\Anaconda3\lib\site-packages\triqler\pgm.py", line 8, in <module>
    from scipy.stats import f_oneway, gamma
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\__init__.py", line 379, in <module>
    from .stats import *
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\stats.py", line 180, in <module>
    import scipy.special as special
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\special\__init__.py", line 643, in <module>
    from .basic import *
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\special\basic.py", line 19, in <module>
    from . import orthogonal
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\special\orthogonal.py", line 83, in <module>
    from scipy import linalg
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\linalg\__init__.py", line 213, in <module>
    from ._sketches import *
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\linalg\_sketches.py", line 11, in <module>
    from scipy.sparse import csc_matrix
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\sparse\__init__.py", line 230, in <module>
    from .csr import *
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\sparse\csr.py", line 13, in <module>
    from ._sparsetools import (csr_tocsc, csr_tobsr, csr_count_blocks,
ImportError: DLL load failed: The paging file is too small for this operation to complete.
Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\threading.py", line 926, in _bootstrap_inner
    self.run()
  File "C:\ProgramData\Anaconda3\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\pool.py", line 412, in _handle_workers
    pool._maintain_pool()
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\pool.py", line 248, in _maintain_pool
    self._repopulate_pool()
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\pool.py", line 241, in _repopulate_pool
    w.start()
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\pool.py", line 535, in __reduce__
    'pool objects cannot be passed between processes or pickled'
NotImplementedError: pool objects cannot be passed between processes or pickled

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\pool.py", line 110, in worker
    task = get()
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\queues.py", line 354, in get
    return _ForkingPickler.loads(res)
  File "C:\ProgramData\Anaconda3\lib\site-packages\triqler\pgm.py", line 8, in <module>
    from scipy.stats import f_oneway, gamma
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\__init__.py", line 379, in <module>
    from .stats import *
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\stats.py", line 180, in <module>
    import scipy.special as special
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\special\__init__.py", line 643, in <module>
    from .basic import *
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\special\basic.py", line 19, in <module>
    from . import orthogonal
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\special\orthogonal.py", line 83, in <module>
    from scipy import linalg
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\linalg\__init__.py", line 213, in <module>
    from ._sketches import *
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\linalg\_sketches.py", line 11, in <module>
    from scipy.sparse import csc_matrix
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\sparse\__init__.py", line 232, in <module>
    from .lil import *
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\sparse\lil.py", line 20, in <module>
    from . import _csparsetools
ImportError: DLL load failed: The paging file is too small for this operation to complete.
Process SpawnPoolWorker-26:
Process SpawnPoolWorker-16:
Process SpawnPoolWorker-27:
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\process.py", line 297, in _bootstrap
    self.run()
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\pool.py", line 110, in worker
    task = get()
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\queues.py", line 354, in get
    return _ForkingPickler.loads(res)
  File "C:\ProgramData\Anaconda3\lib\site-packages\triqler\pgm.py", line 8, in <module>
    from scipy.stats import f_oneway, gamma
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\__init__.py", line 379, in <module>
    from .stats import *
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\stats.py", line 182, in <module>
    from . import distributions
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\distributions.py", line 10, in <module>
    from ._distn_infrastructure import (entropy, rv_discrete, rv_continuous,
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 24, in <module>
    from scipy import optimize
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\__init__.py", line 391, in <module>
    from ._minimize import *
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\_minimize.py", line 33, in <module>
    from .lbfgsb import _minimize_lbfgsb
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\lbfgsb.py", line 40, in <module>
    from . import _lbfgsb
Traceback (most recent call last):
ImportError: DLL load failed: The paging file is too small for this operation to complete.
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\process.py", line 297, in _bootstrap
    self.run()
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\pool.py", line 110, in worker
    task = get()
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\queues.py", line 354, in get
    return _ForkingPickler.loads(res)
  File "C:\ProgramData\Anaconda3\lib\site-packages\triqler\pgm.py", line 8, in <module>
    from scipy.stats import f_oneway, gamma
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\__init__.py", line 379, in <module>
    from .stats import *
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\stats.py", line 182, in <module>
    from . import distributions
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\distributions.py", line 10, in <module>
    from ._distn_infrastructure import (entropy, rv_discrete, rv_continuous,
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 27, in <module>
    from scipy import integrate
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\integrate\__init__.py", line 92, in <module>
    from ._ode import *
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\integrate\_ode.py", line 93, in <module>
    from . import _dop
ImportError: DLL load failed: The paging file is too small for this operation to complete.
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\process.py", line 297, in _bootstrap
    self.run()
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\pool.py", line 110, in worker
    task = get()
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\queues.py", line 354, in get
    return _ForkingPickler.loads(res)
  File "C:\ProgramData\Anaconda3\lib\site-packages\triqler\pgm.py", line 8, in <module>
    from scipy.stats import f_oneway, gamma
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\__init__.py", line 379, in <module>
    from .stats import *
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\stats.py", line 182, in <module>
    from . import distributions
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\distributions.py", line 10, in <module>
    from ._distn_infrastructure import (entropy, rv_discrete, rv_continuous,
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 24, in <module>
    from scipy import optimize
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\__init__.py", line 391, in <module>
    from ._minimize import *
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\_minimize.py", line 33, in <module>
    from .lbfgsb import _minimize_lbfgsb
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\lbfgsb.py", line 40, in <module>
    from . import _lbfgsb
ImportError: DLL load failed: The paging file is too small for this operation to complete.
Warning: failed to converge for protein P07395|SYFB_ECOLI
Warning: failed to converge for protein P0A6Z3|HTPG_ECOLI
Warning: failed to converge for protein P31120|GLMM_ECOLI
Warning: failed to converge for protein P38737|ECM29_YEAST
Warning: failed to converge for protein P38810|SFB3_YEAST
Warning: failed to converge for protein P37330|MASZ_ECOLI
Warning: failed to converge for protein P42588|PAT_ECOLI
Warning: failed to converge for protein P38821|DNPEP_YEAST
Warning: failed to converge for protein P0A705|IF2_ECOLI
Warning: failed to converge for protein P0A9M8|PTA_ECOLI
Warning: failed to converge for protein P22855|MAN1_YEAST
Warning: failed to converge for protein P32337|IMB3_YEAST
Warning: failed to converge for protein P63284|CLPB_ECOLI
Warning: failed to converge for protein P38903|2A5D_YEAST
Warning: failed to converge for protein Q04660|ERB1_YEAST
Warning: failed to converge for protein Q99287|SEY1_YEAST
Warning: failed to converge for protein P36037|DOA1_YEAST
Warning: failed to converge for protein Q03280|TOM1_YEAST
Warning: failed to converge for protein P0DTT0|BIPA_ECOLI
Warning: failed to converge for protein P27298|OPDA_ECOLI
Warning: failed to converge for protein P23721|SERC_ECOLI
Warning: failed to converge for protein P22213|SLY1_YEAST
Warning: failed to converge for protein P36041|EAP1_YEAST
Warning: failed to converge for protein Q99257|MEX67_YEAST
Warning: failed to converge for protein P30771|NAM7_YEAST
Warning: failed to converge for protein Q05785|ENT2_YEAST
Warning: failed to converge for protein P00431|CCPR_YEAST
Warning: failed to converge for protein P0A9M0|LON_ECOLI
Warning: failed to converge for protein P36016|LHS1_YEAST
Warning: failed to converge for protein P33221|PURT_ECOLI
Warning: failed to converge for protein P32769|HBS1_YEAST
Warning: failed to converge for protein P35207|SKI2_YEAST
Warning: failed to converge for protein P20485|KICH_YEAST
Warning: failed to converge for protein P23883|PUUC_ECOLI
Warning: failed to converge for protein P06610|BTUE_ECOLI
Warning: failed to converge for protein P0A9W3|ETTA_ECOLI
Warning: failed to converge for protein P0A8F0|UPP_ECOLI
Warning: failed to converge for protein P38929|ATC2_YEAST
Warning: failed to converge for protein P09551|ARGT_ECOLI
Warning: failed to converge for protein Q05029|BCH1_YEAST
Warning: failed to converge for protein P09032|EI2BG_YEAST
Warning: failed to converge for protein P76621|GLAH_ECOLI
Warning: failed to converge for protein P0A6Z1|HSCA_ECOLI
Warning: failed to converge for protein P77804|YDGA_ECOLI
Warning: failed to converge for protein P0AEG4|DSBA_ECOLI
Warning: failed to converge for protein P00959|SYM_ECOLI
Warning: failed to converge for protein P52489|KPYK2_YEAST
Warning: failed to converge for protein P39729|RBG1_YEAST
Warning: failed to converge for protein P39451|ADHP_ECOLI
Warning: failed to converge for protein P40506|PPCS_YEAST
Warning: failed to converge for protein Q86ZR7|YKD3A_YEAST
Warning: failed to converge for protein P50086|PSD10_YEAST
Warning: failed to converge for protein P0AF52|GHXP_ECOLI
Warning: failed to converge for protein P39744|NOC2_YEAST
Warning: failed to converge for protein Q05518|PAL1_YEAST
Warning: failed to converge for protein P32474|EUG1_YEAST
Warning: failed to converge for protein Q08965|BMS1_YEAST
Warning: failed to converge for protein P00864|CAPP_ECOLI
Warning: failed to converge for protein P17442|PHO81_YEAST
Warning: failed to converge for protein P00490|PHSM_ECOLI
Warning: failed to converge for protein P42945|UTP10_YEAST
Warning: failed to converge for protein P53301|CRH1_YEAST
Warning: failed to converge for protein P0A6W5|GREA_ECOLI
Warning: failed to converge for protein P25582|SPB1_YEAST
Warning: failed to converge for protein P61949|FLAV_ECOLI
Warning: failed to converge for protein P43558|OTU1_YEAST
Warning: failed to converge for protein P0AD33|YFCZ_ECOLI
Warning: failed to converge for protein P23869|PPIB_ECOLI
Warning: failed to converge for protein P53920|NM111_YEAST
Warning: failed to converge for protein Q99312|ISN1_YEAST
Warning: failed to converge for protein P27250|AHR_ECOLI
Warning: failed to converge for protein P45577|PROQ_ECOLI
Warning: failed to converge for protein P00954|SYW_ECOLI
Warning: failed to converge for protein P33317|DUT_YEAST
Warning: failed to converge for protein Q12515|PAR32_YEAST
Warning: failed to converge for protein Q12000|TMA46_YEAST
Warning: failed to converge for protein P0ACY1|YDJA_ECOLI

MatthewThe commented 3 years ago

Dear Tharan,

Thanks for reporting this issue. My suspicion is that Triqler tries to use too many cores. Could you try setting --num_threads 8 or, if that doesn't work, even --num_threads 1?
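
For anyone scripting this, the suggestion above could be wrapped in a small Python helper that builds the command line with an explicit thread count. This is only a sketch: the helper name `build_triqler_command` is hypothetical, the input file name is a placeholder, and launching Triqler with `python -m triqler` is an assumption (the log in this thread shows it being started as `triqler.py`). Only the `--num_threads` option comes from the thread itself.

```python
import sys


def build_triqler_command(input_file, num_threads=8):
    """Return an argument list that runs Triqler with a fixed thread count.

    Hypothetical helper for illustration; it only assembles the command.
    """
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    # Assumption: Triqler can be started as a module via `python -m triqler`.
    return [
        sys.executable, "-m", "triqler",
        "--num_threads", str(num_threads),  # option suggested in this thread
        input_file,
    ]


# Placeholder input file name, not the file attached to this issue.
command = build_triqler_command("my_triqler_input.tsv", num_threads=8)
```

The resulting list can be passed to `subprocess.run(command, check=True)`. Starting at 8 and dropping to 1 only if the error persists follows the order suggested above.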

stharan commented 3 years ago

Thank you, Matthew. That worked. Both 8 and even 16 threads were fine on my workstation.

Best,

Tharan
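
One plausible reading of the traceback is that every `SpawnPoolWorker` re-imports scipy and loads its DLLs when it starts, so a pool sized to all cores multiplies that cost. The sketch below shows the general technique of capping a `multiprocessing` pool. It is not Triqler's actual code: `capped_pool_size`, `run_with_capped_pool`, and the default limit of 8 are assumptions made for illustration.

```python
import multiprocessing as mp
import os


def capped_pool_size(requested=None, limit=8):
    """Pick a worker count no larger than the CPU count or the given limit."""
    available = os.cpu_count() or 1
    wanted = available if requested is None else requested
    # Always keep at least one worker, never more than the limit.
    return max(1, min(wanted, available, limit))


def run_with_capped_pool(func, items, requested=None, limit=8):
    """Map func over items using a pool whose size is capped.

    The "spawn" start method is the one Windows uses: each worker is a
    fresh interpreter that imports the needed modules again.
    """
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=capped_pool_size(requested, limit)) as pool:
        return pool.map(func, items)
```

With the spawn start method, `run_with_capped_pool` has to be called from under an `if __name__ == "__main__":` guard, and `func` must be a module-level function so that it can be pickled.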



-- Tharan Srikumar