jlevy44 / Joshua-Levy-Synteny-Analysis

Integrated Analysis of Synteny (Multiple Synteny Comparison for Multiple Alignment, Circos Outputs, Allmaps Genome Reconstructions), Find Ghost Genes, Identify and Analyze CNS Elements from Multiple Alignment Files and More!
MIT License

octoploid fail due to memory #2

Open sgordon007 opened 7 years ago

sgordon007 commented 7 years ago

[7e/4f92f5] Submitted process > genClusterMatrix_kmerPrevalence (1)
FAN_r11_split.kcount.fa
[93/956254] Submitted process > transform_main (2)
[df/56161c] Submitted process > transform_main (1)
ERROR ~ Error executing process > 'transform_main (1)'

Caused by:
  Process transform_main (1) terminated with an error exit status (1)

Command executed:

#!/bin/bash

cd /global/projectb/scratch/sgordon/polysCRACKER/run_dirs/OctoStrawberry.250kb.k23
python subgenomeClusteringInterface.py transform_main 1 ./recluster_files/ factor 2 cosine

Command exit status: 1

Command output: (empty)


Command error:
Traceback (most recent call last):
  File "subgenomeClusteringInterface.py", line 1599, in <module>
    main()
  File "subgenomeClusteringInterface.py", line 1597, in main
    optionsfunct
  File "subgenomeClusteringInterface.py", line 365, in transform_main
    transform_plot((main,reclusterFolder,model,n_subgenomes,metric))
  File "subgenomeClusteringInterface.py", line 445, in transform_plot
    data = KernelPCA(n_components=499).fit_transform(data)
  File "/global/homes/s/sgordon/.conda/envs/polysCRACKER090617/lib/python2.7/site-packages/sklearn/decomposition/kernel_pca.py", line 260, in fit_transform
    self.fit(X, **params)
  File "/global/homes/s/sgordon/.conda/envs/polysCRACKER090617/lib/python2.7/site-packages/sklearn/decomposition/kernel_pca.py", line 236, in fit
    K = self._get_kernel(X)
  File "/global/homes/s/sgordon/.conda/envs/polysCRACKER090617/lib/python2.7/site-packages/sklearn/decomposition/kernel_pca.py", line 164, in _get_kernel
    **params)
  File "/global/homes/s/sgordon/.conda/envs/polysCRACKER090617/lib/python2.7/site-packages/sklearn/metrics/pairwise.py", line 1399, in pairwise_kernels
    return _parallel_pairwise(X, Y, func, n_jobs, **kwds)
  File "/global/homes/s/sgordon/.conda/envs/polysCRACKER090617/lib/python2.7/site-packages/sklearn/metrics/pairwise.py", line 1083, in _parallel_pairwise
    return func(X, Y, **kwds)
  File "/global/homes/s/sgordon/.conda/envs/polysCRACKER090617/lib/python2.7/site-packages/sklearn/metrics/pairwise.py", line 735, in linear_kernel
    return safe_sparse_dot(X, Y.T, dense_output=True)
  File "/global/homes/s/sgordon/.conda/envs/polysCRACKER090617/lib/python2.7/site-packages/sklearn/utils/extmath.py", line 186, in safe_sparse_dot
    ret = ret.toarray()
  File "/global/homes/s/sgordon/.conda/envs/polysCRACKER090617/lib/python2.7/site-packages/scipy/sparse/compressed.py", line 964, in toarray
    return self.tocoo(copy=False).toarray(order=order, out=out)
  File "/global/homes/s/sgordon/.conda/envs/polysCRACKER090617/lib/python2.7/site-packages/scipy/sparse/coo.py", line 252, in toarray
    B = self._process_toarray_args(order, out)
  File "/global/homes/s/sgordon/.conda/envs/polysCRACKER090617/lib/python2.7/site-packages/scipy/sparse/base.py", line 1039, in _process_toarray_args
    return np.zeros(self.shape, dtype=self.dtype, order=order)
MemoryError
.command.run.1: line 99: 23466 Terminated nxf_trace "$pid" .command.trace

Work dir: /global/projectb/scratch/sgordon/polysCRACKER/run_dirs/OctoStrawberry.250kb.k23/work/df/56161cf1e10d9bb0f217bf78cc03e7

Tip: view the complete command output by changing to the process work dir and entering the command cat .command.out

-- Check '.nextflow.log' file for details
WARN: Killing pending tasks (1)
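
For context on the MemoryError: the traceback ends in np.zeros(self.shape, ...) called from linear_kernel, which is where KernelPCA materializes its kernel matrix as a dense array. A rough sketch of how that allocation scales, assuming a square n_samples x n_samples matrix at 8 bytes per entry (float64). The row counts below are hypothetical and are not taken from this run:

```python
def dense_kernel_bytes(n_samples, bytes_per_entry=8):
    """Bytes needed for a dense n_samples x n_samples kernel matrix.

    bytes_per_entry=8 assumes float64; the dtype used in the actual run
    is not shown in the log, so this is an assumption.
    """
    return n_samples * n_samples * bytes_per_entry


# Hypothetical row counts (e.g. number of genome windows), for illustration only.
for n in (10000, 50000, 100000):
    gib = dense_kernel_bytes(n) / 2.0 ** 30
    print("%d rows -> %.1f GiB" % (n, gib))
```

The cost grows with the square of the number of rows, so a tenfold increase in rows needs a hundredfold more memory for this one array.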

jlevy44 commented 7 years ago

It looks like factor analysis (the "factor" option) should not be run on this genome: either the genome is too large or it has too many scaffolds. Try running just kpca instead. I may be able to fix this with dimensionality reduction techniques from outside scikit-learn, but those should be implemented post-v1.
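
As an illustration only, and not the fix described above: one way to reduce dimensionality without building a dense n_samples x n_samples matrix is scikit-learn's TruncatedSVD, which operates directly on a scipy sparse matrix. This is a swapped-in technique; it does not center the data, so its output is not identical to KernelPCA with a linear kernel. The matrix X, its shape and density, and n_components=50 are hypothetical stand-ins for the real data.

```python
import numpy as np
from scipy import sparse
from sklearn.decomposition import TruncatedSVD

# Hypothetical stand-in for a sparse (windows x k-mers) matrix.
rng = np.random.RandomState(0)
X = sparse.random(2000, 5000, density=0.01, format="csr", random_state=rng)

# TruncatedSVD accepts sparse input and never forms the dense
# n_samples x n_samples product that caused the MemoryError.
svd = TruncatedSVD(n_components=50, algorithm="randomized", random_state=0)
reduced = svd.fit_transform(X)

print(reduced.shape)
```

The reduced array is dense but only n_samples x n_components in size, so memory grows linearly with the number of rows.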