Zymo-Research / figaro

An efficient and objective tool for optimizing microbiome rRNA gene trimming parameters
GNU General Public License v3.0

Figaro crashing #20

Closed jarrodscott closed 3 years ago

jarrodscott commented 4 years ago

Hi there!

I am having trouble running figaro and working out what the problem is. I searched Stack Overflow but didn't find anything useful, and my knowledge of Python is limited :) Thanks!

I am running on 20 threads with 600GB of memory. Should be plenty?

Here is the error.

+ Sun Oct 18 18:17:06 EDT 2020 job job_01_figaro started in sThC.q with jobID=17945402 on compute-81-24
+ NSLOTS = 20
+ Sun Oct 18 18:20:50 EDT 2020 job job_01_figaro started in sThM.q with jobID=17945405 on compute-75-02
+ NSLOTS = 20
+ Sun Oct 18 18:22:05 EDT 2020 job job_01_figaro started in sThM.q with jobID=17945407 on compute-00-01
+ NSLOTS = 20
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/miniconda3/envs/figaro/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/miniconda3/envs/figaro/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/miniconda3/envs/figaro/figaro/figaroSupport/expectedErrorCurve.py", line 93, in calculateAverageExpectedError
    percentileExpectedError = makeExpectedErrorPercentileArrayForFastq(fastq.filePath, self.subsample, self.percentile, self.primerLength)
  File "/miniconda3/envs/figaro/figaro/figaroSupport/expectedErrorCurve.py", line 105, in makeExpectedErrorPercentileArrayForFastq
    expectedErrorMatrix = fastqAnalysis.buildExpectedErrorMatrix(path, subsample=subsample, leftTrim=primerLength)
  File "/miniconda3/envs/figaro/figaro/figaroSupport/fastqAnalysis.py", line 33, in buildExpectedErrorMatrix
    return numpy.array(expectedErrorMatrix, dataType, order='F')
ValueError: setting an array element with a sequence.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "figaro.py", line 213, in <module>
    resultTable, forwardCurve, reverseCurve = figaroSupport.trimParameterPrediction.performAnalysisLite(parameters.inputDirectory.value, parameters.minimumCombinedReadLength.value, subsample =  parameters.subsample.value, percentile = parameters.percentile.value, forwardPrimerLength=parameters.forwardPrimerLength.value, reversePrimerLength=parameters.reversePrimerLength.value, namingStandardAlias=fileNamingStandard)
  File "/miniconda3/envs/figaro/figaro/figaroSupport/trimParameterPrediction.py", line 439, in performAnalysisLite
    forwardCurve, reverseCurve = expectedErrorCurve.calculateExpectedErrorCurvesForFastqList(fastqList, subsample=subsample, percentile=percentile, makePNG=makeExpectedErrorPlots, forwardPrimerLength=forwardPrimerLength, reversePrimerLength=reversePrimerLength)
  File "/miniconda3/envs/figaro/figaro/figaroSupport/expectedErrorCurve.py", line 168, in calculateExpectedErrorCurvesForFastqList
    forwardExpectedErrorArray = makeExpectedErrorPercentileArrayForFastqList(forwardFastqs, subsample, percentile, forwardPrimerLength)
  File "/miniconda3/envs/figaro/figaro/figaroSupport/expectedErrorCurve.py", line 115, in makeExpectedErrorPercentileArrayForFastqList
    expectedErrorReturns = easyMultiprocessing.parallelProcessRunner(parallelAgent.calculateAverageExpectedError, fastqList)
  File "/miniconda3/envs/figaro/figaro/figaroSupport/easyMultiprocessing.py", line 68, in parallelProcessRunner
    return mapper(processor, itemsToProcess, chunkSize)
  File "/miniconda3/envs/figaro/lib/python3.6/multiprocessing/pool.py", line 288, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/miniconda3/envs/figaro/lib/python3.6/multiprocessing/pool.py", line 670, in get
    raise self._value
ValueError: setting an array element with a sequence.
= Sun Oct 18 18:22:09 EDT 2020 job job_01_figaro don

And here is my install info:

> python --version
Python 3.6.7

> pip list 
Package         Version
--------------- -------------------
certifi         2020.6.20
cycler          0.10.0
kiwisolver      1.2.0
matplotlib      3.0.2
numpy           1.13.1
pip             20.2.4
pyparsing       2.4.7
python-dateutil 2.8.1
scipy           1.2.1
setuptools      49.6.0.post20201009
six             1.15.0
wheel           0.35.1
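
A note on the error message itself: NumPy commonly raises `ValueError: setting an array element with a sequence.` when `numpy.array` is given nested sequences of unequal length together with a numeric data type. One possible cause here, not confirmed anywhere in this thread, is that the reads in a FASTQ file are not all the same length. The sketch below is a pure-Python check for that condition. The function name `find_ragged_rows` and the sample `quality_rows` data are hypothetical and are not part of figaro.

```python
from collections import Counter


def find_ragged_rows(rows):
    """Return (expected_length, outliers) for a list of sequences.

    expected_length is the most common row length; outliers is a list of
    (row_index, row_length) pairs for rows that differ from it.
    Hypothetical helper for illustration, not part of figaro.
    """
    if not rows:
        return 0, []
    lengths = [len(row) for row in rows]
    expected = Counter(lengths).most_common(1)[0][0]
    outliers = [(i, n) for i, n in enumerate(lengths) if n != expected]
    return expected, outliers


# Made-up per-read quality scores; the last read is one base shorter.
quality_rows = [
    [30, 30, 28, 25],
    [31, 29, 27, 20],
    [30, 30, 28],
]
expected, outliers = find_ragged_rows(quality_rows)
print(expected, outliers)
```

If a check like this reports outliers for real input, the reads have mixed lengths, which would be consistent with the traceback above but does not prove it is the cause.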

michael-weinstein commented 4 years ago

Have you tried running it in its Docker container, or are you able to?

jarrodscott commented 4 years ago

I am running on a server and don't use Docker, but I will try it now. Thanks for the quick response.

michael-weinstein commented 4 years ago

Definitely. Let me know how it goes. One condition I don't think I tested is having more cores than files, but it should be ok.

Out of curiosity, what are you studying?
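
On the more-cores-than-files case mentioned above: one generic way to sidestep it is to cap the worker count at the number of input files. This is only a sketch of that idea. The helper `choose_worker_count` is hypothetical, and nothing in this thread says figaro does or needs this.

```python
def choose_worker_count(available_cores, file_count):
    """Cap the pool size at the number of files, never going below one.

    Hypothetical helper for illustration, not part of figaro.
    """
    return max(1, min(available_cores, file_count))


# 20 cores (as in the job log above) but only 8 input files (made-up number).
print(choose_worker_count(20, 8))
```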

jarrodscott commented 4 years ago

This study is on marine sediment microbes. I will definitely let you know. I need to ask the cluster manager about setting up Docker on the server. I have been unsuccessful so far :) Thanks for your help!

michael-weinstein commented 4 years ago

Interesting. Let me know if you have trouble getting Docker. Your admin may prefer that you use Singularity, as it has better security on shared systems.

Are you looking for sulfur-eaters or other lithotrophs?
