cdanielmachado / carveme

CarveMe: genome-scale metabolic model reconstruction

string index out of range error when running carveme in recursive mode #111

Closed shumantov closed 3 years ago

shumantov commented 3 years ago

Hi,

I'm trying to create metabolic models for all high-quality genomes (MAGs) recovered from a metagenome, and CarveMe seems like the right tool for this. Gene prediction was done with Prokka, as part of the metaWRAP pipeline. The run fails, and I wonder if you could help me figure out why.

I'm using:

python 3.6.12
diamond 0.8.36
cplex 12.8.0.0

The error I get:

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/data/bin/miniconda2/envs/carveme-v1.0/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/data/bin/miniconda2/envs/carveme-v1.0/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/data/bin/miniconda2/envs/carveme-v1.0/bin/carve", line 391, in f
    recursive_mode=True
  File "/data/bin/miniconda2/envs/carveme-v1.0/bin/carve", line 49, in main
    model_id = build_model_id(model_id)
  File "/data/bin/miniconda2/envs/carveme-v1.0/bin/carve", line 24, in build_model_id
    if not model_id[0].isalpha():
IndexError: string index out of range
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/data/bin/miniconda2/envs/carveme-v1.0/bin/carve", line 395, in <module>
    p.map(f, args.input)
  File "/data/bin/miniconda2/envs/carveme-v1.0/lib/python3.6/multiprocessing/pool.py", line 266, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/data/bin/miniconda2/envs/carveme-v1.0/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
IndexError: string index out of range
srun: error: slurm-n5: task 0: Exited with exit code 1

The command I use:

carve -r /home/ARO.local/ginatta/Projects/metawarp/high_quality_bins/ --fbc -o high_quality_models

Thank you in advance, Alon

cdanielmachado commented 3 years ago

Hi Alon, you must specify the files to process, not the name of the folder. For instance:

carve -r /home/ARO.local/ginatta/Projects/metawarp/high_quality_bins/*.faa --fbc -o high_quality_models

Let me know if it works.
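As a rough sketch of why the corrected command behaves differently: the shell expands the `*.faa` glob into one argument per file before `carve` starts, so recursive mode receives a list of files rather than a single folder path. The helper name `list_faa_files` below is hypothetical, and the `carve` call is left as a comment so the snippet runs without CarveMe installed.

```shell
#!/bin/sh
# Hypothetical helper: print every .faa file directly inside a directory,
# one per line. Returns non-zero if the directory contains none.
list_faa_files() {
    dir=${1%/}                      # drop one trailing slash, if present
    found=1
    for f in "$dir"/*.faa; do
        [ -e "$f" ] || continue     # an unmatched glob stays literal; skip it
        printf '%s\n' "$f"
        found=0
    done
    return "$found"
}

# Usage (assumes carve is installed and $bins points at the folder of bins):
# if list_faa_files "$bins" >/dev/null; then
#     carve -r "$bins"/*.faa --fbc -o high_quality_models
# fi
```

Checking that the glob matches something first avoids handing `carve` a literal `*.faa` string when the folder is empty or the extension differs.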

shumantov commented 3 years ago

Hi Daniel, Thank you for your comment :)

So I did what you said and got rid of the 'out of range' error, but ran into something else that I hope you can help me with: out of 156 genomes, only 4 models were created, and the following output was generated for the remaining 152 runs:

Failed to run diamond.
Unable to run diamond (make sure diamond is available in your PATH).
Unable to create output folder: /home/ARO.local/ginatta/Projects/metawarp/high_quality_models
[the three messages above repeat, interleaved, for the remaining runs]

I also got this error numerous times:

Error: Failed to create thread.

This might be related to the server on which I run CarveMe rather than to CarveMe itself, though I'm not sure.

Best, Alon
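A quick way to narrow this down is to check, on the compute node itself, whether `diamond` is visible on the PATH and what process and thread limits apply to the job. The sketch below is only a diagnostic aid under the assumption that "Failed to create thread" comes from hitting such a limit, which the thread does not confirm. The function names `on_path` and `report_limits` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named program is found on PATH.
on_path() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical report of settings that may matter when many diamond
# processes start at once. Values print as "unknown" when the shell or
# system does not expose them.
report_limits() {
    if on_path diamond; then
        echo "diamond: $(command -v diamond)"
    else
        echo "diamond: not found on PATH"
    fi
    echo "max user processes (ulimit -u): $(ulimit -u 2>/dev/null || echo unknown)"
    echo "online CPUs: $(getconf _NPROCESSORS_ONLN 2>/dev/null || echo unknown)"
}
```

Running `report_limits` inside the same `srun` or batch job as `carve` shows the environment the failing processes actually see, which can differ from the login node.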

cdanielmachado commented 3 years ago

Hi Alon,

Recursive mode is not a good choice on a cluster. I may even remove this feature in future releases.

The correct approach is to use job arrays, i.e., the cluster itself launches and manages the parallel processes. I think most cluster environments (slurm, qsub, bsub) support submission of job arrays.

Here is the documentation for slurm, but it will be similar in other environments:

https://slurm.schedmd.com/job_array.html
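A minimal sketch of what such a job array could look like under Slurm, with one genome per array task. Several things here are assumptions rather than anything stated in this thread: the file `inputs.txt` (one `.faa` path per line), the throttle of 10 concurrent tasks, the helper names `nth_input` and `model_path`, and the choice to write one `.xml` file per genome. The `carve` flags are the ones used earlier in the thread.

```shell
#!/bin/bash
#SBATCH --job-name=carveme
#SBATCH --array=1-156%10        # 156 genomes; at most 10 tasks at once (assumed limit)

# Assumed layout: inputs.txt lists one .faa path per line, e.g. created with
#   ls /path/to/bins/*.faa > inputs.txt
INPUT_LIST=${INPUT_LIST:-inputs.txt}
OUT_DIR=${OUT_DIR:-high_quality_models}

# Print line number $2 (1-based) of the list file $1.
nth_input() {
    sed -n "${2}p" "$1"
}

# Derive the output file from the input name: bin.12.faa -> $OUT_DIR/bin.12.xml
model_path() {
    printf '%s/%s.xml\n' "$OUT_DIR" "$(basename "$1" .faa)"
}

# Run only inside a Slurm array task, where SLURM_ARRAY_TASK_ID is set.
if [ -n "${SLURM_ARRAY_TASK_ID:-}" ]; then
    faa=$(nth_input "$INPUT_LIST" "$SLURM_ARRAY_TASK_ID")
    mkdir -p "$OUT_DIR"         # safe even if several tasks create it at once
    carve "$faa" --fbc -o "$(model_path "$faa")"
fi
```

Submitted with `sbatch`, each task processes a single genome, so the scheduler rather than CarveMe decides how many run in parallel.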