AnantharamanLab / METABOLIC

A scalable high-throughput metabolic and biogeochemical functional trait profiler

encounter the BlockingIOError #144

Open heyinghui22 opened 1 year ago

heyinghui22 commented 1 year ago

Hello, when I run METABOLIC-C I encounter an error. I ran the code on Linux (system: CentOS). Here is the command I ran:

perl /mnt/data/software/miniconda3/envs/metabolic/METABOLIC_running_folder/METABOLIC/METABOLIC-C.pl -in-gn /mnt/data1/project/HYHseason230215/19_metabolic/1_test/1_mag_fa \
  -r /mnt/data1/project/HYHseason230215/19_metabolic/1_test/1_MG -t 80 -o /mnt/data1/project/HYHseason230215/19_metabolic/1_test/2_METABOLIC-C_output

My clean data is in this folder: /mnt/data1/project/HYHseason230215/19_metabolic/1_test/1_MG

My MAGs are in this folder: /mnt/data1/project/HYHseason230215/19_metabolic/1_test/1_mag_fa

Here is the error I encountered:

[2023-05-11 00:00:27] INFO: Identifying markers in 26 genomes with 80 threads.
[2023-05-11 00:00:27] TASK: Running Prodigal V2.6.3 to identify genes.
[2023-05-11 00:00:28] INFO: Completed 26 genomes in 0.84 seconds (31.14 genomes/second).
[2023-05-11 00:00:28] WARNING: Prodigal skipped 26 genomes due to pre-existing data, see warnings.log
[2023-05-11 00:00:28] TASK: Identifying TIGRFAM protein families.
[2023-05-11 00:00:28] INFO: Completed 26 genomes in 0.74 seconds (35.29 genomes/second).
[2023-05-11 00:00:28] WARNING: TIGRFAM skipped 26 genomes due to pre-existing data, see warnings.log
[2023-05-11 00:00:28] TASK: Identifying Pfam protein families.
[2023-05-11 00:00:29] INFO: Completed 26 genomes in 0.56 seconds (46.52 genomes/second).
[2023-05-11 00:00:29] WARNING: Pfam skipped 26 genomes due to pre-existing data, see warnings.log
[2023-05-11 00:00:29] INFO: Annotations done using HMMER 3.1b2 (February 2015).
[2023-05-11 00:00:29] TASK: Summarising identified marker genes.
[2023-05-11 00:00:33] INFO: Completed 26 genomes in 4.24 seconds (6.14 genomes/second).
[2023-05-11 00:00:33] INFO: Done.
[2023-05-11 00:00:34] INFO: Aligning markers in 420 genomes with 80 CPUs.
[2023-05-11 00:00:34] INFO: Processing 25 genomes identified as bacterial.
[2023-05-11 00:01:11] INFO: Read concatenated alignment for 45,555 GTDB genomes.
[2023-05-11 00:01:11] TASK: Generating concatenated alignment for each marker.
[2023-05-11 00:01:22] ERROR: Uncontrolled exit resulting from an unexpected error.

================================================================================
EXCEPTION: BlockingIOError
MESSAGE: [Errno 11] Resource temporarily unavailable
Traceback (most recent call last):
  File "/mnt/data/software/miniconda3/envs/metabolic/lib/pypy3.8/site-packages/gtdbtk/main.py", line 95, in main
    gt_parser.parse_options(args)
  File "/mnt/data/software/miniconda3/envs/metabolic/lib/pypy3.8/site-packages/gtdbtk/main.py", line 750, in parse_options
    self.align(options)
  File "/mnt/data/software/miniconda3/envs/metabolic/lib/pypy3.8/site-packages/gtdbtk/main.py", line 293, in align
    markers.align(options.identify_dir,
  File "/mnt/data/software/miniconda3/envs/metabolic/lib/pypy3.8/site-packages/gtdbtk/markers.py", line 520, in align
    user_msa = align.align_marker_set(cur_genome_files, marker_info_file, copy_number_f, self.cpus)
  File "/mnt/data/software/miniconda3/envs/metabolic/lib/pypy3.8/site-packages/gtdbtk/pipeline/align.py", line 219, in align_marker_set
    single_copy_hits = get_single_copy_hits(gid_dict, copy_number_file, cpus)
  File "/mnt/data/software/miniconda3/envs/metabolic/lib/pypy3.8/site-packages/gtdbtk/pipeline/align.py", line 78, in get_single_copy_hits
    with mp.get_context('spawn').Pool(processes=cpus) as pool:
  File "/mnt/data/software/miniconda3/envs/metabolic/lib/pypy3.8/multiprocessing/context.py", line 119, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild,
  File "/mnt/data/software/miniconda3/envs/metabolic/lib/pypy3.8/multiprocessing/pool.py", line 212, in __init__
    self._repopulate_pool()
  File "/mnt/data/software/miniconda3/envs/metabolic/lib/pypy3.8/multiprocessing/pool.py", line 303, in _repopulate_pool
    return self._repopulate_pool_static(self._ctx, self.Process,
  File "/mnt/data/software/miniconda3/envs/metabolic/lib/pypy3.8/multiprocessing/pool.py", line 326, in _repopulate_pool_static
    w.start()
  File "/mnt/data/software/miniconda3/envs/metabolic/lib/pypy3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/mnt/data/software/miniconda3/envs/metabolic/lib/pypy3.8/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/mnt/data/software/miniconda3/envs/metabolic/lib/pypy3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/mnt/data/software/miniconda3/envs/metabolic/lib/pypy3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/mnt/data/software/miniconda3/envs/metabolic/lib/pypy3.8/multiprocessing/popen_spawn_posix.py", line 58, in _launch
    self.pid = util.spawnv_passfds(spawn.get_executable(),
  File "/mnt/data/software/miniconda3/envs/metabolic/lib/pypy3.8/multiprocessing/util.py", line 452, in spawnv_passfds
    return _posixsubprocess.fork_exec(
BlockingIOError: [Errno 11] Resource temporarily unavailable

mv: cannot stat '/mnt/data1/project/HYHseason230215/19_metabolic/1_test/2_METABOLIC-C_output/Output_energy_flow/Energy_plot/network.plot.pdf': No such file or directory
mv: cannot stat '/mnt/data1/project/HYHseason230215/19_metabolic/1_test/2_METABOLIC-C_output/Output_energy_flow/Energy_plot/network.plot.pdf': No such file or directory
[2023-05-11 00:01:33] Drawing energy flow chart finished
[2023-05-11 00:01:33] Calculating MW-score ...
mkdir: cannot create directory '/mnt/data1/project/HYHseason230215/19_metabolic/1_test/2_METABOLIC-C_output/MW-score_result': File exists
mkdir: cannot create directory '/mnt/data1/project/HYHseason230215/19_metabolic/1_test/2_METABOLIC-C_output/MW-score_result': File exists
[2023-05-11 00:01:35] Calculating MW-score is done
METABOLIC-C was done, the total running time: 02:04:31 (hh:mm:ss)

Here is the complete log file: METABOLIC_log.log

Thank you very much and looking forward to your reply!

ChaoLab commented 1 year ago

Hi, the -r option should be followed by a txt file (it defines the path to a text file containing the locations of the paired reads). There also seems to have been an error in the GTDB-Tk run; I suggest checking GTDB-Tk independently before running METABOLIC-C. Did you run the same script before? It seems that the previous folder 2_METABOLIC-C_output was still there. I suggest deleting the previous folder before starting a new run.
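For anyone hitting the same thing, here is a rough sketch of how the reads list for -r could be built from a folder of paired reads. Several things here are assumptions rather than anything confirmed in this thread: the helper name `write_reads_list`, the file name `reads_list.txt`, and the `<sample>_1.fastq` / `<sample>_2.fastq` naming of the reads. The `#Read pairs:` header and the comma-separated layout follow my reading of the METABOLIC README, so please check them against the current README before relying on this.

```shell
# Hypothetical helper: write one "R1,R2" line per read pair found in a folder.
# Assumes reads are named <sample>_1.fastq and <sample>_2.fastq (an assumption;
# adjust the suffixes to match your own files).
write_reads_list() (
    reads_dir=$1
    out_file=$2
    # Header line as I understand the README example; verify before use.
    echo "#Read pairs:" > "$out_file"
    for r1 in "$reads_dir"/*_1.fastq; do
        [ -e "$r1" ] || continue              # no matches: glob stays literal
        r2="${r1%_1.fastq}_2.fastq"           # expected mate of this file
        if [ -e "$r2" ]; then
            echo "$r1,$r2" >> "$out_file"
        else
            echo "warning: no mate found for $r1" >&2
        fi
    done
)

# Usage sketch with the paths from the original post (left commented out):
#   test_dir=/mnt/data1/project/HYHseason230215/19_metabolic/1_test
#   write_reads_list "$test_dir/1_MG" "$test_dir/reads_list.txt"
#   rm -rf "$test_dir/2_METABOLIC-C_output"   # remove the stale output first
#   then rerun METABOLIC-C.pl with: -r "$test_dir/reads_list.txt"
```

The function body runs in a subshell, so its variables do not leak into the calling shell. Removing the old output folder before the rerun should also avoid the "pre-existing data" warnings and the mkdir "File exists" messages seen in the log above.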