janprovaz closed this issue 2 months ago
Hi Jan, this looks like an issue between the R version and pyRserve. Can you please check which R version is installed in your environment? Try downgrading it to version 4.0.3 (the version we currently run on our machines) and re-installing the required R packages (you will find them in the environment.yml, all starting with r-*). This should hopefully solve the issue.
Best, Ludwig
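For anyone following along, here is a minimal sketch of how the R version check could be scripted. It assumes that `R` is on the PATH of the currently active conda environment and that `R --version` prints a line beginning with "R version X.Y.Z". The helper names `parse_r_version` and `installed_r_version` are made up for illustration and are not part of ECCsplorer.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple

EXPECTED_R = (4, 0, 3)  # version suggested in the reply above


def parse_r_version(text: str) -> Optional[Tuple[int, ...]]:
    """Extract (major, minor, patch) from the output of `R --version`."""
    match = re.search(r"R version (\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def installed_r_version() -> Optional[Tuple[int, ...]]:
    """Return the version of the first R found on PATH, or None if unavailable."""
    r_path = shutil.which("R")
    if r_path is None:
        return None
    result = subprocess.run(
        [r_path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # text mode; also works on Python 3.7
    )
    return parse_r_version(result.stdout)


if __name__ == "__main__":
    found = installed_r_version()
    if found is None:
        print("R not found on PATH (or its version could not be parsed)")
    elif found == EXPECTED_R:
        print("R matches the suggested version 4.0.3")
    else:
        print("R is", ".".join(map(str, found)), "- the reply above suggests 4.0.3")
```

Run it with the eccsplorer environment activated, so that the R being checked is the one the pipeline would actually use.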
2024-08-25 20:08:21,514 - lib.seqtools - INFO - saving chunk /home/lxs/miniconda3/envs/eccsplorer/bin/ECCsplorer/testrun/eccpipe_results/clustering_results/seqclust/prerun/sample.fasta.4
2024-08-25 20:08:21,514 - lib.seqtools - INFO - running all to all blast
Process 0:
Traceback (most recent call last):
  File "/home/lxs/miniconda3/envs/eccsplorer/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/lxs/miniconda3/envs/eccsplorer/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/lxs/miniconda3/envs/eccsplorer/bin/repex_tarean/lib/parallel/parallel.py", line 79, in fun
    pipe.send(f(x))
  File "/home/lxs/miniconda3/envs/eccsplorer/bin/repex_tarean/lib/parallel/parallel.py", line 294, in command_star
    return(command(*args))
  File "/home/lxs/miniconda3/envs/eccsplorer/bin/repex_tarean/lib/seqtools.py", line 34, in _hitsort_worker
    with subprocess.Popen(shlex.split(cmd), stdout=subprocess.PIPE) as p:
  File "/home/lxs/miniconda3/envs/eccsplorer/lib/python3.7/subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "/home/lxs/miniconda3/envs/eccsplorer/lib/python3.7/subprocess.py", line 1551, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'mgblast': 'mgblast'
Process 1:
Traceback (most recent call last):
  File "/home/lxs/miniconda3/envs/eccsplorer/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/lxs/miniconda3/envs/eccsplorer/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/lxs/miniconda3/envs/eccsplorer/bin/repex_tarean/lib/parallel/parallel.py", line 79, in fun
    pipe.send(f(x))
  File "/home/lxs/miniconda3/envs/eccsplorer/bin/repex_tarean/lib/parallel/parallel.py", line 294, in command_star
    return(command(*args))
  File "/home/lxs/miniconda3/envs/eccsplorer/bin/repex_tarean/lib/seqtools.py", line 34, in _hitsort_worker
    with subprocess.Popen(shlex.split(cmd), stdout=subprocess.PIPE) as p:
  File "/home/lxs/miniconda3/envs/eccsplorer/lib/python3.7/subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "/home/lxs/miniconda3/envs/eccsplorer/lib/python3.7/subprocess.py", line 1551, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'mgblast': 'mgblast'
Process 3:
Process 2:
Traceback (most recent call last):
  File "/home/lxs/miniconda3/envs/eccsplorer/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/lxs/miniconda3/envs/eccsplorer/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/lxs/miniconda3/envs/eccsplorer/bin/repex_tarean/lib/parallel/parallel.py", line 79, in fun
    pipe.send(f(x))
  File "/home/lxs/miniconda3/envs/eccsplorer/bin/repex_tarean/lib/parallel/parallel.py", line 294, in command_star
    return(command(*args))
  File "/home/lxs/miniconda3/envs/eccsplorer/bin/repex_tarean/lib/seqtools.py", line 34, in _hitsort_worker
    with subprocess.Popen(shlex.split(cmd), stdout=subprocess.PIPE) as p:
  File "/home/lxs/miniconda3/envs/eccsplorer/lib/python3.7/subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "/home/lxs/miniconda3/envs/eccsplorer/lib/python3.7/subprocess.py", line 1551, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'mgblast': 'mgblast'
Traceback (most recent call last):
  File "/home/lxs/miniconda3/envs/eccsplorer/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/lxs/miniconda3/envs/eccsplorer/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/lxs/miniconda3/envs/eccsplorer/bin/repex_tarean/lib/parallel/parallel.py", line 79, in fun
    pipe.send(f(x))
  File "/home/lxs/miniconda3/envs/eccsplorer/bin/repex_tarean/lib/parallel/parallel.py", line 294, in command_star
    return(command(*args))
  File "/home/lxs/miniconda3/envs/eccsplorer/bin/repex_tarean/lib/seqtools.py", line 34, in _hitsort_worker
    with subprocess.Popen(shlex.split(cmd), stdout=subprocess.PIPE) as p:
  File "/home/lxs/miniconda3/envs/eccsplorer/lib/python3.7/subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "/home/lxs/miniconda3/envs/eccsplorer/lib/python3.7/subprocess.py", line 1551, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'mgblast': 'mgblast'
2024-08-25 20:08:21,531 - lib.seqtools - INFO - all to all blast finished
2024-08-25 20:08:21,531 - lib.seqtools - INFO - removing duplicates from all to all blast results
2024-08-25 20:08:21,538 - lib.graphtools - INFO - converting hitsort to binary format
2024-08-25 20:08:21,544 - lib.graphtools - INFO - running louvain clustering...
Traceback (most recent call last):
File "/home/lxs/miniconda3/envs/eccsplorer/bin/seqclust", line 821, in
Building a new DB, current time: 08/25/2024 20:08:21
New DB name: /home/lxs/miniconda3/envs/eccsplorer/bin/ECCsplorer/testrun/eccpipe_results/clustering_results/seqclust/prerun/sample.fasta
New DB title: /home/lxs/miniconda3/envs/eccsplorer/bin/ECCsplorer/testrun/eccpipe_results/clustering_results/seqclust/prerun/sample.fasta
Sequence type: Nucleotide
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 4000 sequences in 0.036247 seconds.
Building a new DB, current time: 08/25/2024 20:08:21
New DB name: /home/lxs/miniconda3/envs/eccsplorer/bin/ECCsplorer/testrun/eccpipe_results/clustering_results/seqclust/prerun/sample.fasta
New DB title: /home/lxs/miniconda3/envs/eccsplorer/bin/ECCsplorer/testrun/eccpipe_results/clustering_results/seqclust/prerun/sample.fasta
Sequence type: Nucleotide
Deleted existing Nucleotide BLAST database named /home/lxs/miniconda3/envs/eccsplorer/bin/ECCsplorer/testrun/eccpipe_results/clustering_results/seqclust/prerun/sample.fasta
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 4000 sequences in 0.036865 seconds.
Trying to start Rserve... connection OK
R function loaded: add_preamble capitalize connect_to_databases create_main_reports df2html disconnect_database dummy_function get_comparative_codes is_comparative nested2named_list plot_rect_map preformatted rectMap reformat_df_report reformat_df_to_profrep_classification reformat_header reformat4html start_html summary_histogram
R function loaded: add_leaves_value add_preamble add_value_to_nodes annot2colors cluster_annotation common_ancestor connect_to_databases containLTR create_all_superclusters_report create_cluster_report create_single_supercluster_report df2html disconnect_database evaluate_LTR_detection filter_tree filter_tree2 find_best_hit find_best_hit_repeat format_clinfo format_tree formatWidth get_annotation_groups get_cluster_annotation_summary get_cluster_comparative_counts get_cluster_connection_info get_cluster_info get_comparative_codes get_ltr_info get_reads_annotation get_supercluster_graph get_supercluster_info get_supercluster_summary get_tarean_info html_insert_floating_image html_insert_image is_comparative make_final_annotation_template nested2named_list pasteDomains pieScatter plot_edges plot_rect_map plot_supercluster plotg preformatted radius_size read_annotation_to_tree rectMap rescale select_reads_id start_html summarize_annotation summary_histogram supercluster_size trmap
running in parallel using 16 cpu(s)
mgblast -p 75 -W18 -UT -X40 -KT -JF -F "m D" -v100000000 -b100000000 -D4 -C 30 -H 30 -i /home/lxs/miniconda3/envs/eccsplorer/bin/ECCsplorer/testrun/eccpipe_results/clustering_results/seqclust/prerun/sample.fasta.0 -d /home/lxs/miniconda3/envs/eccsplorer/bin/ECCsplorer/testrun/eccpipe_results/clustering_results/seqclust/prerun/sample.fasta.legacy
mgblast -p 75 -W18 -UT -X40 -KT -JF -F "m D" -v100000000 -b100000000 -D4 -C 30 -H 30 -i /home/lxs/miniconda3/envs/eccsplorer/bin/ECCsplorer/testrun/eccpipe_results/clustering_results/seqclust/prerun/sample.fasta.1 -d /home/lxs/miniconda3/envs/eccsplorer/bin/ECCsplorer/testrun/eccpipe_results/clustering_results/seqclust/prerun/sample.fasta.legacy
mgblast -p 75 -W18 -UT -X40 -KT -JF -F "m D" -v100000000 -b100000000 -D4 -C 30 -H 30 -i /home/lxs/miniconda3/envs/eccsplorer/bin/ECCsplorer/testrun/eccpipe_results/clustering_results/seqclust/prerun/sample.fasta.3 -d /home/lxs/miniconda3/envs/eccsplorer/bin/ECCsplorer/testrun/eccpipe_results/clustering_results/seqclust/prerun/sample.fasta.legacy
mgblast -p 75 -W18 -UT -X40 -KT -JF -F "m D" -v100000000 -b100000000 -D4 -C 30 -H 30 -i /home/lxs/miniconda3/envs/eccsplorer/bin/ECCsplorer/testrun/eccpipe_results/clustering_results/seqclust/prerun/sample.fasta.2 -d /home/lxs/miniconda3/envs/eccsplorer/bin/ECCsplorer/testrun/eccpipe_results/clustering_results/seqclust/prerun/sample.fasta.legacy
job finished with exit code 1
job finished with exit code 1
job finished with exit code 1
job finished with exit code 1
['louvain_convert', '-i', '/home/lxs/miniconda3/envs/eccsplorer/bin/ECCsplorer/testrun/eccpipe_results/clustering_results/seqclust/prerun/sample.fasta.blast.int', '-o', '/home/lxs/miniconda3/envs/eccsplorer/bin/ECCsplorer/testrun/eccpipe_results/clustering_results/seqclust/prerun/sample.fasta.blast.int.bin', '-w', '/home/lxs/miniconda3/envs/eccsplorer/bin/ECCsplorer/testrun/eccpipe_results/clustering_results/seqclust/prerun/sample.fasta.blast.int.weight']
Shutting down Rserv...Done
2024-08-25 20:08:23,063 - [cluster_coordinator] INFO: Summarizing clustering results.
sed: can't read /home/lxs/miniconda3/envs/eccsplorer/bin/ECCsplorer/testrun/eccpipe_results/clustering_results/CLUSTERTABLE.csv: No such file or directory
/home/lxs/miniconda3/envs/eccsplorer/bin/ECCsplorer/lib/eccClusterer.py:55: DeprecationWarning: `np.str` is a deprecated alias for the builtin `str`. To silence this warning, use `str` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.str_` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype=np.str, delimiter='\t', skiprows=1)
Traceback (most recent call last):
File "/home/lxs/miniconda3/envs/eccsplorer/bin/eccsplorer", line 815, in
Dear @lxs524, it seems you are using two conda environments, one for RE2 and one for eccsplorer. Make sure that mgblast is also available within the eccsplorer environment. Follow the detailed installation instructions to install all RE2 dependencies within the eccsplorer environment.
Note that, in its current implementation, the eccsplorer pipeline is not meant to activate other conda environments.
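As a quick sanity check before starting a run, something like the sketch below could confirm that the external programs are visible from the active environment. The tool names `mgblast` and `louvain_convert` are taken from the log in this thread; the real list of RE2 dependencies is longer and should come from the installation instructions. `missing_tools` is a hypothetical helper, not an ECCsplorer function.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Executables that appear in the log above; this list is illustrative, not exhaustive.
REQUIRED_TOOLS = ("mgblast", "louvain_convert")


def missing_tools(
    tools: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be resolved on the current PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed tools are available in this environment")
```

Run it after `conda activate` for the eccsplorer environment. Any name it reports would be expected to trigger the same `FileNotFoundError` that `subprocess.Popen` raises in the traceback.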
Dear @crimBubble, first of all thank you for writing this software :) I would like to ask you for help regarding running the test data.
When I run the script as instructed (with the addition of `-cpu 10` to prevent the previously mentioned "index out of bounds" errors), I get this:
This happens with both the test data and my real data, and I have tried clean output folders. Are there specific versions of numpy and pyRserve that I should be running instead?
Thank you very much for your help and time, Jan