Open Ahmed-Shibl opened 3 years ago
Quick update:
I updated R with conda install -c conda-forge R=4.0
and ran unset GREP_OPTIONS
in the environment.
Then I ran perl METABOLIC-G.pl -test true
again outside of tmux
and it was perfectly fine. No warnings or errors.
However, running perl METABOLIC-C.pl -test true
returned the following:
[2021-03-23 10:04:25] The Prodigal annotation is running...
[2021-03-23 10:05:11] The Prodigal annotation is finished
[2021-03-23 10:05:11] The hmmsearch is running with 5 cpu threads...
[2021-03-23 10:45:39] The hmmsearch is finished
[2021-03-23 10:45:42] Generating each hmm faa collection...
[2021-03-23 10:45:43] Each hmm faa collection has been made
[2021-03-23 10:45:43] The KEGG module result is calculating...
[2021-03-23 10:49:18] The KEGG identifier (KO id) result is calculating...
[2021-03-23 10:49:18] The KEGG identifier (KO id) seaching result is finished
[2021-03-23 10:49:18] Searching CAZymes by dbCAN2...
[2021-03-23 10:52:02] dbCAN2 searching is done
[2021-03-23 10:52:02] Searching MEROPS peptidase...
[2021-03-23 10:53:26] MEROPS peptidase searching is done
[2021-03-23 10:53:27] METABOLIC table has been generated
[2021-03-23 10:53:27] Drawing element cycling diagrams...
Loading required package: shape
[2021-03-23 10:56:49] Drawing element cycling diagrams finished
[2021-03-23 10:56:49] Drawing metabolic handoff diagrams...
[2021-03-23 10:56:53] Drawing metabolic handoff diagrams finished
[2021-03-23 10:56:53] Drawing energy flow chart...
Use of uninitialized value $cat in concatenation (.) or string at METABOLIC-C.pl line 1369.
Use of uninitialized value $cat in concatenation (.) or string at METABOLIC-C.pl line 1369.
Use of uninitialized value $cat in concatenation (.) or string at METABOLIC-C.pl line 1369.
Use of uninitialized value within %Bin2Cat in concatenation (.) or string at METABOLIC-C.pl line 1392.
Use of uninitialized value within %Bin2Cat in concatenation (.) or string at METABOLIC-C.pl line 1392.
Use of uninitialized value within %Bin2Cat in concatenation (.) or string at METABOLIC-C.pl line 1392.
Loading required package: ggplot2
Error: Must request at least one colour from a hue palette.
In addition: Warning message:
The parameter `infer.label` is deprecated.
Use `aes(label = after_stat(stratum))`.
Execution halted
Loading required package: ggplot2
Attaching package: ‘igraph’
The following objects are masked from ‘package:stats’:
decompose, spectrum
The following object is masked from ‘package:base’:
union
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
✔ tibble 3.1.0 ✔ dplyr 1.0.5
✔ tidyr 1.1.3 ✔ stringr 1.4.0
✔ readr 1.4.0 ✔ forcats 0.5.1
✔ purrr 0.3.4
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::as_data_frame() masks tibble::as_data_frame(), igraph::as_data_frame()
✖ purrr::compose() masks igraph::compose()
✖ tidyr::crossing() masks igraph::crossing()
✖ dplyr::filter() masks stats::filter()
✖ dplyr::groups() masks igraph::groups()
✖ dplyr::lag() masks stats::lag()
✖ purrr::simplify() masks igraph::simplify()
Attaching package: ‘tidygraph’
The following object is masked from ‘package:igraph’:
groups
The following object is masked from ‘package:stats’:
filter
Error: Must request at least one colour from a hue palette.
Execution halted
[2021-03-23 10:56:56] Drawing energy flow chart finished
[2021-03-23 10:56:56] Calculating MN-score ...
Use of uninitialized value $cat in concatenation (.) or string at METABOLIC-C.pl line 1508.
Use of uninitialized value $cat in concatenation (.) or string at METABOLIC-C.pl line 1508.
Use of uninitialized value $cat in concatenation (.) or string at METABOLIC-C.pl line 1508.
Use of uninitialized value within %Bin2Cat in concatenation (.) or string at METABOLIC-C.pl line 1532.
Use of uninitialized value within %Bin2Cat in concatenation (.) or string at METABOLIC-C.pl line 1532.
Use of uninitialized value within %Bin2Cat in concatenation (.) or string at METABOLIC-C.pl line 1532.
[2021-03-23 10:56:56] Calculating MN-score is done
Hi Ahmed,
I had the same issues. The first one: Use of uninitialized value in concatenation (.) or string at METABOLIC-G.pl line 1373.
is a Perl issue. However, after following #27, all of these issues disappeared for me. Check carefully that all required packages were installed correctly. Unfortunately, conda has issues with Perl. For example, my installation of Array::Split previously exited with a compilation error.
IIRC, the second one: Error: Must request at least one colour from a hue palette.
is an R problem. METABOLIC uses R version 3.x, while you have 4.x. In 4.x the default behavior of read.table() changed: the function loads strings as factors in 3.x and as characters in 4.x, and ggalluvial in the METABOLIC script expects factors. Either run the older version of R, or change the R script so that read.table() is called with stringsAsFactors=TRUE.
Hope it helps. Michal
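Editor's sketch of one way to locate the calls Michal describes: the helper below lists read.table(), read.csv() and read.delim() calls in R scripts that do not mention stringsAsFactors on the same line. The function name find_unpinned_reads and the directory argument are hypothetical, and where METABOLIC keeps its R scripts is not stated in this thread, so pass whichever directory holds them. Calls that spread their arguments over several lines may be reported even if they already set the option.

```shell
# Hypothetical helper, not part of METABOLIC.
# Prints file:line:content for every read.table/read.csv/read.delim call
# under the given directory that has no stringsAsFactors on the same line.
find_unpinned_reads() {
    local dir="$1"
    grep -rn --include='*.R' -E 'read\.(table|csv|delim)\(' "$dir" \
        | grep -v 'stringsAsFactors'
}

# Example call (the path is a placeholder):
# find_unpinned_reads /path/to/METABOLIC
```

Each reported line is a candidate for adding stringsAsFactors=TRUE by hand; the helper only reads files and changes nothing.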
Hi @strejcem Thanks for your input! I've followed the conda installation word for word and I'm still getting the same perl-related error:
[2021-04-26 14:06:01] Drawing metabolic handoff diagrams finished
[2021-04-26 14:06:01] Drawing energy flow chart...
Use of uninitialized value $cat in concatenation (.) or string at METABOLIC-C.pl line 1374.
Use of uninitialized value within %Bin2Cat in concatenation (.) or string at METABOLIC-C.pl line 1397.
Use of uninitialized value within %Bin2Cat in concatenation (.) or string at METABOLIC-C.pl line 1397.
These are lines 1362-1400 of METABOLIC-C.pl:
my %Hash_gn_n_pth = ();
my %Total_R_community_coverage = (); # genome\tpathway => category \t pathway \t genome coverage percentage
if ($omic_reads_parameters){
    my %Genome_cov = %Genome_cov_constant;
    #%Total_R_input pathway => gn => 1 or 0
    foreach my $pth (sort keys %Total_R_input){
        my $gn_cov_percentage = 0;
        foreach my $gn (sort keys %Hmmscan_result){
            if ($Genome_cov{$gn} and $Total_R_input{$pth}{$gn}){
                $gn_cov_percentage = $Genome_cov{$gn};
                my $cat = $Bin2Cat{$gn};
                my $gn_n_pth = "$gn\t$pth"; $Hash_gn_n_pth{$gn_n_pth} = 1;
                $Total_R_community_coverage{$gn_n_pth} = "$cat\t$pth\t$gn_cov_percentage";
            }
        }
    }
}
my %Total_R_community_coverage2 = (); #$genome\tpath pair => cat \t coverage percentage average
foreach my $gn (sort keys %Hmmscan_result){
    my %Path = (); # path => 1
    foreach my $gn_n_pth (sort keys %Total_R_community_coverage){
        if ($gn_n_pth =~ /$gn\t/){
            my @tmp = split (/\t/,$gn_n_pth);
            $Path{$tmp[1]} = 1;
        }
    }
    my @Path_keys = sort keys %Path;
    for(my $i=0; $i<=$#Path_keys; $i++){
        for(my $j = $i+1; $j<=$#Path_keys; $j++){
            my $pair = "$Path_keys[$i]\t$Path_keys[$j]";
            my $coverage = 0;
            my @tmp1 = split (/\t/, $Total_R_community_coverage{"$gn\t$Path_keys[$i]"});
            my @tmp2 = split (/\t/, $Total_R_community_coverage{"$gn\t$Path_keys[$j]"});
            $coverage = ($tmp1[2] + $tmp2[2]) / 2;
            $Total_R_community_coverage2{"$gn\t$pair"} = $Bin2Cat{$gn}."\t".$coverage;
        }
    }
}
I wonder if it's a bug in the script. I'm going to try and tackle the R issue next.
Hi @Ahmed-Shibl, it seems that something is wrong with "$cat" (which stands for category). I am wondering whether GTDB-Tk worked properly.
Hi @ChaoLab, the output of gtdbtk check_install
is:
[2021-04-27 21:57:01] INFO: GTDB-Tk v1.4.1
[2021-04-27 21:57:01] INFO: gtdbtk check_install
[2021-04-27 21:57:01] INFO: Using GTDB-Tk reference data version r95: ~/miniconda3/envs/2metabolic/release95/
[2021-04-27 21:57:01] INFO: Running install verification
[2021-04-27 21:57:01] INFO: Checking that all third-party software are on the system path:
[2021-04-27 21:57:01] INFO: |-- FastTree OK
[2021-04-27 21:57:01] INFO: |-- FastTreeMP OK
[2021-04-27 21:57:01] INFO: |-- fastANI OK
[2021-04-27 21:57:01] INFO: |-- guppy OK
[2021-04-27 21:57:01] INFO: |-- hmmalign OK
[2021-04-27 21:57:01] INFO: |-- hmmsearch OK
[2021-04-27 21:57:01] INFO: |-- mash OK
[2021-04-27 21:57:01] INFO: |-- pplacer OK
[2021-04-27 21:57:01] INFO: |-- prodigal OK
[2021-04-27 21:57:01] INFO: Checking integrity of reference package: ~/miniconda3/envs/2metabolic/release95/
[2021-04-27 21:57:03] INFO: |-- pplacer OK
[2021-04-27 21:57:03] INFO: |-- masks OK
[2021-04-27 21:57:04] INFO: |-- markers OK
[2021-04-27 21:57:04] INFO: |-- radii OK
[2021-04-27 21:57:11] INFO: |-- msa OK
[2021-04-27 21:57:11] INFO: |-- metadata OK
[2021-04-27 21:57:11] INFO: |-- taxonomy OK
and gtdbtk test
also runs fine.
@Ahmed-Shibl
You should also check whether all the MAGs were classified by GTDB-Tk. I think METABOLIC-C looks for the classification at the phylum level; if a MAG has none, the $cat variable might end up empty and throw these errors. Just a thought.
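Editor's sketch of the check suggested above: the helper below prints every genome in a GTDB-Tk summary table whose classification string has no named phylum. It assumes a tab-separated file with a header row, the genome name in column 1 and the classification string (for example d__Bacteria;p__Proteobacteria;...) in column 2. That layout, the function name list_unclassified_mags and the example file path are assumptions, not details confirmed in this thread, so adjust them to match your own output.

```shell
# Hypothetical helper, not part of METABOLIC or GTDB-Tk.
# Assumes: tab-separated input, header on line 1, genome name in column 1,
# classification string in column 2.
# Prints the genomes whose classification lacks a non-empty "p__" rank.
list_unclassified_mags() {
    local summary="$1"
    awk -F'\t' 'NR > 1 && $2 !~ /p__[^;]+/ { print $1 }' "$summary"
}

# Example call (the path is a placeholder):
# list_unclassified_mags /path/to/gtdbtk.bac120.summary.tsv
```

Any genome this prints is a candidate for an empty $Bin2Cat entry, which would be consistent with the "uninitialized value" warnings shown earlier in the thread.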
Hi @patriciatran and @ChaoLab,
I've recently re-installed METABOLIC using conda [https://github.com/AnantharamanLab/METABOLIC/issues/27] and
git clone https://github.com/AnantharamanLab/METABOLIC.git
in a new environment. When I tried running the command with the test dataset, I got some errors that I assume are Perl-related. This is the command I used:
perl METABOLIC-G.pl -test true
And this was the output + errors/warnings:
Please let me know if you need any additional information - thanks in advance! Looking forward to re-running this smoothly and applying it to my datasets.