bxlab / metaWRAP

MetaWRAP - a flexible pipeline for genome-resolved metagenomic data analysis
MIT License

can't plot in plot_binning_results.py #349

Open suqian-github opened 3 years ago

suqian-github commented 3 years ago

When I run metawrap bin_refinement, the step that plots the .eps and .png files fails. The full log is below:

------------------------------------------------------------------------------------------------------------------------
-----            There is 40 RAM and 50 threads available, and each pplacer thread uses >40GB, so I will           -----
-----                                          use 1 threads for pplacer                                           -----
------------------------------------------------------------------------------------------------------------------------

########################################################################################################################
#####                                                BEGIN PIPELINE!                                               #####
########################################################################################################################

------------------------------------------------------------------------------------------------------------------------
-----                              setting up output folder and copything over bins...                             -----
------------------------------------------------------------------------------------------------------------------------

************************************************************************************************************************
*****                       Warning: test.Bin_refinement already exists. Attempting to clean.                      *****
************************************************************************************************************************

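rm: cannot remove 'test.Bin_refinement/binsA': No such file or directory
rm: cannot remove 'test.Bin_refinement/binsB': No such file or directory
rm: cannot remove 'test.Bin_refinement/binsC': No such file or directory
rm: cannot remove 'test.Bin_refinement/binsAB': No such file or directory
rm: cannot remove 'test.Bin_refinement/binsBC': No such file or directory
rm: cannot remove 'test.Bin_refinement/binsAC': No such file or directory
rm: cannot remove 'test.Bin_refinement/binsABC': No such file or directory
rm: cannot remove 'test.Bin_refinement/bin.*': No such file or directory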

------------------------------------------------------------------------------------------------------------------------
-----                                          there are 33 bins in binsA                                          -----
------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------------------
-----                                          there are 24 bins in binsB                                          -----
------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------------------
-----                                           there are 1 bins in binsC                                          -----
------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------------------
-----                                             There are 3 bin sets!                                            -----
------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------------------
-----           Previous bin refinment files found. If this was not intended, please re-run with a clear           -----
-----                                   output directory. Skipping refinement...                                   -----
------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------------------
-----                            fixing bin naming to .fa convention for consistancy...                            -----
------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------------------
-----           Previous bin refinement files found. If this was not intended, please re-run with a clear          -----
-----                                  output directory. Skipping CheckM runs...                                   -----
------------------------------------------------------------------------------------------------------------------------

########################################################################################################################
#####                      CONSOLIDATING ALL BIN SETS BY CHOOSING THE BEST VERSION OF EACH BIN                     #####
########################################################################################################################

------------------------------------------------------------------------------------------------------------------------
-----                           There are 3 original bin folders, plus the refined bins.                           -----
------------------------------------------------------------------------------------------------------------------------

rm: cannot remove 'binsM': No such file or directory
rm: cannot remove 'binsM.stats': No such file or directory

------------------------------------------------------------------------------------------------------------------------
-----                                        merging binsABC.stats and binsM                                       -----
------------------------------------------------------------------------------------------------------------------------

Loading list of good bins (comp>70.0%, cont<5.0%)
load in the info about the contigs in each bin...
make all bossible comparisons between the two bin sets, and record total % idential length
load in completion and contamination scores of all the bins
go through first group, pull out identical bins from second group, and choose best
retrieve bins from second group that were not found in first group
There were 4 bins cherry-picked from the original sets!

------------------------------------------------------------------------------------------------------------------------
-----                                        merging binsAB.stats and binsM                                        -----
------------------------------------------------------------------------------------------------------------------------

Loading list of good bins (comp>70.0%, cont<5.0%)
load in the info about the contigs in each bin...
make all bossible comparisons between the two bin sets, and record total % idential length
load in completion and contamination scores of all the bins
go through first group, pull out identical bins from second group, and choose best
retrieve bins from second group that were not found in first group
There were 4 bins cherry-picked from the original sets!

------------------------------------------------------------------------------------------------------------------------
-----                                        merging binsAC.stats and binsM                                        -----
------------------------------------------------------------------------------------------------------------------------

Loading list of good bins (comp>70.0%, cont<5.0%)
load in the info about the contigs in each bin...
make all bossible comparisons between the two bin sets, and record total % idential length
load in completion and contamination scores of all the bins
go through first group, pull out identical bins from second group, and choose best
retrieve bins from second group that were not found in first group
There were 4 bins cherry-picked from the original sets!

------------------------------------------------------------------------------------------------------------------------
-----                                         merging binsA.stats and binsM                                        -----
------------------------------------------------------------------------------------------------------------------------

Loading list of good bins (comp>70.0%, cont<5.0%)
load in the info about the contigs in each bin...
make all bossible comparisons between the two bin sets, and record total % idential length
load in completion and contamination scores of all the bins
go through first group, pull out identical bins from second group, and choose best
retrieve bins from second group that were not found in first group
There were 4 bins cherry-picked from the original sets!

------------------------------------------------------------------------------------------------------------------------
-----                                        merging binsBC.stats and binsM                                        -----
------------------------------------------------------------------------------------------------------------------------

Loading list of good bins (comp>70.0%, cont<5.0%)
load in the info about the contigs in each bin...
make all bossible comparisons between the two bin sets, and record total % idential length
load in completion and contamination scores of all the bins
go through first group, pull out identical bins from second group, and choose best
retrieve bins from second group that were not found in first group
There were 6 bins cherry-picked from the original sets!

------------------------------------------------------------------------------------------------------------------------
-----                                         merging binsB.stats and binsM                                        -----
------------------------------------------------------------------------------------------------------------------------

Loading list of good bins (comp>70.0%, cont<5.0%)
load in the info about the contigs in each bin...
make all bossible comparisons between the two bin sets, and record total % idential length
load in completion and contamination scores of all the bins
go through first group, pull out identical bins from second group, and choose best
retrieve bins from second group that were not found in first group
There were 6 bins cherry-picked from the original sets!

------------------------------------------------------------------------------------------------------------------------
-----                                         merging binsC.stats and binsM                                        -----
------------------------------------------------------------------------------------------------------------------------

Loading list of good bins (comp>70.0%, cont<5.0%)
load in the info about the contigs in each bin...
make all bossible comparisons between the two bin sets, and record total % idential length
load in completion and contamination scores of all the bins
go through first group, pull out identical bins from second group, and choose best
retrieve bins from second group that were not found in first group
There were 6 bins cherry-picked from the original sets!

------------------------------------------------------------------------------------------------------------------------
-----                                     merging concoct_bins.stats and binsM                                     -----
------------------------------------------------------------------------------------------------------------------------

Loading list of good bins (comp>70.0%, cont<5.0%)
load in the info about the contigs in each bin...
make all bossible comparisons between the two bin sets, and record total % idential length
load in completion and contamination scores of all the bins
go through first group, pull out identical bins from second group, and choose best
retrieve bins from second group that were not found in first group
There were 6 bins cherry-picked from the original sets!

------------------------------------------------------------------------------------------------------------------------
-----                                     merging maxbin2_bins.stats and binsM                                     -----
------------------------------------------------------------------------------------------------------------------------

Loading list of good bins (comp>70.0%, cont<5.0%)
load in the info about the contigs in each bin...
make all bossible comparisons between the two bin sets, and record total % idential length
load in completion and contamination scores of all the bins
go through first group, pull out identical bins from second group, and choose best
retrieve bins from second group that were not found in first group
There were 6 bins cherry-picked from the original sets!

------------------------------------------------------------------------------------------------------------------------
-----                                     merging metabat2_bins.stats and binsM                                    -----
------------------------------------------------------------------------------------------------------------------------

Loading list of good bins (comp>70.0%, cont<5.0%)
load in the info about the contigs in each bin...
make all bossible comparisons between the two bin sets, and record total % idential length
load in completion and contamination scores of all the bins
go through first group, pull out identical bins from second group, and choose best
retrieve bins from second group that were not found in first group
There were 6 bins cherry-picked from the original sets!

------------------------------------------------------------------------------------------------------------------------
-----                                  merging metawrap_70_5_bins.stats and binsM                                  -----
------------------------------------------------------------------------------------------------------------------------

Loading list of good bins (comp>70.0%, cont<5.0%)
load in the info about the contigs in each bin...
make all bossible comparisons between the two bin sets, and record total % idential length
load in completion and contamination scores of all the bins
go through first group, pull out identical bins from second group, and choose best
retrieve bins from second group that were not found in first group
There were 6 bins cherry-picked from the original sets!

------------------------------------------------------------------------------------------------------------------------
-----             Scanning to find duplicate contigs between bins and only keep them in the best bin...            -----
------------------------------------------------------------------------------------------------------------------------

Loading in bin completion and contamination scores...
Loading in contigs in each bin...
Making a new dereplicated version of each bin file

------------------------------------------------------------------------------------------------------------------------
-----                     You will find the best non-reassembled versions of the bins in binsO                     -----
------------------------------------------------------------------------------------------------------------------------

########################################################################################################################
#####                                          FINALIZING THE REFINED BINS                                         #####
########################################################################################################################

------------------------------------------------------------------------------------------------------------------------
-----                                        Re-running CheckM on binsO bins                                       -----
------------------------------------------------------------------------------------------------------------------------

*******************************************************************************
 [CheckM - tree] Placing bins in reference genome tree.
*******************************************************************************

  Identifying marker genes in 5 bins with 50 threads:
    Finished processing 5 of 5 (100.00%) bins.
  Saving HMM info to file.

  Calculating genome statistics for 5 bins with 50 threads:
    Finished processing 5 of 5 (100.00%) bins.

  Extracting marker genes to align.
  Parsing HMM hits to marker genes:
    Finished parsing hits for 5 of 5 (100.00%) bins.
  Extracting 43 HMMs with 50 threads:
    Finished extracting 43 of 43 (100.00%) HMMs.
  Aligning 43 marker genes with 50 threads:
    Finished aligning 43 of 43 (100.00%) marker genes.

  Reading marker alignment files.
  Concatenating alignments.
  Placing 5 bins into the genome tree with pplacer (be patient).

  { Current stage: 0:04:58.054 || Total: 0:04:58.054 }

*******************************************************************************
 [CheckM - lineage_set] Inferring lineage-specific marker sets.
*******************************************************************************

  Reading HMM info from file.
  Parsing HMM hits to marker genes:
    Finished parsing hits for 5 of 5 (100.00%) bins.

  Determining marker sets for each genome bin.
    Finished processing 5 of 5 (100.00%) bins (current: bin.5).

  Marker set written to: binsO.checkm/lineage.ms

  { Current stage: 0:00:01.428 || Total: 0:04:59.482 }

*******************************************************************************
 [CheckM - analyze] Identifying marker genes in bins.
*******************************************************************************

  Identifying marker genes in 5 bins with 50 threads:
    Finished processing 5 of 5 (100.00%) bins.
  Saving HMM info to file.

  { Current stage: 0:01:06.443 || Total: 0:06:05.926 }

  Parsing HMM hits to marker genes:
    Finished parsing hits for 5 of 5 (100.00%) bins.
  Aligning marker genes with multiple hits in a single bin:
    Finished processing 5 of 5 (100.00%) bins.

  { Current stage: 0:00:03.265 || Total: 0:06:09.191 }

  Calculating genome statistics for 5 bins with 50 threads:
    Finished processing 5 of 5 (100.00%) bins.

  { Current stage: 0:00:00.383 || Total: 0:06:09.575 }

*******************************************************************************
 [CheckM - qa] Tabulating genome statistics.
*******************************************************************************

  Calculating AAI between multi-copy marker genes.

  Reading HMM info from file.
  Parsing HMM hits to marker genes:
    Finished parsing hits for 5 of 5 (100.00%) bins.

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Bin Id            Marker lineage           # genomes   # markers   # marker sets   0     1    2    3   4   5+   Completeness   Contamination   Strain heterogeneity
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
  bin.2    f__Bifidobacteriaceae (UID1458)       77         464           220        0    461   3    0   0   0       100.00           0.53              33.33
  bin.3      f__Lachnospiraceae (UID1286)        57         420           207        1    416   3    0   0   0       99.52            1.21               0.00
  bin.4       g__Streptococcus (UID684)          26         668           229        23   631   14   0   0   0       95.93            1.90              64.29
  bin.1       o__Clostridiales (UID1226)        155         278           158        13   258   7    0   0   0       94.04            1.60              42.86
  bin.5      f__Lachnospiraceae (UID1256)        33         333           171        36   294   3    0   0   0       87.43            0.76               0.00
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------

  { Current stage: 0:00:02.829 || Total: 0:06:12.405 }

------------------------------------------------------------------------------------------------------------------------
-----            There are 5 'good' bins found in binsO.checkm! (>70% completion and <5% contamination)            -----
------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------------------
-----                                 Removing bins that are inadequate quality...                                 -----
------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------------------
-----           Re-evaluating bin quality after contig de-replication is complete! There are still 5 high          -----
-----                                                quality bins.                                                 -----
------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------------------
-----                making completion and contamination ranking plots for all refinement iterations               -----
------------------------------------------------------------------------------------------------------------------------

Loading completion info....
Plotting completion data...
Loading contamination info...
Plotting the contamination data...
Traceback (most recent call last):
  File "/home/suq/miniconda3/envs/metawrap/bin/metawrap-scripts/plot_binning_results.py", line 205, in <module>
    y_pos = data[bin_set][-1]
IndexError: list index out of range
mkdir: cannot create directory 'figures': File exists
mv: cannot stat 'binning_results.eps': No such file or directory
mv: cannot stat 'binning_results.png': No such file or directory

########################################################################################################################
#####                                          MOVING OVER TEMPORARY FILES                                         #####
########################################################################################################################

mkdir: cannot create directory 'work_files': File exists

------------------------------------------------------------------------------------------------------------------------
-----                      making completion and contamination ranking plots of final outputs                      -----
------------------------------------------------------------------------------------------------------------------------

Loading completion info....
Plotting completion data...
Loading contamination info...
Plotting the contamination data...
Traceback (most recent call last):
  File "/home/suq/miniconda3/envs/metawrap/bin/metawrap-scripts/plot_binning_results.py", line 205, in <module>
    y_pos = data[bin_set][-1]
IndexError: list index out of range
mv: cannot stat 'binning_results.eps': No such file or directory
mv: cannot stat 'binning_results.png': No such file or directory

------------------------------------------------------------------------------------------------------------------------
-----                       making contig membership files (for Anvio and other applications)                      -----
------------------------------------------------------------------------------------------------------------------------

summarizing concoct_bins ...
summarizing maxbin2_bins ...
summarizing metabat2_bins ...
summarizing metawrap_70_5_bins ...

########################################################################################################################
#####                                BIN_REFINEMENT PIPELINE FINISHED SUCCESSFULLY!                                #####
########################################################################################################################

real    6m22.159s
user    10m55.608s
sys     1m6.299s

suqian-github commented 3 years ago

I think this problem may be related to the previous step (binning), because my CONCOCT binning result contains only one file (unbinned.fa). I added two lines to plot_binning_results.py: if len(data[bin_set]) == 0: pass, but I still get a similar error.
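
One likely reason the added check changes nothing: in Python, pass is a no-op, so execution falls through to y_pos = data[bin_set][-1] and the empty list is still indexed. Skipping the empty bin set needs continue instead. The sketch below is a minimal illustration, not the actual code of plot_binning_results.py: only the expression data[bin_set][-1] comes from the traceback, while the function name last_points, the dictionary layout, and the sample values are assumptions made for the example.

```python
def last_points(data):
    """Return the last value of each bin set, skipping bin sets with no bins.

    `data` is assumed (hypothetically) to map a bin-set name to a list of
    per-bin scores, mirroring the `data[bin_set][-1]` access in the traceback.
    """
    labels = {}
    for bin_set in data:
        if len(data[bin_set]) == 0:
            # `pass` here would do nothing and the next line would still
            # raise IndexError; `continue` skips this bin set entirely.
            continue
        y_pos = data[bin_set][-1]
        labels[bin_set] = y_pos
    return labels


# Hypothetical input: one bin set with scores, one with no good bins.
data = {"metabat2_bins": [99.5, 95.9, 87.4], "concoct_bins": []}
print(last_points(data))
```

The log shows the traceback after "Plotting the contamination data...", so if the script has separate loops for completion and contamination, a guard like this would probably be needed in each loop that indexes data[bin_set].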