CAMI-challenge / AMBER

AMBER: Assessment of Metagenome BinnERs
https://cami-challenge.github.io/AMBER/
GNU General Public License v3.0

Error in creating html page #41

Closed by rmenegaux 3 years ago

rmenegaux commented 4 years ago

Hello,

I am encountering an error (pasted below) while running AMBER for taxonomic binning on the CAMI medium-complexity dataset.

2020-09-08 19:10:25,561 INFO Loading NCBI files
2020-09-08 19:10:43,554 INFO Loading Gold standard
2020-09-08 19:10:43,599 INFO Loading predictions_10
2020-09-08 19:10:43,616 INFO Creating output directories
2020-09-08 19:10:43,696 INFO Evaluating Gold standard (sample gs, taxonomic binning)
2020-09-08 19:11:05,368 INFO Evaluating predictions_10 (sample gs, taxonomic binning)
/cbio/donnees/rmenegaux/miniconda3/envs/amber/lib/python3.7/site-packages/src/binning_classes.py:306: RuntimeWarning: invalid value encountered in double_scalars
  (utils_labels.F1_SCORE_BP, [2 * self.__precision_avg_bp * self.__recall_avg_bp / (self.__precision_avg_bp + self.__recall_avg_bp)]),
/cbio/donnees/rmenegaux/miniconda3/envs/amber/lib/python3.7/site-packages/src/binning_classes.py:313: RuntimeWarning: invalid value encountered in double_scalars
  (utils_labels.F1_SCORE_SEQ, [2 * self.__precision_avg_seq * self.__recall_avg_seq / (self.__precision_avg_seq + self.__recall_avg_seq)]),
/cbio/donnees/rmenegaux/miniconda3/envs/amber/lib/python3.7/site-packages/src/binning_classes.py:319: RuntimeWarning: invalid value encountered in double_scalars
  (utils_labels.F1_SCORE_PER_BP, [2 * self.__precision_weighted_bp * self.__recall_weighted_bp / (self.__precision_weighted_bp + self.__recall_weighted_bp)]),
/cbio/donnees/rmenegaux/miniconda3/envs/amber/lib/python3.7/site-packages/src/binning_classes.py:320: RuntimeWarning: invalid value encountered in double_scalars
  (utils_labels.F1_SCORE_PER_SEQ, [2 * self.__precision_weighted_seq * self.__recall_weighted_seq / (self.__precision_weighted_seq + self.__recall_weighted_seq)]),
2020-09-08 19:11:22,665 INFO Saving computed metrics
2020-09-08 19:11:22,872 INFO Creating taxonomic binning plots
/cbio/donnees/rmenegaux/miniconda3/envs/amber/lib/python3.7/site-packages/src/plots.py:343: UserWarning: FixedFormatter should only be used together with FixedLocator
  axs.set_xticklabels(['{:3.0f}'.format(x * 100) for x in vals], fontsize=11)
...
*(Warning above repeated many times)*

2020-09-08 19:11:46,422 INFO Creating HTML page
Traceback (most recent call last):
  File "/cbio/donnees/rmenegaux/miniconda3/envs/amber/bin/amber.py", line 302, in <module>
    main()
  File "/cbio/donnees/rmenegaux/miniconda3/envs/amber/bin/amber.py", line 297, in main
    args.desc)
  File "/cbio/donnees/rmenegaux/miniconda3/envs/amber/lib/python3.7/site-packages/src/amber_html.py", line 848, in create_html
    metrics_row_t = create_taxonomic_binning_html(df_summary, pd_bins[pd_bins['rank'] != 'NA'], labels, sample_ids_list, options)
  File "/cbio/donnees/rmenegaux/miniconda3/envs/amber/lib/python3.7/site-packages/src/amber_html.py", line 777, in create_taxonomic_binning_html
    rank_to_sample_to_html[rank].append(create_table_html(pd_mean_rank.T, is_taxonomic=True))
  File "/cbio/donnees/rmenegaux/miniconda3/envs/amber/lib/python3.7/site-packages/src/amber_html.py", line 450, in create_table_html
    html += df_metrics.style.apply(get_heatmap_colors, df_metrics=df_metrics, axis=1).set_precision(3).set_table_styles(this_style).render()
  File "/cbio/donnees/rmenegaux/miniconda3/envs/amber/lib/python3.7/site-packages/pandas/io/formats/style.py", line 540, in render
    self._compute()
...
  File "/cbio/donnees/rmenegaux/miniconda3/envs/amber/lib/python3.7/site-packages/pandas/core/frame.py", line 467, in __init__
    mgr = init_dict(data, index, columns, dtype=dtype)
  File "/cbio/donnees/rmenegaux/miniconda3/envs/amber/lib/python3.7/site-packages/pandas/core/internals/construction.py", line 283, in init_dict
    return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
  File "/cbio/donnees/rmenegaux/miniconda3/envs/amber/lib/python3.7/site-packages/pandas/core/internals/construction.py", line 78, in arrays_to_mgr
    index = extract_index(arrays)
  File "/cbio/donnees/rmenegaux/miniconda3/envs/amber/lib/python3.7/site-packages/pandas/core/internals/construction.py", line 397, in extract_index
    raise ValueError("arrays must all be same length")
ValueError: arrays must all be same length
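
For readers unfamiliar with the final exception: the last frames of the traceback are inside the pandas `DataFrame` constructor, which raises this `ValueError` when it is given a dict whose columns have different lengths. The snippet below is only a minimal sketch of that pandas behaviour, not a reproduction of the AMBER bug. The column names and the `try_build` helper are hypothetical, and the exact wording of the message differs between pandas versions.

```python
import pandas as pd


def try_build(columns):
    """Return the error message if the frame cannot be built, else None.

    `try_build` is a hypothetical helper written for this illustration.
    """
    try:
        pd.DataFrame(columns)
    except ValueError as exc:
        return str(exc)
    return None


# Hypothetical metric columns: three precision values but only two recall values.
message = try_build({"precision": [0.5, 0.7, 0.9], "recall": [0.4, 0.6]})
print(message)
```

In the traceback above the mismatched columns are built internally while pandas renders the styled table, so the inputs that trigger the error are not visible in the log.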

The command that produces this error is:

amber.py predictions_10.binning --gold_standard_file ground_truth_10.binning --ncbi_nodes_file nodes.dmp --ncbi_names_file names.dmp --ncbi_merged_file merged.dmp --filter 1 --output_dir output_filter_1

The NCBI taxonomy files (nodes.dmp, names.dmp, merged.dmp) were freshly downloaded from NCBI. The toy ground truth and prediction files are:

cat predictions_10.binning 
@Version:0.9.1
@SampleID:gs

@@SEQUENCEID    TAXID
RM2|S1|R0   222805
RM2|S1|R1   187303
RM2|S1|R2   1525
RM2|S1|R3   146919
RM2|S1|R4   1488
RM2|S1|R5   305

cat ground_truth_10.binning
@Version:0.9.1
@SampleID:gs

@@SEQUENCEID    BINID   TAXID   _READID _LENGTH
RM2|S1|R0   1030896 1123266 scaffold00002_27-953956 100
RM2|S1|R1   1220_BD 169973  scaffold9.1_8-4249  100
RM2|S1|R2   1036704 1123003 scaffold00002_48-138142 100
RM2|S1|R3   1285_CK 460257  scaffold2.1_10-583737   100
RM2|S1|R4   evo_1035921.028 745369  contig_5_4-113862   100
RM2|S1|R5   1139_Y  169973  scaffold15.1_21-8412    100

PS: This error does not occur systematically; AMBER ran successfully on some of my other prediction files.
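
A side note on the `RuntimeWarning` lines in the log: they point at F1 expressions of the form `2 * p * r / (p + r)`. NumPy typically emits "invalid value encountered" when a floating-point operation such as 0/0 has no defined result, which would happen here if precision and recall are both zero. The sketch below shows one way to guard such an expression. The function name `f1_score` and the `undefined` parameter are assumptions made for this illustration and are not taken from AMBER's code.

```python
import math


def f1_score(precision, recall, undefined=float("nan")):
    """Harmonic mean of precision and recall.

    Returns `undefined` when the score cannot be computed, i.e. when
    either input is NaN or when both inputs are zero (0/0).
    This is a hypothetical helper, not AMBER's implementation.
    """
    if math.isnan(precision) or math.isnan(recall):
        return undefined
    denominator = precision + recall
    if denominator == 0:
        return undefined
    return 2 * precision * recall / denominator


print(f1_score(0.5, 0.5))
print(f1_score(0.0, 0.0))
print(f1_score(0.0, 0.0, undefined=0.0))
```

Whether an undefined F1 should be reported as NaN or as 0 is a convention choice, which is why the sketch leaves it to the caller.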

fernandomeyer commented 3 years ago

Thank you very much for reporting this. It has been fixed in commit 928a9721bbbc9363405b5a69c33407de02c68be7.

Since you are using the filter option, please be aware that it is being modified to filter the smallest bins within each taxonomic rank separately. Previously, it filtered bins across all ranks at once.
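
To illustrate the difference between the two filtering modes described above, here is a rough pandas sketch. The exact rule AMBER applies is not stated in this thread. The rule used here (drop the smallest bins whose cumulative size stays within the given percentage of the total size) is an assumption for illustration only, and the function names, column names, and toy data are all hypothetical.

```python
import pandas as pd


def drop_smallest(bins, percent):
    """Drop the smallest bins whose cumulative size is within `percent` of the total.

    Assumed rule for illustration; not taken from AMBER's source.
    """
    ordered = bins.sort_values("size", kind="stable")
    threshold = ordered["size"].sum() * percent / 100
    keep = ordered["size"].cumsum() > threshold
    return ordered[keep]


def filter_across_ranks(bins, percent):
    """Apply the rule once to all bins, ignoring their taxonomic rank."""
    return drop_smallest(bins, percent).sort_index()


def filter_per_rank(bins, percent):
    """Apply the rule separately within each taxonomic rank."""
    parts = [drop_smallest(group, percent) for _, group in bins.groupby("rank")]
    return pd.concat(parts).sort_index()


# Hypothetical toy data: species bins are much smaller than genus bins.
bins = pd.DataFrame({
    "rank": ["species", "species", "genus", "genus"],
    "bin": ["s1", "s2", "g1", "g2"],
    "size": [5, 195, 4000, 5800],
})

print(list(filter_across_ranks(bins, 5)["bin"]))  # ['g1', 'g2']
print(list(filter_per_rank(bins, 5)["bin"]))      # ['s2', 'g1', 'g2']
```

In this toy data, filtering across all ranks removes every species-level bin because they are small relative to the genus-level bins, while filtering per rank removes only the smallest bin within each rank.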

fernandomeyer commented 3 years ago

Fix now available in v2.0.2.