Open philipwoods opened 3 months ago
@philipwoods, thanks for posting this bug!
Weirdly, I was not able to reproduce it on my end, but I went ahead and refactored self.convert_to_df() to use pandas.concat(), which should fix this issue. If possible, could you post a tar-gzipped directory of the pangenome and the command you used? I want to reproduce it before I commit.
Here is the branch tracking this issue: https://github.com/merenlab/anvio/compare/master...deprecate-pandas-append-synteny
Sorry for the delay! Here is the file and the command I used (I forget whether --annotation-source is necessary when using gene clusters as the ngram source, but if it is you can use --annotation-source COG20_FUNCTION).
pangenome.tar.gz
anvi-analyze-synteny --analyze-unknown-functions -n gene_clusters --ngram-window-range 3:15 -g ANME3EVO-revision-GENOMES.db -p pangenome/ANME3EVO-revision-PAN.db
I ran your command in @mschecht's branch and got this error:
Functions found ..............................: EGGNOG_BEST_TAX, Pfam, COG20_CATEGORY, EGGNOG_BACT, COG20_FUNCTION, EGGNOG_PFAMs, EGGNOG_COG_CATEGORY, EGGNOG_BRITE, KEGG_BRITE, EGGNOG_KEGG_KO, KOfam, EGGNOG_GENE_FUNCTION_NAME,
EGGNOG_KEGG_REACTION, EGGNOG_BiGG_REACTIONS, EGGNOG_KEGG_MODULE, EGGNOG_KEGG_PATHWAYS, EGGNOG_KEGG_TC, KEGG_Class, EGGNOG_EC_NUMBER, KEGG_Module, COG20_PATHWAY, EGGNOG_KEGG_RCLASS,
EGGNOG_CAZy, EGGNOG_GO_TERMS
Genomes storage ..............................: Initialized (storage hash: hash45b805d1)
Num genomes in storage .......................: 67
Num genomes will be used .....................: 67
WARNING
===============================================
Anvi'o is now looking for Ngrams in your contigs!
* What do we say to loci that appear to have no coherent synteny patterns...? Not
today! ⚔️
Traceback (most recent call last):
  File "/Users/meren/github/anvio/bin/anvi-analyze-synteny", line 74, in <module>
    ngram.report_ngrams_to_user()
  File "/Users/meren/github/anvio/anvio/synteny.py", line 420, in report_ngrams_to_user
    df = self.convert_to_df()
  File "/Users/meren/github/anvio/anvio/synteny.py", line 408, in convert_to_df
    ngram_count_df_final = pd.concat(ngram_count_df_list, ignore_index=True)
  File "/Users/meren/miniconda3/envs/anvio-dev/lib/python3.10/site-packages/pandas/util/_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "/Users/meren/miniconda3/envs/anvio-dev/lib/python3.10/site-packages/pandas/core/reshape/concat.py", line 347, in concat
    op = _Concatenator(
  File "/Users/meren/miniconda3/envs/anvio-dev/lib/python3.10/site-packages/pandas/core/reshape/concat.py", line 404, in __init__
    raise ValueError("No objects to concatenate")
ValueError: No objects to concatenate
So it is some improvement, but clearly there are more things to fix :)
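For reference, pd.concat() raises this ValueError whenever the list it receives is empty, so one possible guard is sketched below. The helper name concat_or_empty and the column names are hypothetical and not taken from synteny.py; whether an empty table or an explicit error is the right outcome for anvi'o here is an open question.

```python
import pandas as pd


def concat_or_empty(frames, columns):
    """Concatenate frames; return an empty frame with `columns` if none exist.

    Hypothetical helper for illustration only, not part of anvi'o.
    """
    frames = list(frames)
    if not frames:
        # pd.concat([]) would raise "ValueError: No objects to concatenate",
        # so return an empty table with the expected columns instead.
        return pd.DataFrame(columns=columns)
    return pd.concat(frames, ignore_index=True)


# Assumed column names, chosen only for this example.
no_hits = concat_or_empty([], columns=["ngram", "count"])
one_hit = concat_or_empty(
    [pd.DataFrame({"ngram": ["GC_1::GC_2"], "count": [3]})],
    columns=["ngram", "count"],
)
```

With a guard like this the caller can check the result's .empty attribute and report that no ngrams were found, instead of surfacing a pandas traceback.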
Short description of the problem
anvi-analyze-synteny fails because Pandas has deprecated DataFrame.append() as of version 1.4.0 in favor of pandas.concat().
anvi'o version
System info
Operating system is RedHat enterprise Linux. Anvi'o was installed in a conda environment.
Detailed description of the issue
I ran anvi-analyze-synteny on my pangenome and got the following error:
Looking into it, I found that requirements.txt forces pandas==1.4.4, while DataFrame.append() has been deprecated since pandas version 1.4.0. Therefore I expect that this will be an issue in every part of anvi'o that currently uses the pandas DataFrame.append() method.
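A minimal sketch of the migration, assuming the affected code builds a table by appending inside a loop. The per_genome data and the column names are made up for illustration and do not come from anvi'o.

```python
import pandas as pd

# Hypothetical per-genome tables standing in for real anvi'o output.
per_genome = [
    pd.DataFrame({"ngram": ["GC_1::GC_2"], "count": [3]}),
    pd.DataFrame({"ngram": ["GC_2::GC_3"], "count": [1]}),
]

# Old pattern, deprecated since pandas 1.4.0:
#     combined = pd.DataFrame()
#     for df in per_genome:
#         combined = combined.append(df, ignore_index=True)

# Replacement: collect the pieces in a list and concatenate once.
combined = pd.concat(per_genome, ignore_index=True)
```

Concatenating once at the end also avoids copying the growing table on every loop iteration, which is what repeated append() calls did.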