merenlab / anvio

An analysis and visualization platform for 'omics data
http://merenlab.org/software/anvio
GNU General Public License v3.0

Error running `anvi-reaction-network` #2142

Open IsabelFE opened 8 months ago

IsabelFE commented 8 months ago

Short description of the problem

KeyError: 'reaction_network_ko_annotations_hash' when running anvi-reaction-network

anvi'o version

Anvi'o .......................................: marie (v8)
Python .......................................: 3.10.12

Profile database .............................: 38
Contigs database .............................: 21
Pan database .................................: 16
Genome data storage ..........................: 7
Auxiliary data storage .......................: 2
Structure database ...........................: 2
Metabolic modules database ...................: 4
tRNA-seq database ............................: 2

System info

macOS Sonoma 14.0 (M1 chip). Anvi'o was installed following the installation instructions.

Detailed description of the issue

I have a contigs.db file and when I try to run anvi-reaction-network I get this error:

anvi-reaction-network -c Cke_FDAARGOS_1055.db
* A reaction network will be made from protein orthology annotations in the
  contigs database.
Contigs database .............................: Cke_FDAARGOS_1055.db
Traceback (most recent call last):
  File "/Users/isabelfe/miniconda3/envs/anvio-8/bin/anvi-reaction-network", line 56, in <module>
    main()
  File "/Users/isabelfe/miniconda3/envs/anvio-8/bin/anvi-reaction-network", line 27, in main
    constructor.make_network(contigs_db=args.contigs_db, overwrite_existing_network=args.overwrite_existing_network)
  File "/Users/isabelfe/miniconda3/envs/anvio-8/lib/python3.10/site-packages/anvio/biochemistry/reactionnetwork.py", line 1943, in make_network
    network = self.make_contigs_database_network(contigs_db, store=store, overwrite_existing_network=overwrite_existing_network)
  File "/Users/isabelfe/miniconda3/envs/anvio-8/lib/python3.10/site-packages/anvio/biochemistry/reactionnetwork.py", line 1990, in make_contigs_database_network
    if store and contigs_super.a_meta['reaction_network_ko_annotations_hash'] and not overwrite_existing_network:
KeyError: 'reaction_network_ko_annotations_hash'

anvi-setup-kegg-data and anvi-setup-modelseed-database had already been run.

ivagljiva commented 8 months ago

Hey @semiller10, I think this error is occurring because the reaction_network_ko_annotations_hash value has not previously been set in the self table of the contigs database. Perhaps it could be fixed by adding an if statement that checks whether the value exists before trying to access it?
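
A minimal sketch of the kind of guard suggested here, assuming `a_meta` behaves like a plain Python dict (the traceback shows it raising `KeyError`, but its exact type is not confirmed in this thread). The function name `has_stored_network` is hypothetical and not part of anvi'o; it only mirrors the condition on the failing line.

```python
def has_stored_network(a_meta, overwrite_existing_network=False):
    """Return True if a network hash is recorded and must not be overwritten.

    `a_meta` is assumed to be a dict-like mapping of self-table keys to values.
    """
    # dict.get returns None instead of raising KeyError when the key is absent,
    # so a database that never had the key set is treated as "no network stored".
    existing_hash = a_meta.get('reaction_network_ko_annotations_hash')
    return bool(existing_hash) and not overwrite_existing_network
```

With this shape, a missing key and a key set to None are handled the same way, which is the behavior the migration script is meant to guarantee.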

meren commented 8 months ago

We used to solve similar situations with migration scripts that add new keys with None values. Perhaps that would be a better way to do it to ensure it is a constant part of every contigs-db?

semiller10 commented 8 months ago

I set a value of None for that self variable in the migration script, and her contigs database has apparently been updated to v21, so I'm a little confused.

meren commented 8 months ago

Oh! Sorry for missing that! I wonder if there was a manual update of versions in the SQLite database :p

meren commented 8 months ago

For posterity, I see it here that it is indeed required to be a part of any contigs-db that is upgraded to v21:

https://github.com/merenlab/anvio/blob/master/anvio/migrations/contigs/v20_to_v21.py

But is the variable also set for new contigs-db files that are not upgraded?

The answer to that seems to be a No :)


semiller10 commented 8 months ago

Exactly, I have since fixed that in a branch which I will now merge into master. I'll have her rerun with anvi'o dev and test.

semiller10 commented 8 months ago

@IsabelFE Sorry for the bug. Would you mind installing the anvi'o development branch, regenerating your contigs database, and rerunning anvi-reaction-network with that version of the codebase? Here are the installation instructions: https://anvio.org/install/#development-version

If everything works, you can close out this issue. Thank you!

meren commented 8 months ago

Thank you, @semiller10. And thank you very much for the detailed report and helping us identify these problems, @IsabelFE.

semiller10 commented 8 months ago

@IsabelFE I just fixed one other related thing in the development branch, so make sure the codebase is fully updated to this point before running anvi-reaction-network.

IsabelFE commented 8 months ago

I do have 117 contig databases fully annotated with all databases. Do I need to regenerate them? Or only redo the KEGG annotation?

semiller10 commented 8 months ago

@IsabelFE In that case, I'll spare you the pain and waste of regenerating all of them. I'll post a little script here in a minute that you can run on each database to update it. It's a very simple fix involving three database "metavariables" that were not initialized with empty values when the databases were created. Then you can run anvi-reaction-network using the dev branch and report back.
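
This is not the script promised above, only a rough sketch of what such a per-database update could look like. It assumes the contigs database is an SQLite file whose `self` table has `key` and `value` columns, and that a NULL value is an acceptable "empty" entry. The helper name `add_missing_self_keys` is hypothetical. Only one of the three metavariables is named in this thread, so the full list of keys has to be supplied by the caller.

```python
import sqlite3


def add_missing_self_keys(db_path, keys):
    """Insert each key with a NULL value into the `self` table if it is absent.

    Returns the list of keys that were actually added. Assumes a table named
    `self` with columns `key` and `value` (an assumption, not confirmed here).
    """
    added = []
    conn = sqlite3.connect(db_path)
    try:
        for key in keys:
            # Leave existing entries untouched, whatever their value is.
            row = conn.execute('SELECT 1 FROM "self" WHERE "key" = ?', (key,)).fetchone()
            if row is None:
                conn.execute('INSERT INTO "self" ("key", "value") VALUES (?, ?)', (key, None))
                added.append(key)
        conn.commit()
    finally:
        conn.close()
    return added
```

A call such as `add_missing_self_keys('Cke_FDAARGOS_1055.db', ['reaction_network_ko_annotations_hash'])` would add the one key named in the traceback. Working on a copy of the database first is a sensible precaution.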

meren commented 8 months ago

@semiller10, a better solution may be to increase the version number of the contigs-db once again, and have a matching migration script so anyone who may have suffered from this would be forced to have the right version. My 2 cents.

semiller10 commented 8 months ago

@meren Yes, I think that would be appropriate, so that other people working in v8 can be pointed to the dev branch rather than an ad hoc script.

Mild-High commented 8 months ago

Hey @semiller10, just a heads up that anvi-setup-modelseed-database tries to set up the database in .../github/anvio/anvio/data/MISC, which doesn't exist, instead of .../github/anvio/anvio/data/misc with the other databases. Cheers

ivagljiva commented 8 months ago

@Mild-High the issue with the directory name was just fixed by Meren via commit bdfb50e and will be good to go in anvi'o dev :)

semiller10 commented 8 months ago

@IsabelFE @Mild-High Very sorry for the delay in returning to this issue. Please find fixes in the anvi'o development branch to both of your problems: see pull request https://github.com/merenlab/anvio/pull/2151

IsabelFE commented 8 months ago

Hi @semiller10 I installed the development branch, ran anvi-migrate on my databases and then anvi-reaction-network and finally anvi-get-metabolic-model-file. All good. Now, I just need to figure out what to do with the output ;)

semiller10 commented 8 months ago

@IsabelFE The two tables of reactions and metabolites hypothesized from the genome that are stored in the database may be more convenient to work with than a model file for your purposes. If you want to do this yourself now, they are called gene_function_reactions and gene_function_metabolites in the contigs database. I realize I should also add a flag to simply export these tables in anvi-get-metabolic-model-file, which I'll do soon.

IsabelFE commented 8 months ago

I was wondering if there was a way to export those tables. They are mentioned in the documentation, but I didn't know how to access them.

IsabelFE commented 8 months ago

I've just realized that anvi'o databases can be accessed via SQL, so I got hold of those tables. It would be useful to have a short description of what is encoded in each table and its columns. Thanks!
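
For anyone else who wants to pull the tables out the same way, here is a small sketch using only Python's standard library. It assumes the contigs database is a regular SQLite file containing the `gene_function_reactions` and `gene_function_metabolites` tables mentioned above. The helper names `table_columns` and `export_table` are hypothetical, and the column names are read from the database rather than assumed.

```python
import csv
import sqlite3


def table_columns(db_path, table):
    """Return (column name, declared type) pairs for a table."""
    conn = sqlite3.connect(db_path)
    try:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        # Table names cannot be bound as parameters, so pass trusted names only.
        return [(row[1], row[2]) for row in conn.execute(f'PRAGMA table_info("{table}")')]
    finally:
        conn.close()


def export_table(db_path, table, tsv_path):
    """Write every row of a table to a tab-separated file; return the row count."""
    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.execute(f'SELECT * FROM "{table}"')
        header = [description[0] for description in cursor.description]
        with open(tsv_path, 'w', newline='') as handle:
            writer = csv.writer(handle, delimiter='\t', lineterminator='\n')
            writer.writerow(header)
            count = 0
            for row in cursor:
                writer.writerow(row)
                count += 1
        return count
    finally:
        conn.close()
```

For example, `table_columns('Cke_FDAARGOS_1055.db', 'gene_function_reactions')` lists the columns, and `export_table('Cke_FDAARGOS_1055.db', 'gene_function_reactions', 'reactions.tsv')` writes the rows to a file that opens in a spreadsheet or pandas.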

IsabelFE commented 8 months ago

@semiller10 I was able to get the tables from the anvi'o contigs databases. Do you have any recommendations on how to get started with visualizing the models? Maybe some pointers on how to import the files into Escher, or any other recommended next steps?

Mild-High commented 8 months ago

> @IsabelFE @Mild-High Very sorry for the delay in returning to this issue. Please find fixes in the anvi'o development branch to both of your problems: see pull request #2151

Thank you! I'm currently reaction networking away :)