Closed KDeaton closed 2 years ago
Hi @KDeaton,
For 1. I think the issue comes from the reading of the taxon file. The KeyError
indicates that the organism name (found by metage2metabo) is not found in the taxon_named_species
dictionary created from the taxon file. Can you share some lines of the taxon file to see if there is a possible problem in this file (especially in the order of the columns)?
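A quick way to check for this kind of mismatch is sketched below. It is not part of metage2metabo; it assumes a tab-separated taxon file with `species` and `taxon_id` columns (as in the tutorial file), and the helper names `load_taxon_ids` and `find_unmatched` are hypothetical. It lists organisms that have no exact key in the file and, where one exists, the key that would match after trimming whitespace.

```python
import csv
import io


def load_taxon_ids(handle):
    """Map each 'species' cell to its 'taxon_id', exactly as written in the file."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["species"]: row["taxon_id"] for row in reader}


def find_unmatched(organisms, taxon_ids):
    """Return organisms missing from the mapping, paired with a
    whitespace-padded near match from the file (or None if there is none)."""
    stripped = {key.strip(): key for key in taxon_ids}
    report = {}
    for organism in organisms:
        if organism not in taxon_ids:
            report[organism] = stripped.get(organism)
    return report


# Hypothetical two-row file; the second ID carries a leading space.
sample = io.StringIO(
    "species\ttaxon_id\nGCA_001630725\t303\n GCA_900009125\t85698\n"
)
taxon_ids = load_taxon_ids(sample)
unmatched = find_unmatched(["GCA_001630725", "GCA_900009125"], taxon_ids)
for organism, near in unmatched.items():
    # repr() makes hidden leading or trailing whitespace visible
    print(f"{organism!r} not found; closest key in file: {near!r}")
```

Printing with `repr()` is the useful part here, since stray whitespace is otherwise hard to spot in a terminal or spreadsheet.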
For 2. can you share the command used (with the singularity call)? The log you shared shows that the SVG creation has been bypassed as it was not given as input which is really strange.
Hi Arnaud, thanks for the prompt response as always!
For 1. I used the same column headers as the taxon_id.tsv file in the tutorial. The taxonomy looks like it was read correctly in the taxonomy_species.tsv created during the m2m_analysis enumeration.
a. The taxon_id.tsv (converted to csv for upload): taxon_id.csv
b. The taxonomy_species.tsv from m2m_analysis: taxonomy_species.csv
For 2. I did fix one thing, which was to point to my original Oog.jar file in the Setup directory with the singularity. After this fix, an svg directory was created, but the process stopped after enumeration. Here is my updated sbatch script:
singularity exec -B /hpc/group/deshusseslab:/hpc/group/deshusseslab/ Setup/metage2metabo-metacom_singularity_latest.sif m2m_analysis workflow -n genomes/taxcurated/ -s seeds/seeds_tol_sty3.sbml -t targets/final2 -o analysis/19analysis --taxon analysis/taxon_id.tsv --oog Setup/Oog.jar --level genus
The error message was the same:
Traceback (most recent call last):
  File "/usr/local/bin/m2m_analysis", line 8, in <module>
out file: 19analysisout.txt
Hi @KDeaton,
After testing the taxon_id file, I have found your issue: there was a space at the beginning of each organism ID in each row, such as:
| species | taxon_id |
|---|---|
| ' GCA_001630725' | 303 |
| ' GCA_900009125' | 85698 |
So the ID of an organism was ' GCA_000494915' and not 'GCA_000494915'. When M2M tried to look up the organisms from the enumeration results in the dictionary built from the taxonomy file, the lookup failed because the keys did not match exactly.
I have removed the space from the file, can you try it? taxon_id_no_space.csv
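For reference, the clean-up can also be scripted. This is a minimal sketch, assuming a tab-separated file; `strip_cells` is a hypothetical helper, not an M2M function. It trims whitespace around every cell and writes the result to a new table.

```python
import csv
import io


def strip_cells(src, dst, delimiter="\t"):
    """Copy a delimited table from src to dst, trimming whitespace around every cell."""
    writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")
    for row in csv.reader(src, delimiter=delimiter):
        writer.writerow([cell.strip() for cell in row])


# In-memory demonstration with two rows from the table above.
# With real files, open both with newline="" as the csv module docs recommend.
raw = io.StringIO(
    "species\ttaxon_id\n GCA_001630725\t303\n GCA_900009125\t85698\n"
)
cleaned = io.StringIO()
strip_cells(raw, cleaned)
print(cleaned.getvalue())
```

Writing to a separate output file rather than overwriting the input keeps the original available for comparison.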
For error 2, what path did you give for the Oog.jar before? It could indeed be a conflict with the path known to the Singularity image, which would then have been unable to find the Oog.jar file.
That should be it! Thank you. I thought I had checked for that before, but I missed it. The initial rerun looks good, but now I need to add java to Path and rerun again, so I don't have final confirmation that was the issue, but seems likely.
Everything is working for me now! The taxon_id issue was resolved by eliminating the extra space. I resolved the java path issue by omitting "--oog filepathname/Oog.jar" in the m2m_analysis workflow command, since that path is now specified in the singularity recipe.
Hi! I am running m2m with a singularity on a cluster. I get the following error when running m2m_analysis graph (below). I get the same error with other genomes too. Have you encountered this error before?
Edit: Actually I have an issue when running without using the taxon function as well. Enumeration runs fine. The html was created fine (as I've shared with you before, I was able to simplify it quite a bit by separating out individual pathways). But the svg was not created.
I think these are likely two separate issues:
Error log for running analysis with --taxon:
Traceback (most recent call last):
  File "/usr/local/bin/m2m_analysis", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/metage2metabo/__main_analysis__.py", line 288, in main
    main_analysis_workflow(network_dir, args.targets, args.seeds, args.out, args.taxon,
  File "/usr/local/lib/python3.8/dist-packages/metage2metabo/__main_analysis__.py", line 303, in main_analysis_workflow
    run_analysis_workflow(*allargs)
  File "/usr/local/lib/python3.8/dist-packages/metage2metabo/m2m_analysis/m2m_analysis_workflow.py", line 46, in run_analysis_workflow
    gml_output = graph_analysis(json_file_folder, target_folder_file, output_dir, taxon_file, taxonomy_level)
  File "/usr/local/lib/python3.8/dist-packages/metage2metabo/m2m_analysis/solution_graph.py", line 58, in graph_analysis
    create_gml(json_paths, target_paths, output_dir, taxonomy_output_file)
  File "/usr/local/lib/python3.8/dist-packages/metage2metabo/m2m_analysis/solution_graph.py", line 124, in create_gml
    key_species_data[target_category]['essential_symbionts'][taxon] = [organism for organism in key_species_types if key_species_types[organism] == 'ES' and taxon_named_species[organism].split('__')[0] == taxon]
  File "/usr/local/lib/python3.8/dist-packages/metage2metabo/m2m_analysis/solution_graph.py", line 124, in <listcomp>
    key_species_data[target_category]['essential_symbionts'][taxon] = [organism for organism in key_species_types if key_species_types[organism] == 'ES' and taxon_named_species[organism].split('__')[0] == taxon]
KeyError: 'GCA_900475215'
Here is the end of the analysis workflow log for the graph and powergraph sections:
######### Graph of targets_tolp1_sty #########
Number of nodes: 17
Number of edges: 70
--- Graph runtime 0.03 seconds ---
######### Graph compression: targets_tolp1_sty #########
Number of powernodes: 2
Number of poweredges: 1
Compression runtime 0.94 seconds ---
######### PowerGraph visualization: targets_tolp1_sty #########
######### Creation of the powergraph website accessible at analysis/09analysis/html/targets_tolp1_sty #########
--- Powergraph runtime 1.05 seconds ---