metagenopolis / meteor

Meteor is a platform for quantitative metagenomics profiling of complex ecosystems. Meteor relies on gene catalogues to perform species-level taxonomic assignments and functional analysis.
GNU General Public License v3.0

misformatted KEGG annotation file for hs_2_9_skin #55

Closed (fplazaonate closed this issue 1 month ago)

fplazaonate commented 1 month ago

The KEGG annotation file for hs_2_9_skin is misformatted, causing meteor to stop prematurely.

Here is the current file format:

gene_id gene_name       KO      thrshld score   E-value KO definition
1       CM001149.1_2    K03655  390.5   763.4   1e-228  ATP-dependent DNA helicase RecG [EC:5.6.2.4]
2       CM001149.1_3    K01958  916.07  1695.6  0       pyruvate carboxylase [EC:6.4.1.1]
3       CM001149.1_4    K08316  95.27   192.4   2.8e-56 16S rRNA (guanine966-N2)-methyltransferase [EC:2.1.1.171]

but the expected one is:

gene_id gene_name       annotation
1       CM001149.1_2    K03655
2       CM001149.1_3    K01958
3       CM001149.1_4    K08316
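
As a possible workaround until a corrected catalogue is available, the file could be reduced to the expected three columns with pandas. This is only a sketch, not part of Meteor: the helper name `to_expected_format` is hypothetical, and it assumes the file is tab-separated with the header shown above.

```python
import io

import pandas as pd


def to_expected_format(src, dst):
    """Keep gene_id, gene_name and KO, renaming KO to 'annotation'.

    Hypothetical helper. `src` and `dst` may be paths or file-like objects.
    Assumes a tab-separated file with the header shown in this issue.
    """
    # Read everything as text so identifiers are written back unchanged.
    df = pd.read_csv(src, sep="\t", dtype=str)
    out = df[["gene_id", "gene_name", "KO"]].rename(columns={"KO": "annotation"})
    out.to_csv(dst, sep="\t", index=False)
    return out


# Usage on an in-memory copy of the first row from this issue.
sample = io.StringIO(
    "gene_id\tgene_name\tKO\tthrshld\tscore\tE-value\tKO definition\n"
    "1\tCM001149.1_2\tK03655\t390.5\t763.4\t1e-228\t"
    "ATP-dependent DNA helicase RecG [EC:5.6.2.4]\n"
)
converted = io.StringIO()
result = to_expected_format(sample, converted)
print(converted.getvalue())
```

With real files, the two arguments would be the path of the current annotation file and the path to write the converted copy to.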

Here is the Python traceback from the crash:

[21382t5@138.102.172.187;10:16:40] Traceback (most recent call last):
[21382t5@138.102.172.187;10:16:40] File "/opt/Miniconda3/envs/global-meteor-2.0.15/bin/meteor", line 10, in <module>
[21382t5@138.102.172.187;10:16:40] sys.exit(main())
[21382t5@138.102.172.187;10:16:40] ^^^^^^
[21382t5@138.102.172.187;10:16:40] File "/opt/Miniconda3/envs/global-meteor-2.0.15/lib/python3.12/site-packages/meteor/meteor.py", line 807, in main
[21382t5@138.102.172.187;10:16:40] profiler.execute()
[21382t5@138.102.172.187;10:16:40] File "/opt/Miniconda3/envs/global-meteor-2.0.15/lib/python3.12/site-packages/meteor/profiler.py", line 659, in execute
[21382t5@138.102.172.187;10:16:40] self.compute_ko_abundance(annot_file=db_filename)
[21382t5@138.102.172.187;10:16:40] File "/opt/Miniconda3/envs/global-meteor-2.0.15/lib/python3.12/site-packages/meteor/profiler.py", line 369, in compute_ko_abundance
[21382t5@138.102.172.187;10:16:40] aggregated_count = merged_df.groupby("annotation")["value"].sum().reset_index()
[21382t5@138.102.172.187;10:16:40] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[21382t5@138.102.172.187;10:16:40] File "/opt/Miniconda3/envs/global-meteor-2.0.15/lib/python3.12/site-packages/pandas/core/frame.py", line 9183, in groupby
[21382t5@138.102.172.187;10:16:40] return DataFrameGroupBy(
[21382t5@138.102.172.187;10:16:40] ^^^^^^^^^^^^^^^^^
[21382t5@138.102.172.187;10:16:40] File "/opt/Miniconda3/envs/global-meteor-2.0.15/lib/python3.12/site-packages/pandas/core/groupby/groupby.py", line 1329, in __init__
[21382t5@138.102.172.187;10:16:40] grouper, exclusions, obj = get_grouper(
[21382t5@138.102.172.187;10:16:40] ^^^^^^^^^^^^
[21382t5@138.102.172.187;10:16:40] File "/opt/Miniconda3/envs/global-meteor-2.0.15/lib/python3.12/site-packages/pandas/core/groupby/grouper.py", line 1043, in get_grouper
[21382t5@138.102.172.187;10:16:40] raise KeyError(gpr)
[21382t5@138.102.172.187;10:16:40] KeyError: 'annotation'
[21382t5@138.102.172.187;10:16:41] Traceback (most recent call last):
[21382t5@138.102.172.187;10:16:41] File "/tmp/jsr223-cpython-6765739577418906912.cpy", line 94, in <module>
[21382t5@138.102.172.187;10:16:41] raise RuntimeError("Error when running meteor profile.")
[21382t5@138.102.172.187;10:16:41] RuntimeError: Error when running meteor profile.
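
The `KeyError: 'annotation'` above is what pandas raises when `groupby` is given a column name that is not in the frame. The sketch below reproduces it on a toy frame and shows a header check that would report the problem earlier. It is an illustration, not Meteor's actual code: `EXPECTED_COLUMNS`, `missing_annotation_columns` and the toy data are assumptions.

```python
import pandas as pd

# Columns of the expected annotation format described in this issue.
EXPECTED_COLUMNS = ("gene_id", "gene_name", "annotation")


def missing_annotation_columns(df):
    """Return the expected columns that are absent from df (hypothetical helper)."""
    return [col for col in EXPECTED_COLUMNS if col not in df.columns]


# Toy frame using the misformatted header: 'KO' instead of 'annotation'.
bad = pd.DataFrame(
    {
        "gene_id": [1, 2],
        "gene_name": ["CM001149.1_2", "CM001149.1_3"],
        "KO": ["K03655", "K01958"],
        "value": [3.0, 5.0],  # made-up counts, for illustration only
    }
)

# Same kind of call as in the traceback: grouping by an absent column fails.
try:
    bad.groupby("annotation")["value"].sum().reset_index()
except KeyError as err:
    print("groupby failed with KeyError:", err)

# Checking the header first gives a clearer diagnosis.
print("missing columns:", missing_annotation_columns(bad))
```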
aghozlane commented 1 month ago

This is solved in the new catalogues.