DininduSenanayake closed this issue 1 year ago
@mlhoggard @JSBoey
I have run the 2021 dataset with the module and databases:
module purge
module load DRAM/1.3.5-Miniconda3
DRAM.py annotate -i '/nesi/nobackup/nesi99999/Dinindu/ZDissues/DRAM/10.gene_annotation/predictions/*.filtered.fna' \
--checkm_quality ./DRAM_input_files/checkm.txt \
--gtdb_taxonomy ./DRAM_input_files/gtdbtk.bac120.classification_pplacer.tsv \
-o annotation_dram
10 fastas found
2022-09-27 21:08:07.425787: Annotation started
0:00:00.010126: Retrieved database locations and descriptions
0:00:00.010159: Annotating bin_2.filtered
0:00:11.791169: Turning genes from prodigal to mmseqs2 db
0:00:15.168229: Getting hits from kofam
0:20:00.963188: Getting forward best hits from peptidase
0:20:48.065156: Getting reverse best hits from peptidase
0:20:50.957036: Getting descriptions of hits from peptidase
0:20:52.987563: Getting hits from pfam
0:22:25.164596: Getting hits from dbCAN
0:22:39.870252: Merging ORF annotations
0:23:10.327467: Annotating bin_3.filtered
0:23:19.561439: Turning genes from prodigal to mmseqs2 db
0:23:22.726461: Getting hits from kofam
0:51:41.803827: Getting forward best hits from peptidase
0:54:20.144962: Getting reverse best hits from peptidase
0:54:22.091614: Getting descriptions of hits from peptidase
0:54:22.311586: Getting hits from pfam
0:54:52.395768: Getting hits from dbCAN
0:55:06.018509: Merging ORF annotations
0:55:19.145917: Annotating bin_6.filtered
0:56:15.289438: Turning genes from prodigal to mmseqs2 db
0:56:18.469674: Getting hits from kofam
1:16:01.543374: Getting forward best hits from peptidase
1:16:41.862112: Getting reverse best hits from peptidase
1:16:44.242180: Getting descriptions of hits from peptidase
1:16:44.343257: Getting hits from pfam
1:17:12.361333: Getting hits from dbCAN
1:17:23.749997: Merging ORF annotations
1:17:36.258055: Annotating bin_5.filtered
1:18:19.555170: Turning genes from prodigal to mmseqs2 db
1:18:23.042739: Getting hits from kofam
1:58:59.237550: Getting forward best hits from peptidase
2:00:24.073617: Getting reverse best hits from peptidase
2:00:26.639439: Getting descriptions of hits from peptidase
2:00:26.654719: Getting hits from pfam
2:01:17.012038: Getting hits from dbCAN
2:01:41.626553: Merging ORF annotations
2:02:11.866737: Annotating bin_4.filtered
2:02:44.305314: Turning genes from prodigal to mmseqs2 db
2:02:47.498150: Getting hits from kofam
2:20:09.827544: Getting forward best hits from peptidase
2:20:49.761983: Getting reverse best hits from peptidase
2:20:51.751197: Getting descriptions of hits from peptidase
2:20:51.758468: Getting hits from pfam
2:21:19.469995: Getting hits from dbCAN
2:21:31.340825: Merging ORF annotations
2:21:43.847936: Annotating bin_1.filtered
2:21:50.559360: Turning genes from prodigal to mmseqs2 db
2:21:53.693719: Getting hits from kofam
2:35:00.530102: Getting forward best hits from peptidase
2:35:25.014360: Getting reverse best hits from peptidase
2:35:26.851579: Getting descriptions of hits from peptidase
2:35:26.857354: Getting hits from pfam
2:35:50.375718: Getting hits from dbCAN
2:35:58.636778: Merging ORF annotations
2:36:07.613436: Annotating bin_8.filtered
2:36:40.557629: Turning genes from prodigal to mmseqs2 db
2:36:43.733567: Getting hits from kofam
2:52:17.062601: Getting forward best hits from peptidase
2:52:52.539920: Getting reverse best hits from peptidase
2:52:54.499947: Getting descriptions of hits from peptidase
2:52:54.516884: Getting hits from pfam
2:53:22.130619: Getting hits from dbCAN
2:53:33.435468: Merging ORF annotations
2:53:43.496087: Annotating bin_9.filtered
2:54:28.957477: Turning genes from prodigal to mmseqs2 db
2:54:32.242714: Getting hits from kofam
3:21:16.575826: Getting forward best hits from peptidase
3:22:09.926156: Getting reverse best hits from peptidase
3:22:12.038791: Getting descriptions of hits from peptidase
3:22:12.093994: Getting hits from pfam
3:22:46.216410: Getting hits from dbCAN
3:23:02.893923: Merging ORF annotations
3:23:20.891607: Annotating bin_7.filtered
3:23:57.393779: Turning genes from prodigal to mmseqs2 db
3:24:00.581757: Getting hits from kofam
3:42:16.390080: Getting forward best hits from peptidase
3:43:00.994574: Getting reverse best hits from peptidase
3:43:02.982096: Getting descriptions of hits from peptidase
3:43:03.018580: Getting hits from pfam
3:43:33.362493: Getting hits from dbCAN
3:43:47.126957: Merging ORF annotations
3:44:09.227178: Annotating bin_0.filtered
3:44:16.725190: Turning genes from prodigal to mmseqs2 db
3:44:19.923438: Getting hits from kofam
4:08:15.934225: Getting forward best hits from peptidase
4:08:58.152196: Getting reverse best hits from peptidase
4:09:00.064520: Getting descriptions of hits from peptidase
4:09:00.072982: Getting hits from pfam
4:09:32.307946: Getting hits from dbCAN
4:09:46.452562: Merging ORF annotations
4:10:00.655341: Annotations complete, processing annotations
However, it did trigger the error below. I have a feeling a threshold of some sort is not compatible with the latest DRAM. It should be an easy fix in an upstream step.
/opt/nesi/CS400_centos7_bdw/DRAM/1.3.5-Miniconda3/lib/python3.10/site-packages/mag_annotator/annotate_bins.py:603: UserWarning: No rRNAs were detected, no rrnas.tsv file will be created.
warnings.warn('No rRNAs were detected, no rrnas.tsv file will be created.')
Traceback (most recent call last):
File "/opt/nesi/CS400_centos7_bdw/DRAM/1.3.5-Miniconda3/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3621, in get_loc
return self._engine.get_loc(casted_key)
File "pandas/_libs/index.pyx", line 136, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 163, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'classification'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/nesi/CS400_centos7_bdw/DRAM/1.3.5-Miniconda3/bin/DRAM.py", line 189, in <module>
args.func(**args_dict)
File "/opt/nesi/CS400_centos7_bdw/DRAM/1.3.5-Miniconda3/lib/python3.10/site-packages/mag_annotator/annotate_bins.py", line 1040, in annotate_bins_cmd
annotate_bins(list(set(fasta_locs)), output_dir, min_contig_size, prodigal_mode, trans_table, bit_score_threshold,
File "/opt/nesi/CS400_centos7_bdw/DRAM/1.3.5-Miniconda3/lib/python3.10/site-packages/mag_annotator/annotate_bins.py", line 1092, in annotate_bins
taxonomy.append(gtdb_taxonomy.loc[i, 'classification'])
File "/opt/nesi/CS400_centos7_bdw/DRAM/1.3.5-Miniconda3/lib/python3.10/site-packages/pandas/core/indexing.py", line 960, in __getitem__
return self.obj._get_value(*key, takeable=self._takeable)
File "/opt/nesi/CS400_centos7_bdw/DRAM/1.3.5-Miniconda3/lib/python3.10/site-packages/pandas/core/frame.py", line 3615, in _get_value
series = self._get_item_cache(col)
File "/opt/nesi/CS400_centos7_bdw/DRAM/1.3.5-Miniconda3/lib/python3.10/site-packages/pandas/core/frame.py", line 3931, in _get_item_cache
loc = self.columns.get_loc(item)
File "/opt/nesi/CS400_centos7_bdw/DRAM/1.3.5-Miniconda3/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3623, in get_loc
raise KeyError(key) from err
KeyError: 'classification'
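The failing call in the traceback is gtdb_taxonomy.loc[i, 'classification'], so the table read from --gtdb_taxonomy apparently has no column named classification. One possible cause (an assumption, not confirmed in this thread) is that gtdbtk.bac120.classification_pplacer.tsv has no header row containing that name. A minimal sketch for checking a file's header before starting a multi-hour run; the function name and the sample rows are hypothetical and only illustrate the check:

```python
import csv
import io

# Column name taken from the traceback: gtdb_taxonomy.loc[i, 'classification']
REQUIRED_COLUMNS = ("classification",)


def missing_gtdb_columns(handle, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the first row of a TSV."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, [])  # empty file -> empty header
    return [name for name in required if name not in header]


# Made-up sample rows (not real GTDB-Tk output), used only to show both outcomes.
with_header = "user_genome\tclassification\nbin_0.filtered\td__Bacteria;p__Example\n"
without_header = "bin_0.filtered\td__Bacteria;p__Example\n"

print(missing_gtdb_columns(io.StringIO(with_header)))     # []
print(missing_gtdb_columns(io.StringIO(without_header)))  # ['classification']
```

To check a real file, open it with open(path, newline="") and pass the handle to the function; a non-empty result would suggest the file is not in the layout this DRAM code path expects.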
Thanks @DininduSenanayake !
I'll give it a test run with another data set as well, but I don't think that rRNA error is anything to worry about. rRNA often doesn't assemble that well from short reads anyway, so it's not uncommon for DRAM not to detect any. As long as the rest of the annotation process looks like it worked as normal, then I suspect that's all that warning/KeyError is about.
DRAM/1.3.5-Miniconda3 is definitely working (confirmed by the Otago microbio group as well). Therefore, I will mark this as solved for the moment. We can re-open it if any related issues come up.
Upgrade DRAM: See whether we can have it available as a module and store the database on /opt/nesi/db.