WrightonLabCSU / DRAM

Distilled and Refined Annotation of Metabolism: A tool for the annotation and curation of function for microbial and viral genomes
GNU General Public License v3.0

annotate genes using faa fasta files #329

Open avera1988 opened 8 months ago

avera1988 commented 8 months ago

Hi,

I would like to use DRAM to annotate proteins predicted from various genomes. I know the DRAM pipeline has an annotate_genes command for this. However, when I tried to use it as follows:

DRAM.py annotate_genes \
--use_uniref \
-i 52SAGSDRAM/Dasytricha.ruminantium.SAG2.pep.faa \
--threads 64 \
-o DRAM.SAGs.out

I got the following error:

Traceback (most recent call last):
  File "/glittertind/shared_conda_environments/DRAM/bin/DRAM.py", line 211, in <module>
    args.func(**args_dict)
TypeError: annotate_called_genes_cmd() got an unexpected keyword argument 'config_loc'

Looking in the help:

 --config_loc CONFIG_LOC    location of an alternive config file that will over write the original at run time, but not be saved or modified (default: None)

But what is this config_loc file, and how can I generate it?

I hope you can help us.

Best!

Arturo.
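
A minimal sketch of how a traceback like the one above can arise, assuming the entry script collects every parsed command-line option into a dict and passes it to the sub-command function as keyword arguments (which is what `args.func(**args_dict)` suggests). The parser and the two functions below are hypothetical stand-ins, not DRAM code. If that assumption holds, the error does not mean a config file is missing; it means the function's signature does not list `config_loc`.

```python
import argparse


def annotate_old(input_faa, output_dir="."):
    """Hypothetical stand-in: a command function without a config_loc parameter."""
    return f"annotating {input_faa} into {output_dir}"


def annotate_fixed(input_faa, output_dir=".", config_loc=None):
    """Hypothetical stand-in: the same function once config_loc is accepted."""
    return f"annotating {input_faa} into {output_dir} (config: {config_loc})"


def build_parser(func):
    """Build a toy CLI with a global --config_loc option and one sub-command."""
    parser = argparse.ArgumentParser(prog="toy_dram")
    parser.add_argument("--config_loc", default=None)
    subparsers = parser.add_subparsers(dest="command", required=True)
    sub = subparsers.add_parser("annotate_genes")
    sub.add_argument("-i", "--input_faa", required=True)
    sub.add_argument("-o", "--output_dir", default=".")
    sub.set_defaults(func=func)
    return parser


def dispatch(parser, argv):
    """Parse argv and forward every option to the sub-command as a keyword."""
    args = parser.parse_args(argv)
    args_dict = {
        key: value
        for key, value in vars(args).items()
        if key not in ("func", "command")
    }
    # config_loc is always present here (default None), even if the user
    # never typed --config_loc, so the target function must accept it.
    return args.func(**args_dict)


if __name__ == "__main__":
    argv = ["annotate_genes", "-i", "proteins.faa", "-o", "out"]
    try:
        dispatch(build_parser(annotate_old), argv)
    except TypeError as err:
        print("old signature fails:", err)
    print(dispatch(build_parser(annotate_fixed), argv))
```

In this toy setup the failure happens even though `--config_loc` is never given on the command line, because the option's default value is still forwarded.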

emily-ap commented 1 month ago

Hi @avera1988, were you able to solve this? I'm running into a similar issue now.

avera1988 commented 1 month ago

Hi @emily-ap, I downgraded to DRAM v1.4.0, and it seems to work there. This is the yml file I used:

channels:
  - conda-forge
  - bioconda
dependencies:
  - python=3.10
  - pandas=1.5.2
  - pytest=7.2.0
  - scikit-bio=0.5.7
  - prodigal=2.6.3
  - mmseqs2==13.45111
  - hmmer=3.3.2
  - trnascan-se=2.0.11
  - scipy=1.8.1
  - sqlalchemy=1.4.46
  - barrnap=0.9
  - altair=4.2.0
  - openpyxl=3.0.10
  - networkx=2.8.8
  - ruby=3.1.2
  - parallel=20221122
  - pip
  - pip:
    - DRAM-bio==1.4.0

I ran DRAM as follows:

DRAM.py annotate_genes \
--use_uniref \
-i Proteins/'*.faa' \
--threads 30 \
-o DRAM.SAGs.out

I hope it works for you as well :-)

Arturo.
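
A note on the quoted `'*.faa'` in the command above: the quotes stop the shell from expanding the wildcard, so the program receives the pattern as one argument. The corrected function posted later in this thread calls `glob(input_faa)`, which would then expand the pattern inside Python. The sketch below only illustrates that expansion step; the helper name and file names are made up for the example.

```python
import tempfile
from glob import glob
from pathlib import Path


def expand_fasta_pattern(pattern):
    """Expand a shell-style pattern into a sorted list of matching paths."""
    return sorted(glob(pattern))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Create two protein FASTA files and one unrelated file.
        for name in ("genome_a.faa", "genome_b.faa", "notes.txt"):
            Path(tmp, name).write_text(">seq1\nMKV\n")

        # The pattern arrives as a single string, as it would with
        # -i Proteins/'*.faa', and is expanded here rather than by the shell.
        matches = expand_fasta_pattern(str(Path(tmp) / "*.faa"))
        print([Path(m).name for m in matches])
```

Without the quotes, the shell would expand the wildcard itself and pass several file names after `-i`, which an option expecting a single value would not accept.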

JoseLopezArcondo commented 3 weeks ago

To avoid this error, I edited the script "annotate_bins.py" (you have to find it inside your "miniconda3/envs/DRAM/" folder) with nano and replaced the definition of annotate_called_genes_cmd with the corrected code that is here on GitHub. Apparently, this part is still wrong in version 1.5.0.

corrected version:

def annotate_called_genes_cmd(
    input_faa,
    output_dir=".",
    bit_score_threshold=60,
    rbh_bit_score_threshold=350,
    custom_db_name=(),
    custom_fasta_loc=(),
    custom_hmm_loc=(),
    custom_hmm_name=(),
    custom_hmm_cutoffs_loc=(),
    use_uniref=False,
    use_camper=False,
    use_vogdb=False,
    kofam_use_dbcan2_thresholds=False,
    rename_genes=True,
    keep_tmp_dir=True,
    low_mem_mode=False,
    threads=10,
    verbose=True,
    log_file_path: str = None,
    config_loc: str = None,
):
    fasta_locs = glob(input_faa)
    annotate_called_genes(
        fasta_locs,
        output_dir,
        bit_score_threshold,
        rbh_bit_score_threshold,
        custom_db_name,
        custom_fasta_loc,
        custom_hmm_loc,
        custom_hmm_name,
        custom_hmm_cutoffs_loc,
        use_uniref,
        use_camper,
        use_vogdb,
        kofam_use_dbcan2_thresholds,
        rename_genes,
        keep_tmp_dir,
        low_mem_mode,
        threads,
        verbose,
        log_file_path,
        config_loc,
    )
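
Before editing files by hand, it may help to find which copy of annotate_bins.py the environment actually loads and whether its annotate_called_genes_cmd already accepts config_loc. The following is a rough sketch using only the standard library. The import path `mag_annotator.annotate_bins` is an assumption about how the installed package is laid out; adjust it if your installation differs. The two helper functions are illustrative names, not part of DRAM.

```python
import importlib
import importlib.util
import inspect


def module_path(module_name):
    """Return the file a module would be loaded from, or None if unavailable."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # Raised when a parent package of a dotted name is missing.
        return None
    return spec.origin if spec is not None else None


def accepts_keyword(func, name):
    """True if func can be called with keyword `name`, directly or via **kwargs."""
    by_name = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    return any(
        p.kind is inspect.Parameter.VAR_KEYWORD
        or (p.name == name and p.kind in by_name)
        for p in inspect.signature(func).parameters.values()
    )


if __name__ == "__main__":
    MODULE = "mag_annotator.annotate_bins"  # assumed import path, see above
    FUNCTION = "annotate_called_genes_cmd"

    path = module_path(MODULE)
    if path is None:
        print(f"{MODULE} is not importable in this environment")
    else:
        print("loaded from:", path)
        func = getattr(importlib.import_module(MODULE), FUNCTION, None)
        if func is None:
            print(f"{FUNCTION} not found in {MODULE}")
        else:
            print("accepts config_loc:", accepts_keyword(func, "config_loc"))
```

Run it with the Python interpreter of the conda environment that contains DRAM, so that the reported path is the file DRAM.py would actually use.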