nigyta / dfast_core

DDBJ Fast Annotation and Submission Tool

ORF calling & OrthoSearch #18

Closed YiJessePi closed 4 years ago

YiJessePi commented 4 years ago

Is there a way to start the analysis after ORF calling step? If doing ORF calling with prodigal, does it use the "-c" option? I'm asking since it is not recommended for metagenomic data. As for the OrthoSearch, does it significantly increase runtime? Is there a default db for this optional step? Thanks!

nigyta commented 4 years ago

Is there a way to start the analysis after ORF calling step?

One way to do so is to import a GFF file using the --gff option, which is described here. But if you want to annotate metagenomic contigs using Prodigal, I think it is much easier to follow the method below.
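For the GFF route, the two command lines could be assembled roughly as in the sketch below. This is only an illustration: the helper names and file names are hypothetical, the Prodigal flags (-i, -o, -f gff, -p meta, -c) are its standard options, and the DFAST --genome and --out options are assumed here alongside the --gff option mentioned above. Whether DFAST accepts Prodigal's GFF output unchanged is not confirmed in this thread.

```python
import shlex


def build_prodigal_cmd(fasta, gff_out, metagenome=True, closed_ends=False):
    """Return a Prodigal command line as a list of arguments (hypothetical helper)."""
    cmd = ["prodigal", "-i", fasta, "-o", gff_out, "-f", "gff"]
    if metagenome:
        cmd += ["-p", "meta"]  # metagenomic procedure
    if closed_ends:
        cmd.append("-c")  # closed ends: genes may not run off contig edges
    return cmd


def build_dfast_cmd(fasta, gff, out_dir):
    """Return a DFAST command line that imports an existing GFF file.

    --gff is the option named in the reply above; --genome and --out
    are assumptions made for this sketch.
    """
    return ["dfast", "--genome", fasta, "--gff", gff, "--out", out_dir]


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing needs to be installed.
    print(shlex.join(build_prodigal_cmd("bin01.fna", "bin01.gff")))
    print(shlex.join(build_dfast_cmd("bin01.fna", "bin01.gff", "bin01_dfast")))
```

Leaving closed_ends at False keeps "-c" off, which is the behaviour the question asks about for metagenomic data.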

If doing ORF calling with prodigal, does it use the "-c" option? I'm asking since it is not recommended for metagenomic data.

When "--use_prodigal" is specified, Prodigal is run with default parameters, i.e. "-c" is not used. You can also pass any parameters to Prodigal through cmd_options in the configuration file (default_settings.py):

            # Prodigal for CDS prediction
            "tool_name": "Prodigal",
            "tool_type": "CDS",
            "enabled": False,
            "options": {
                "transl_table": 11,
                "cmd_options": "",     # <---- set "-p meta" for metagenome data
            },

It is also recommended to set "remove_partial_features" to "False".
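Put together, the two changes might look like the sketch below. The Prodigal entry is copied from the excerpt above. The dictionary holding "remove_partial_features", its default of True, and the helper name for_metagenome are assumptions made for illustration, so check your own default_settings.py for the actual location of that setting.

```python
import copy

# Prodigal entry as shown in the excerpt from default_settings.py.
PRODIGAL_DEFAULT = {
    "tool_name": "Prodigal",
    "tool_type": "CDS",
    "enabled": False,  # left untouched here; --use_prodigal is the switch named above
    "options": {
        "transl_table": 11,
        "cmd_options": "",
    },
}

# Assumed shape and default value; the real setting may live elsewhere in the file.
FEATURE_ADJUSTMENT_DEFAULT = {"remove_partial_features": True}


def for_metagenome(prodigal, feature_adjustment):
    """Return modified copies of both settings for metagenome data (hypothetical helper)."""
    prodigal = copy.deepcopy(prodigal)
    feature_adjustment = copy.deepcopy(feature_adjustment)
    prodigal["options"]["cmd_options"] = "-p meta"
    feature_adjustment["remove_partial_features"] = False
    return prodigal, feature_adjustment


if __name__ == "__main__":
    prodigal, feature_adjustment = for_metagenome(
        PRODIGAL_DEFAULT, FEATURE_ADJUSTMENT_DEFAULT
    )
    print(prodigal["options"])
    print(feature_adjustment)
```

Working on copies keeps the defaults intact, so the same file can still be used for isolate genomes.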

As for the OrthoSearch, does it significantly increase runtime? Is there a default db for this optional step? Thanks!

OrthoSearch is not included in the default workflow. It conducts all-vs-all comparison against a reference genome of a single isolate. Therefore, it may not be applicable to metagenome data.

YiJessePi commented 4 years ago

Thanks for the detailed answer. Just to clarify, I'm working on binned contigs, so I'm not sure whether I should treat them the way we treat metagenomic contigs.

As for OrthoSearch: do you think it is not applicable to metagenomes because of the run time of the all-vs-all comparison, or because we can't provide a specific isolate?

nigyta commented 4 years ago

@YiJessePi Sorry for the late response. I think OrthoSearch can work on metagenome-assembled genomes if a reference genome from a closely related species is available. The run time should not increase much, because an all-vs-all comparison between bacterial genomes does not take long.