TAMU-CPT / training-material

A collection of Galaxy-related training material
https://training.galaxyproject.org

Create a GTN format tutorial on adding evidence tracks #18

Closed: jrr-cpt closed this issue 4 years ago

jrr-cpt commented 5 years ago

@ToniNittolo Base this on the workflow Upload Annotationed Sequence to Apollo v1.0

In the background give scenarios for which this may be used.

  1. Put a previous annotation version (deposited in Genbank for example) in as a comparative evidence track
  2. You ran an analysis like blasting your genome against another database in Galaxy and want to add the result as an evidence track. This is for things that are not part of the standard automatic phage annotation pipeline
ToniNittolo commented 5 years ago

From @MoffMade - "The Upload Annotated Sequence to Apollo (you may need a new link, I saw a typo in the name) is for when you are creating a new organism in Apollo and already have gff3 annotations that you want to upload directly as annotations. They get added directly instead of going in as evidence tracks which would then need to be promoted to features. It skips the evidence step and just makes the features"

I am not entirely sure that this workflow creates evidence tracks. Should I base this tutorial on the JBrowse + CPT 0.6.3 tool?

ToniNittolo commented 5 years ago

Furthermore, from @MoffMade: "The Upload Annotated Sequence workflow creates zero evidence tracks. It only generates new organisms; it will not modify an existing organism. The jbrowse+CPT tool CAN modify existing organisms, but the workflow does not use that function. It uses the 'generate a new data directory' option. The 'Custom BLASTp to Apollo Record' or 'antiCRISPRdb to Apollo track' workflows should be examples of using the jbrowse+CPT tool to add an evidence track to an existing organism, and they make use of the Retrieve tool. Or at least, their inputs should use the results of the Retrieve tool."

jrr-cpt commented 5 years ago

@MoffMade please clarify the questions raised here.

MoffMade commented 4 years ago

I'm not certain what questions are being asked here. Evidence tracks are generated from various tool analyses and then added to the organism in Apollo in three steps: retrieve the JBrowse directory from Apollo (Retrieve JBrowse Directory from Apollo), add the track with the JBrowse+cpt tool, and then re-upload the changed directory to Apollo with the Create or Update Organism tool.

Does this answer it?

jrr-cpt commented 4 years ago

I have since clarified any remaining questions. What we need now is the tutorial with the above information for users to access. The material should cover which current tools and/or workflows to use for 1) adding a custom evidence track, such as from a gff3 or BLAST analysis, to an existing organism; 2) using annotations to generate a new Apollo organism with features already called.

jrr-cpt commented 4 years ago

I have been using a workflow called GFF3 to evidence track JRR v1.1 to add gff3 evidence tracks to Apollo organisms. Some aspects of doing this that will need to be addressed in the tutorial include:

1) The uploaded gff3 is not listed as an input option because auto-detect classified it as a gff file; the datatype must be changed manually.

[Screenshots: 2020-03-19 10:11:54 and 2020-03-19 10:12:24]

2) How to ensure the evidence track categories have the name you want:

3) The most common reason this workflow fails is that the names of the comparison gff3 and the Apollo organism do not match. The workflow prompts for a name change, but the parameters have to be manually expanded.

[Screenshot: 2020-03-19 10:24:03]
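For the name mismatch in point 3, a minimal sketch of what the rename amounts to outside of Galaxy. It assumes that "name" refers to the sequence ID in column 1 of the gff3 (and in any `##sequence-region` line), and that the file describes a single genome. The function name `rename_seqid` is hypothetical and is not part of the workflow.

```python
def rename_seqid(gff3_text: str, organism_name: str) -> str:
    """Rewrite the sequence ID of every feature to organism_name.

    Assumption: the file describes one genome, so every feature line
    gets the same ID. An embedded ##FASTA section is left untouched.
    """
    out = []
    in_fasta = False
    for line in gff3_text.splitlines():
        if in_fasta or not line.strip():
            out.append(line)                      # FASTA body or blank line
        elif line.startswith("##FASTA"):
            in_fasta = True
            out.append(line)
        elif line.startswith("##sequence-region"):
            parts = line.split()
            if len(parts) >= 2:
                parts[1] = organism_name          # ##sequence-region <id> <start> <end>
            out.append(" ".join(parts))
        elif line.startswith("#"):
            out.append(line)                      # other comments and directives
        else:
            cols = line.split("\t")
            cols[0] = organism_name               # column 1 holds the sequence ID
            out.append("\t".join(cols))
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    example = (
        "##gff-version 3\n"
        "##sequence-region old_name 1 100\n"
        "old_name\tGenBank\tgene\t1\t50\t.\t+\t.\tID=g1\n"
    )
    print(rename_seqid(example, "MyPhage"), end="")
```

In the workflow itself the same change is made through the name-change parameter once it has been expanded.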
jrr-cpt commented 4 years ago

The base files for this tutorial already exist under training-material/topics/additional-analyses/tutorials/adding-evidence-tracks/

Re-Title: Adding custom evidence tracks to Apollo

Agenda:

Intro

The analyses in the basic Structural and Functional workflows (link to those tutorials) provide a good start for phage genome annotation. When additional, updated, or custom analyses are performed, those data can also be added to Apollo as evidence tracks if they are in the appropriate format. This includes custom BLAST analyses, comparison annotation data, or any properly formatted GFF3 data with coordinates that match the genome of the organism in question.
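As an illustration of "coordinates that match the genome", here is a rough sketch that lists features falling outside the genome sequence. The helper names `fasta_length` and `out_of_range_features` are hypothetical, and the sketch assumes a single-record FASTA and 1-based, inclusive GFF3 coordinates.

```python
def fasta_length(fasta_text):
    """Number of residues in a single-record FASTA string."""
    return sum(
        len(line.strip())
        for line in fasta_text.splitlines()
        if line and not line.startswith(">")
    )


def out_of_range_features(gff3_text, genome_length):
    """Return (type, start, end) for features outside 1..genome_length."""
    bad = []
    for line in gff3_text.splitlines():
        if line.startswith("##FASTA"):
            break                                  # stop at an embedded sequence
        if not line.strip() or line.startswith("#"):
            continue                               # skip blanks and directives
        cols = line.split("\t")
        if len(cols) != 9:
            continue                               # malformed line, not checked here
        start, end = int(cols[3]), int(cols[4])    # raises ValueError if not integers
        if start < 1 or end > genome_length:
            bad.append((cols[2], start, end))
    return bad


if __name__ == "__main__":
    genome = ">phage\nACGTACGT\n"
    gff3 = (
        "##gff-version 3\n"
        "phage\tblast\tmatch\t1\t8\t.\t+\t.\tID=m1\n"
        "phage\tblast\tmatch\t5\t12\t.\t+\t.\tID=m2\n"
    )
    print(out_of_range_features(gff3, fasta_length(genome)))
```

Any feature reported here was probably computed against a different sequence or a different version of the genome.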

Custom Workflows

  1. Published workflow: GFF3 to Apollo evidence track, which adds gff3 evidence tracks to Apollo organisms. This is usually used on data from Genbank and allows comparison to features called by others.
  2. Published workflow: Custom BLASTp to Apollo Record, with UserDB and LocalDB variations
  3. Published workflow: antiCRISPRdb to Apollo track

Troubleshooting

When adding custom evidence tracks is not working for you, consider the following possible causes.

  1. If your custom data is not properly formatted, the category may be added to Apollo, but with empty evidence tracks. In that case, the data formatting needs to be carefully inspected to ensure it fits the spec required for that file type. Users are encouraged to compare their data files to datasets in Galaxy that are known to be successfully added to Apollo. Additionally, check to make sure that any conversions performed in Galaxy did not yield empty datasets.

  2. continue by listing the things I wrote above on Mar 19
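For the formatting check in item 1, a minimal sketch of a pre-upload sanity check. It only tests a few basics of the GFF3 layout (nine tab-separated columns, integer start and end, start not after end, at least one feature line) and is not a full validator. The function name `check_gff3` is hypothetical.

```python
def check_gff3(gff3_text):
    """Return a list of human-readable problems found in a GFF3 string."""
    problems = []
    feature_count = 0
    for number, line in enumerate(gff3_text.splitlines(), start=1):
        if line.startswith("##FASTA"):
            break                                  # the rest is sequence, not features
        if not line.strip() or line.startswith("#"):
            continue                               # skip blanks, comments, directives
        cols = line.split("\t")
        if len(cols) != 9:
            problems.append(
                f"line {number}: expected 9 tab-separated columns, found {len(cols)}"
            )
            continue
        feature_count += 1
        start, end = cols[3], cols[4]
        if not (start.isdecimal() and end.isdecimal()):
            problems.append(f"line {number}: start/end are not integers")
        elif int(start) > int(end):
            problems.append(f"line {number}: start {start} is greater than end {end}")
    if feature_count == 0:
        problems.append("no feature lines found (dataset is effectively empty)")
    return problems


if __name__ == "__main__":
    sample = "##gff-version 3\nphage\tblast\tmatch\t90\t10\t.\t+\t.\tID=m1\n"
    for problem in check_gff3(sample):
        print(problem)
```

An empty list means none of these basic problems were found; it does not guarantee that Apollo will display the track.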