Closed mictadlo closed 4 years ago
InterPro results will not be in chromosomal coordinates but in protein coordinates, since InterProScan annotates domains in proteins. If you want to map protein domains to genomic coordinates, you need to go through a coordinate transformation. Here's an example of how I did this with Pfam domains many years ago with BioPerl. There may be alternative ways to do this, but I would reach out to the JBrowse developers to ask how they recommend adding protein domain tracks for genes in chromosome space.
https://github.com/hyphaltip/genome-scripts/blob/master/gbrowse_tools/map_hmmertab2genome.pl
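For readers who want to see the idea behind that transformation, here is a minimal Python sketch. It is not the linked BioPerl script. The function name `protein_to_genome` and the example coordinates are hypothetical, and it assumes a complete CDS that starts in phase 0, with 1-based inclusive coordinates throughout.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive (start, end) on the chromosome


def protein_to_genome(cds: List[Interval], strand: str,
                      aa_start: int, aa_end: int) -> List[Interval]:
    """Map a 1-based inclusive protein range onto genomic intervals.

    Hypothetical helper: `cds` holds the CDS segments of one transcript,
    assumed complete and starting in phase 0.
    """
    # Amino-acid range -> 0-based half-open offsets within the spliced CDS.
    nt_start = (aa_start - 1) * 3
    nt_end = aa_end * 3

    # Walk CDS segments in transcription order (reversed on the minus strand).
    ordered = sorted(cds, reverse=(strand == "-"))
    pieces = []
    offset = 0  # number of CDS bases consumed by earlier segments
    for seg_start, seg_end in ordered:
        seg_len = seg_end - seg_start + 1
        lo = max(nt_start, offset)
        hi = min(nt_end, offset + seg_len)
        if lo < hi:  # the domain overlaps this segment
            if strand == "+":
                g_start = seg_start + (lo - offset)
                g_end = seg_start + (hi - offset) - 1
            else:
                g_end = seg_end - (lo - offset)
                g_start = seg_end - (hi - offset) + 1
            pieces.append((g_start, g_end))
        offset += seg_len
    return sorted(pieces)


# A domain covering residues 2-5 of a two-exon, plus-strand gene
# is split across the intron into two genomic intervals.
print(protein_to_genome([(101, 109), (201, 209)], "+", 2, 5))
```

Each returned interval can then be written out as one feature line on the chromosome, so a domain that spans an intron becomes a multi-part feature.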
If you want to add functional annotation to your coding gene model predictions, you can pass your FASTA genome, GFF annotation, and InterPro XML annotation file to funannotate annotate:
funannotate annotate --fasta genome.fa --gff braker.gff3 --iprscan protIPR.xml \
--out output_folder --species "Genus species"
This will extract protein models and then assign functional annotation to those predictions.
To re-predict gene models with funannotate, I would suggest starting from the beginning of the workflow if you have RNA-seq data. That would be mask --> train --> predict --> update --> annotate.
Hi, I ran into this problem:
docker run -it --rm -v $PWD:/home/linuxbrew/data nextgenusfs/funannotate
~/data$ funannotate annotate --fasta NbV1ChF.fasta --gff braker-NbAllMerged-BAM-soft_utr.gff3 --iprscan augustus.hints_utr.aa.xml --out out --species "NBenth"
What did I miss? I also tried to run funannotate setup, but which argument should I use?
Thank you in advance.
Michal
What’s the error?
$ funannotate annotate --fasta NbV1ChF.fasta --gff braker-NbAllMerged-BAM-soft_utr.gff3 --iprscan augustus.hints_utr.aa.xml --out out --species "NBenth"
-------------------------------------------------------
[02:58 PM]: OS: linux2, 4 cores, ~ 5 GB RAM. Python: 2.7.15
[02:58 PM]: Running funannotate v1.5.3
[02:58 PM]: Database files not found in /home/linuxbrew/DB, run funannotate database and/or funannotate setup
Okay -- likely due to the MiBIG database moving location. So I assume you got an error during the docker image build? I'll have to tag a new release to fix this -- which we are planning shortly.
I used docker run -it --rm -v $PWD:/home/linuxbrew/data nextgenusfs/funannotate, so I do not think I had any errors during the Docker build.
You need to follow the directions here: https://funannotate.readthedocs.io/en/latest/docker.html#docker. The base image nextgenusfs/funannotate requires a few more steps by the user due to licensing issues. These steps also set up the databases in the Docker image. I'm pushing v1.6.0 shortly to the Docker cloud, which should fix several issues, one being a broken link to a database.
It appears that RepBase is no longer free for academics. Will your pipeline also work without RepBase?
Currently, repeat masking is done with the funannotate mask command, which uses RepeatMasker. You can mask with any other software. funannotate predict will warn you if your assembly is not masked, but you can bypass that warning.
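If you mask with other software, a quick sanity check before prediction is to measure how much of the assembly is soft-masked (lowercase). The sketch below only counts lowercase bases in a FASTA file. It is an assumption-laden illustration: the function name `softmasked_fraction` is hypothetical, and how funannotate predict itself decides that an assembly is unmasked is not described in this thread.

```python
import io
from typing import TextIO


def softmasked_fraction(handle: TextIO) -> float:
    """Return the fraction of sequence characters that are lowercase.

    Hypothetical helper; it follows the common convention that
    soft-masked repeats are written in lowercase.
    """
    lower = total = 0
    for line in handle:
        if line.startswith(">"):  # skip FASTA header lines
            continue
        seq = line.strip()
        total += len(seq)
        lower += sum(1 for base in seq if base.islower())
    return lower / total if total else 0.0


# Tiny in-memory FASTA standing in for a real assembly file.
example = io.StringIO(">chr1\nACGTacgtNN\n>chr2\nacgtACGTAC\n")
print(f"{softmasked_fraction(example):.2f}")
```

For a real assembly, pass an open file handle instead of the in-memory example. A value of 0.0 means nothing is soft-masked.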
Hi, unfortunately I discovered your project too late and had already run BRAKER2. I also ran InterProScan, which produced GFF3, TSV, and XML output. It appears that the InterProScan GFF3 file does not contain any chromosome names:
Is there a way to load InterProScan results as a track into JBrowse, or is it still possible to combine InterProScan and BRAKER2 results with your scripts?
Thank you in advance,
Michal