merenlab / anvio

An analysis and visualization platform for 'omics data
http://merenlab.org/software/anvio
GNU General Public License v3.0
Interpro results for pangenomics workflow #1017

Closed johanneswerner closed 6 years ago

johanneswerner commented 6 years ago

How can I add interpro results to external genomes for the pangenomics workflow?

I tried this:

$ ./interproscan.sh \
  -i Sulfurimonas_autotrophica.faa \
  -f tsv \
  -o Sulfurimonas_autotrophica.interpro.tsv

$ anvi-import-functions \
  -c ../reference_genomes/Sulfurimonas_autotrophica_DSM_16294_genomic_refseq.db \
  -i Sulfurimonas_autotrophica.interpro.tsv

File/Path Error: Not all lines in the file 'Sulfurimonas_autotrophica.interpro.tsv' have equal
                 number of fields...

$ anvi-self-test --version
Anvi'o version ...............................: margaret (vunknown)
Profile DB version ...........................: 29
Contigs DB version ...........................: 12
Pan DB version ...............................: 12
Genome data storage version ..................: 6
Auxiliary data storage version ...............: 2
Structure DB version .........................: 1

Do you have any suggestions? Thank you so much.
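The error message suggests that rows in the InterProScan TSV differ in their number of columns. One quick way to confirm this, sketched here under the assumption that the file is tab-delimited, is to tally the field count per line. The helper name `count_fields` is hypothetical and is not part of anvi'o or InterProScan.

```shell
#!/bin/sh
# Tally how many lines have each number of tab-separated fields.
# count_fields is a hypothetical helper name used only for this sketch.
count_fields() {
    # Output format: "<number of fields> <number of lines>", sorted by field count.
    awk -F'\t' '{ n[NF]++ } END { for (k in n) print k, n[k] }' "$1" | sort -n
}

# Example: count_fields Sulfurimonas_autotrophica.interpro.tsv
```

More than one line of output means the rows are uneven, which matches the complaint from anvi-import-functions.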

xvazquezc commented 6 years ago

Try the script I have here: https://github.com/xvazquezc/stuff/iprs2anvio.sh (it should do the trick, on Linux at least). Then import the output file with anvi-import-functions as usual.
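The linked script is not reproduced here. As a rough illustration of what a conversion like this involves, the sketch below assumes the usual InterProScan TSV column order (protein ID in column 1, analysis in column 4, signature accession and description in columns 5 and 6, score in column 9) and the five-column, tab-delimited table that anvi-import-functions accepts without a parser (gene_callers_id, source, accession, function, e_value). It also assumes the protein IDs in the FASTA file are already anvi'o gene caller IDs. The function name `iprs_to_anvio` is made up, and replacing a missing score with 0 is a choice made for this sketch only.

```shell
#!/bin/sh
# Convert raw InterProScan TSV rows into a five-column functions table.
# iprs_to_anvio is a hypothetical name; this is not the script linked above.
iprs_to_anvio() {
    # Header row expected for a standard functions table (assumed layout).
    printf 'gene_callers_id\tsource\taccession\tfunction\te_value\n'
    awk -F'\t' 'BEGIN { OFS = "\t" }
        NF >= 9 {
            e = $9
            # Some analyses report "-" instead of a score; use 0 (assumption).
            if (e == "-") e = "0"
            print $1, $4, $5, $6, e
        }' "$1"
}

# Example: iprs_to_anvio Sulfurimonas_autotrophica.interpro.tsv > functions.tsv
```

Because only the first nine columns are read, the optional trailing columns that make the raw rows uneven no longer matter.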

johanneswerner commented 6 years ago

That is great, thank you, it is working perfectly.

But I think you sent the wrong link. This one works: https://raw.githubusercontent.com/xvazquezc/stuff/master/iprs2anvio.sh

ozcan commented 6 years ago

Hi,

In addition to @xvazquezc's script, there is also a built-in parser for InterProScan output in anvi-import-functions, which can be used with the -p interproscan parameter.

  -p PARSER, --parser PARSER
                        Parser to make sense of the input files (if you need
                        one). There are currently 1 parsers readily available:
                        ['interproscan']. IT IS OK if you do not select a
                        parser if you have a standard, TAB-delimited input
                        file for funcitonal annotation of genes. If this is
                        not like 2018 and everything is already outdated, you
                        should be able to go to this address and learn
                        everything you need like a boss:
                        http://merenlab.org/2016/06/18/importing-functions/
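Combined with the commands from the original question, the import with the built-in parser might look like the sketch below. The -c, -i, and -p flags and both file paths come from this thread; the wrapper function `import_interpro` and the `DRY_RUN` switch are illustrative additions, not part of anvi'o.

```shell
#!/bin/sh
# Paths taken from the commands earlier in this thread.
contigs_db="../reference_genomes/Sulfurimonas_autotrophica_DSM_16294_genomic_refseq.db"
ipr_tsv="Sulfurimonas_autotrophica.interpro.tsv"

# import_interpro is a hypothetical wrapper.
# $1: contigs database, $2: raw InterProScan TSV output
import_interpro() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it (illustrative switch).
        echo "anvi-import-functions -c $1 -i $2 -p interproscan"
    else
        anvi-import-functions -c "$1" -i "$2" -p interproscan
    fi
}

# Example: import_interpro "$contigs_db" "$ipr_tsv"
```

With the parser selected, the raw InterProScan file is passed directly, so no separate conversion step should be needed.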

Best,