xthua / bacant

This program is designed for the annotation of antimicrobial resistance (AMR), transposons (Tn) and integrons (In) in bacteria.
GNU General Public License v3.0

Author: Xiaoting Hua

Email: xiaotinghua@zju.edu.cn

Institute: Key Laboratory of Microbial Technology and Bioinformatics of Zhejiang Province

Install:

BacAnt is a Python 3 script that runs on Linux. Install BLAST and add it to your PATH environment variable; it can be downloaded from https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/. BacAnt uses BLAST version 2.7.1.
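Before running BacAnt it can help to confirm that the BLAST executables are actually visible on PATH. The following is a minimal sketch using only the Python standard library. The tool names `blastn` and `makeblastdb` are an assumption about which BLAST+ programs are needed (the text above only says "BLAST"), and `locate_blast_tools` is a hypothetical helper, not part of BacAnt.

```python
import shutil


def locate_blast_tools(tools=("blastn", "makeblastdb")):
    """Return {tool name: absolute path, or None if it is not on PATH}.

    The default tool names are an assumption; adjust them to the BLAST+
    programs your installation actually needs.
    """
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, path in locate_blast_tools().items():
        print(f"{tool}: {path or 'NOT FOUND on PATH'}")
```

If a tool is reported as not found, add the `bin` directory of the unpacked BLAST+ archive to PATH and run the check again.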

Run:

BacAnt accepts FASTA and GenBank files, each containing one or more sequences. GenBank files must follow the standard format. There are three input parameters: "-n" takes a FASTA file, "-g" takes a GenBank file, and "-D" takes an input directory containing FASTA or GenBank files.

parameter           description
--nucleotide (-n)   FASTA file
--genbank (-g)      GENBANK file
--indir (-D)        input dirname
--resultdir (-o)    output dirname
--databases (-d)    reference databases, default is ResDB,IntegronDB,TransposonDB
--coverages (-c)    filtering coverage, default is "60,60,60"; the three numbers apply to AMR, In and Tn, in that order
--identities (-i)   filtering identity, default is "90,90,90"; the three numbers apply to AMR, In and Tn, in that order
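As an illustration of how these parameters fit together, here is a small sketch that assembles an argument list and checks the comma-separated thresholds. Several things are assumptions rather than documented behaviour: the executable name `bacant`, the rule that exactly one of `-n`, `-g` and `-D` is given, and the helper names `parse_thresholds` and `build_command`, which are hypothetical.

```python
def parse_thresholds(value):
    """Split a string such as "60,60,60" into AMR, In and Tn thresholds."""
    parts = [float(p) for p in value.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected three comma-separated numbers, got {value!r}")
    return dict(zip(("AMR", "In", "Tn"), parts))


def build_command(resultdir, nucleotide=None, genbank=None, indir=None,
                  coverages="60,60,60", identities="90,90,90",
                  executable="bacant"):  # executable name is an assumption
    """Build an argument list from the flags listed in the table above."""
    inputs = [("-n", nucleotide), ("-g", genbank), ("-D", indir)]
    given = [(flag, value) for flag, value in inputs if value is not None]
    # Assumption: the three input modes are mutually exclusive.
    if len(given) != 1:
        raise ValueError("give exactly one of nucleotide, genbank or indir")
    # Validate the threshold strings before handing them to the program.
    parse_thresholds(coverages)
    parse_thresholds(identities)
    flag, value = given[0]
    return [executable, flag, str(value), "-o", str(resultdir),
            "-c", coverages, "-i", identities]


if __name__ == "__main__":
    print(build_command("results", nucleotide="genome.fasta"))
```

The returned list can be passed to `subprocess.run` once the executable name has been adjusted to match the local installation.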

Databases:

We have updated the database to v2.0 (2021.05.11) since BacAnt-v3.3.1. You can download it from here. Users can also define their own custom databases; when running BacAnt, pass the databases dirname with the parameter -p (--path). The databases directory must have the following structure:

  .
  ├── IntegronDB
  │   ├── Integron.fasta    Integron reference sequences in FASTA format
  │   │                     sequence id must be description|accession, e.g. In0|PAU49101
  │   ├── Integron.nhr
  │   ├── Integron.nin
  │   └── Integron.nsq
  ├── ResDB
  │   ├── Res.fasta         Resistance gene reference sequences in FASTA format
  │   │                     sequence id must be database name~~~gene~~~accession~~~description,
  │   │                     e.g. ncbi~~~1567214_ble~~~NG_047553.1~~~BLEOMYCIN BLMA family bleomycin binding protein
  │   ├── Res.nhr
  │   ├── Res.nin
  │   └── Res.nsq
  └── TransposonDB
      ├── Transposon.fasta  Transposon reference sequences in FASTA format
      │                     sequence id must be description|accession, e.g. Tn2009|CP001937
      ├── Transposon.nhr
      ├── Transposon.nin
      └── Transposon.nsq
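When preparing a custom database, the sequence ids in each FASTA file can be checked against the formats shown in the tree above before the BLAST index files are built. The sketch below is illustrative only: `parse_res_id`, `parse_element_id` and `makeblastdb_command` are hypothetical helpers, and the `makeblastdb` options shown (`-in`, `-dbtype nucl`, `-out`) are the standard BLAST+ way to produce `.nhr`, `.nin` and `.nsq` files. Whether BacAnt needs any further `makeblastdb` options is not stated here.

```python
def parse_res_id(seq_id):
    """Split a ResDB id: database name~~~gene~~~accession~~~description."""
    fields = seq_id.split("~~~", 3)
    if len(fields) != 4 or not all(fields):
        raise ValueError(f"not a valid ResDB id: {seq_id!r}")
    return dict(zip(("database", "gene", "accession", "description"), fields))


def parse_element_id(seq_id):
    """Split an IntegronDB or TransposonDB id: description|accession."""
    description, sep, accession = seq_id.rpartition("|")
    if not sep or not description or not accession:
        raise ValueError(f"not a valid description|accession id: {seq_id!r}")
    return {"description": description, "accession": accession}


def makeblastdb_command(fasta_path, db_prefix):
    """Argument list that builds a nucleotide BLAST database from a FASTA file."""
    return ["makeblastdb", "-in", fasta_path, "-dbtype", "nucl", "-out", db_prefix]


if __name__ == "__main__":
    print(parse_res_id("ncbi~~~1567214_ble~~~NG_047553.1~~~BLEOMYCIN BLMA "
                       "family bleomycin binding protein"))
    print(parse_element_id("Tn2009|CP001937"))
    print(makeblastdb_command("ResDB/Res.fasta", "ResDB/Res"))
```

Running the parsers over every header line of a custom FASTA file catches malformed ids early, before they reach the annotation step.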

Output:

filename                  description
*.gb                      annotation in GENBANK format
AMR.tsv                   filtered resistance annotation
AMR.possible.tsv          all possible resistance annotations
replicon.tsv              replicon annotation
integron.filter.tsv       most likely integrons
integron.detail.tsv       integron_finder result, detailed description of integron structure
transposon.filter.tsv     transposon elements after overlap screening
transposon.possible.tsv   all possible transposon elements
annotation.html           visualization of the output
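The `.tsv` outputs can be loaded for downstream processing with Python's standard `csv` module. This is a minimal sketch: it assumes only that the files are tab-separated, as the extension suggests, and returns raw rows because the column layout, including whether a header row is present, is not described here. `read_tsv` is a hypothetical helper name.

```python
import csv


def read_tsv(path):
    """Read a tab-separated file into a list of rows, skipping blank lines."""
    with open(path, newline="", encoding="utf-8") as handle:
        return [row for row in csv.reader(handle, delimiter="\t") if row]


if __name__ == "__main__":
    import sys

    # Usage: python read_tsv.py <resultdir>/AMR.tsv
    for row in read_tsv(sys.argv[1]):
        print(row)
```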