Russel88 / CRISPRCasTyper

CCTyper: Automatic detection and subtyping of CRISPR-Cas operons
https://typer.crispr.dk
MIT License

(Question) provide GFF file for pre computed gene calls? #45

Open jolespin opened 8 months ago

jolespin commented 8 months ago

Let's say you have already run Prodigal and/or have gene calls in GFF format. Can you skip the Prodigal run and provide the GFF file instead?

antiSMASH provides a similar option: it requires gene calls and positions, but it allows a precomputed GFF file to be used.

jolespin commented 7 months ago

Just checking in to see how difficult this would be to implement in the current code base.

Russel88 commented 7 months ago

Hi. This is available in the dev branch. I will try to get it into the released branch soon, but until then you can clone this repo and install the dev branch.

jolespin commented 7 months ago

Thank you! I will try it out.

Just in case it's useful for anyone:

git clone --branch dev https://github.com/Russel88/CRISPRCasTyper.git
pip install CRISPRCasTyper/

jolespin commented 7 months ago

I'm testing this out and will log any useful development notes here.

Here's my usage:

cctyper --db ${CCTYPER_DB} --prodigal single --threads 4 --prot ${PROTEINS} --gff ${GFF} $FASTA $OUT_DIR

  1. I got an error in the plot module, but this import fallback resolved it:
try:
    import drawSvg as draw
except ModuleNotFoundError:
    import drawsvg as draw

The SVG save call also needs a fallback:

            try:
                self.im.saveSvg(self.out+'plot.svg')
            except AttributeError:
                self.im.save_svg(self.out+'plot.svg')

The PNG save call needs the same fallback:

                    try:
                        self.im.savePng(self.out+'plot.png')
                    except AttributeError:
                        self.im.save_png(self.out+'plot.png')

This is because of the drawsvg 2.x update, which renamed the module from drawSvg to drawsvg and switched method names to snake_case: https://github.com/cduck/drawsvg?tab=readme-ov-file#upgrading-from-version-1x
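In case it helps, the two save fallbacks above could be folded into one helper rather than repeating the try/except at each call site. This is only a sketch: `save_drawing` is a hypothetical name, not part of CCTyper, and it assumes nothing beyond the method names already shown above (`saveSvg`/`savePng` in 1.x, `save_svg`/`save_png` in 2.x).

```python
def save_drawing(drawing, path, fmt="svg"):
    """Save a drawing with whichever drawsvg API generation is installed.

    Hypothetical helper, not part of CCTyper. Tries the 2.x snake_case
    method first, then falls back to the 1.x camelCase method.
    Returns the name of the method that was used.
    """
    candidates = {
        "svg": ("save_svg", "saveSvg"),
        "png": ("save_png", "savePng"),
    }
    if fmt not in candidates:
        raise ValueError(f"unsupported format: {fmt!r}")

    for name in candidates[fmt]:
        method = getattr(drawing, name, None)
        if callable(method):
            method(path)
            return name

    raise AttributeError(f"drawing object has no {fmt} save method")
```

With something like this, the plot module would call `save_drawing(self.im, self.out+'plot.svg')` and `save_drawing(self.im, self.out+'plot.png', fmt='png')`.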

  2. It looks like the following blast intermediate files aren't removed: Flank.*

  3. The plots are not generated.

total 5.4M
-rw-r--r-- 1 jespinoz users  795 Jan 30 23:37 arguments.tab
-rw-r--r-- 1 jespinoz users 5.5K Jan 30 23:41 blast.tab
-rw-r--r-- 1 jespinoz users  916 Jan 30 23:41 cas_operons_orphan.tab
-rw-r--r-- 1 jespinoz users  12K Jan 30 23:38 cas_operons_putative.tab
-rw-r--r-- 1 jespinoz users 2.2K Jan 30 23:38 cas_operons.tab
-rw-r--r-- 1 jespinoz users  387 Jan 30 23:41 CRISPR_Cas.tab
-rw-r--r-- 1 jespinoz users  962 Jan 30 23:41 crisprs_all.tab
-rw-r--r-- 1 jespinoz users  38K Jan 30 23:41 crisprs.gff
-rw-r--r-- 1 jespinoz users  708 Jan 30 23:41 crisprs_near_cas.tab
-rw-r--r-- 1 jespinoz users  299 Jan 30 23:41 crisprs_orphan.tab
-rw-r--r-- 1 jespinoz users  297 Jan 30 23:41 crisprs_putative.tab
-rw-r--r-- 1 jespinoz users 1.4M Jan 30 23:38 Flank.fna
-rw-r--r-- 1 jespinoz users  20K Jan 30 23:38 Flank.ndb
-rw-r--r-- 1 jespinoz users 5.3K Jan 30 23:38 Flank.nhr
-rw-r--r-- 1 jespinoz users  904 Jan 30 23:38 Flank.nin 
-rw-r--r-- 1 jespinoz users  464 Jan 30 23:38 Flank.njs
-rw-r--r-- 1 jespinoz users  788 Jan 30 23:38 Flank.not
-rw-r--r-- 1 jespinoz users 344K Jan 30 23:38 Flank.nsq
-rw-r--r-- 1 jespinoz users  16K Jan 30 23:38 Flank.ntf
-rw-r--r-- 1 jespinoz users  264 Jan 30 23:38 Flank.nto
-rw-r--r-- 1 jespinoz users 383K Jan 30 23:37 genes.tab
drwxr-xr-x 2 jespinoz users  38K Jan 30 23:38 hmmer
-rw-r--r-- 1 jespinoz users    0 Jan 30 23:37 hmmer.log
-rw-r--r-- 1 jespinoz users 109K Jan 30 23:38 hmmer.tab
-rw-r--r-- 1 jespinoz users  11K Jan 30 23:38 minced.out
-rw-r--r-- 1 jespinoz users 3.0M Jan 30 23:37 proteins.faa
drwxr-xr-x 2 jespinoz users 6.0K Jan 30 23:38 spacers

jolespin commented 7 months ago

It looks like the following blast intermediate files aren't removed: Flank.*
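
Until this is fixed upstream, a small post-run cleanup could serve as a workaround. The sketch below is not part of CCTyper: `remove_flank_files` is a hypothetical helper, and it assumes every `Flank.*` file in the output directory is a disposable BLAST intermediate, as in the listing above.

```python
from pathlib import Path


def remove_flank_files(out_dir, dry_run=False):
    """Delete leftover Flank.* files from a cctyper output directory.

    Hypothetical workaround, not part of CCTyper. Assumes all Flank.*
    files are BLAST intermediates that are safe to delete.
    Returns the sorted names of the files that matched.
    """
    removed = []
    for path in sorted(Path(out_dir).glob("Flank.*")):
        # Only touch regular files; leave directories alone.
        if path.is_file():
            if not dry_run:
                path.unlink()
            removed.append(path.name)
    return removed
```

Calling `remove_flank_files(out_dir, dry_run=True)` first lists what would be deleted without removing anything.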