Russel88 / CRISPRCasTyper

CCTyper: Automatic detection and subtyping of CRISPR-Cas operons
https://typer.crispr.dk
MIT License

Extract spacer coordinates #15

Status: Closed (genomesandMGEs closed 2 years ago)

genomesandMGEs commented 3 years ago

Hey,

This tool (super cool btw!) extracts all spacers (true and false) in FASTA format into a separate directory. Could you please let me know how to extract the coordinates for all these spacers? As far as I understand, the coordinates in the crisprs_all file refer only to the consensus repeat.

Thanks for your time!

Russel88 commented 3 years ago

Hey,

I'm glad you like the tool! Extracting the coordinates of all spacers is not that straightforward, unfortunately. The start and end in crisprs_all.tab refer to the start and end of the entire CRISPR array.

If you run cctyper with --keep_tmp, it will keep the raw output from minced in minced.out; from this file you can extract the positions of all repeats and spacers. The newest version of cctyper also detects CRISPR arrays by BLASTing repeats, and the positions of the spacers in those arrays are not readily obtainable. However, these arrays are somewhat rare (except if you're looking for IV-A3 CRISPRs specifically).
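For anyone who wants to try the minced.out route, here is a minimal Python sketch. It assumes the usual minced text layout: a `Sequence '<name>' (<length> bp)` header per contig, then one row per repeat holding a 1-based repeat start position, the repeat sequence, and (except on the last row of an array) the spacer sequence. The function name `parse_minced_rows` is made up for this example, and the sequence pattern only accepts A, C, G, T and N, so check both assumptions against your own file.

```python
import re

# Assumed row layout: <position> <repeat> [<spacer>] [ <repeat_len>, <spacer_len> ]
DNA = re.compile(r"^[ACGTNacgtn]+$")
SEQ_HEADER = re.compile(r"^Sequence '(.+)' \(\d+ bp\)")


def parse_minced_rows(lines):
    """Yield (contig, kind, start, end) tuples with 1-based inclusive coordinates."""
    contig = None
    for line in lines:
        header = SEQ_HEADER.match(line)
        if header:
            contig = header.group(1)
            continue
        fields = line.split()
        # Data rows start with an integer position followed by the repeat sequence;
        # everything else (column titles, dashes, summaries) is skipped.
        if len(fields) < 2 or not fields[0].isdigit() or not DNA.match(fields[1]):
            continue
        repeat_start = int(fields[0])
        repeat_end = repeat_start + len(fields[1]) - 1
        yield contig, "repeat", repeat_start, repeat_end
        # The last repeat of an array has no spacer column.
        if len(fields) > 2 and DNA.match(fields[2]):
            yield contig, "spacer", repeat_end + 1, repeat_end + len(fields[2])
```

To use it, open minced.out from the cctyper output directory and pass the file handle to `parse_minced_rows`, keeping only the tuples whose kind is `"spacer"`. As noted above, arrays found by BLASTing repeats will not show up in this file.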

It would probably be a good idea to make cctyper output a gff with all the positions in a standardized way. I'll put that on my to do list.

I'm sorry that I can't provide an easy solution at the moment.

Best, Jakob

genomesandMGEs commented 3 years ago

Thanks for the detailed explanation Russel!

Russel88 commented 2 years ago

CCTyper 1.6.2 now prints a GFF file with all repeat and spacer coordinates.
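A small sketch for pulling coordinates out of such a GFF file in Python. The thread does not say what the file is called or which feature-type labels it uses, so `read_gff_features` and its `feature_type` argument are illustrative; the only thing relied on is the standard tab-separated GFF column order (seqid, source, type, start, end, score, strand, ...). Look at the type column of your own file before filtering.

```python
def read_gff_features(lines, feature_type=None):
    """Return (seqid, type, start, end, strand) tuples from GFF text lines.

    feature_type is an optional filter on column 3; the labels used by
    CCTyper are not given in this thread, so inspect the file first.
    """
    features = []
    for line in lines:
        # Skip blank lines and comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8:
            continue
        seqid, _source, ftype, start, end, _score, strand = cols[:7]
        if feature_type is None or ftype == feature_type:
            features.append((seqid, ftype, int(start), int(end), strand))
    return features
```

Calling it once without a filter and collecting the distinct values of the second tuple element shows which labels are present; a second call with the spacer label then gives just the spacer coordinates.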