htgt / CRISPR-Analyser

C++ package for analysing CRISPR off targets
MIT License

GRCh38_crisprs.bin filetype #13

Open mcfa77y opened 8 years ago

mcfa77y commented 8 years ago

I would like to use this project, but I need some information on how to get or create the bin files, e.g. GRCh38_crisprs.bin. Is it just a GenBank file that has been compressed? Thanks, Joe

dparrysmith commented 8 years ago

Hi Joe,

Thanks for your interest in using the CRISPR-Analyser.

We have recently made our index files available on our FTP server:

ftp://ftp.sanger.ac.uk/pub/teams/229/crispr_indexes/

Take a look and see if they work for you.

Let us know how it goes.

David

David Parry-Smith PhD Senior LIMS Developer & Group Leader Stem Cell Informatics Wellcome Trust Sanger Institute


The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.

dparrysmith commented 8 years ago

Joe,

The documentation on creating the bin files is here: https://github.com/htgt/CRISPR-Analyser#find-all-crisprs-within-the-genome

It is not a compressed GenBank file. A FASTA file of your species' genome is scanned for CRISPR sites, and the sites found are then indexed to create the binary file.
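As a rough illustration of the first step (scanning a FASTA file for CRISPR sites), here is a minimal Python sketch. It is not the CRISPR-Analyser code: it assumes a 20-nt protospacer followed by an NGG PAM (23 nt in total), and the names `read_fasta` and `find_sites` are hypothetical. The README section linked above describes the actual commands.

```python
import re

# Assumption: 20-nt protospacer + NGG PAM, i.e. a 23-mer ending in GG on the
# forward strand, or starting with CC when the site lies on the reverse strand.
# Lookaheads are used so that overlapping sites are all reported.
FORWARD = re.compile(r"(?=([ACGT]{21}GG))")
REVERSE = re.compile(r"(?=(CC[ACGT]{21}))")


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks).upper()
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks).upper()


def find_sites(seq):
    """Return sorted (start, strand, 23-mer) tuples; start is 0-based."""
    sites = [(m.start(), "+", m.group(1)) for m in FORWARD.finditer(seq)]
    sites += [(m.start(), "-", m.group(1)) for m in REVERSE.finditer(seq)]
    return sorted(sites)
```

With a real genome you would pass an open file handle to `read_fasta` and call `find_sites` on each sequence. Windows containing N or other ambiguity codes are skipped in this sketch.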

Get back to me if this does not help.

David

mcfa77y commented 8 years ago

Thank you for that FTP link! Is there a way to make new bin files from different genomes, e.g. E. coli, soy, etc.?

dparrysmith commented 8 years ago

We don’t have plans to make bin files for new species, but you can create them yourself using the information in the README.md file in this repo.

Check this section: https://github.com/htgt/CRISPR-Analyser#find-all-crisprs-within-the-genome

David




mcfa77y commented 8 years ago

Under the Create index section, is the -a option ("The assembly") supposed to be a string? What is its significance?

af11-sanger commented 8 years ago

Hello Joe,

The -a option is a string which specifies the assembly of the genome you are using to generate your CRISPR index. For example, we have used the human assembly GRCh38. Having looked at the code, I think this option is only used to record the assembly information within the index file. All queries of the index are done by species name rather than assembly name.
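To make the distinction concrete, here is a small Python sketch of an index header that stores the assembly purely as metadata while lookups match on species only. The layout (two 30-byte name fields and a 64-bit entry count) and the function names are hypothetical, chosen for illustration. They are not the real CRISPR-Analyser file format.

```python
import struct

# Hypothetical header layout, NOT the real CRISPR-Analyser format:
# species name (30 bytes), assembly name (30 bytes), entry count (uint64).
# "<" selects little-endian with no padding, so the header is 68 bytes.
HEADER = struct.Struct("<30s30sQ")


def pack_header(species, assembly, n_entries):
    """Serialise the metadata; short names are padded with NUL bytes."""
    return HEADER.pack(species.encode("ascii"), assembly.encode("ascii"), n_entries)


def unpack_header(data):
    """Read the metadata back from the first HEADER.size bytes."""
    species, assembly, n_entries = HEADER.unpack(data[:HEADER.size])
    return {
        "species": species.rstrip(b"\0").decode("ascii"),
        "assembly": assembly.rstrip(b"\0").decode("ascii"),
        "n_entries": n_entries,
    }


def matches_query(header, species):
    """Select an index by species only; the assembly is informational."""
    return header["species"].lower() == species.lower()
```

In this sketch, changing the assembly string changes what is recorded in the header but has no effect on which index a query selects.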

Anna

mcfa77y commented 8 years ago

Great, thank you for getting back to me!