htgt / CRISPR-Analyser

C++ package for analysing CRISPR off targets
MIT License

Constrains/Indexes #4

Open coronin opened 10 years ago

coronin commented 10 years ago

Could you please elaborate more on the following psql database and usage?

ALTER TABLE ONLY crisprs_human ADD CONSTRAINT crisprs_mouse_unique_loci UNIQUE (chr_start, chr_name, pam_right);
CREATE INDEX idx_crisprs_mouse_loci ON crisprs_mouse USING btree (chr_name, chr_start);

Thanks!

ah19 commented 10 years ago

The constraint in that first line should be crisprs_human_unique_loci, as it is on the human table, sorry! The constraint just means that you can't store duplicate CRISPRs in the table: only one CRISPR can exist at a given location. Having the pam_right field in there means two CRISPRs can exist at the same chromosome start if they are on different strands.
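
To see what the constraint does in practice, here is a minimal sketch. It uses Python's built-in sqlite3 module rather than PostgreSQL so that it runs without a server. The column types, and the idea that pam_right is stored as 0 or 1, are assumptions made for illustration and are not taken from the real schema.

```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL one (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical, simplified version of the table; only the three columns
# named in the constraint come from the thread, the types are guesses.
conn.execute(
    """
    CREATE TABLE crisprs_human (
        id        INTEGER PRIMARY KEY,
        chr_name  TEXT    NOT NULL,
        chr_start INTEGER NOT NULL,
        pam_right INTEGER NOT NULL,
        CONSTRAINT crisprs_human_unique_loci
            UNIQUE (chr_start, chr_name, pam_right)
    )
    """
)


def try_insert(chr_name, chr_start, pam_right):
    """Return True if the row was stored, False if the constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO crisprs_human (chr_name, chr_start, pam_right) "
                "VALUES (?, ?, ?)",
                (chr_name, chr_start, pam_right),
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert("10", 3000000, 1),  # first CRISPR at this location
    try_insert("10", 3000000, 0),  # same start, different pam_right: allowed
    try_insert("10", 3000000, 1),  # exact duplicate: rejected
]
print(results)  # [True, True, False]
```

PostgreSQL should behave the same way for this case, except that the rejected insert fails with a unique-violation error naming crisprs_human_unique_loci.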

The index is just to make searching by genomic position quick, so if you do a query like this:

select * from crisprs_mouse where chr_name='10' and chr_start between 3000000 and 3020000;

then finding all CRISPRs in a region is very fast.
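
If you want to check that a region query actually goes through the index, a rough sketch along these lines may help. Again it uses Python's sqlite3 as a stand-in for PostgreSQL. The table layout and the sample rows are made up for illustration, and the real planner in PostgreSQL may make different choices depending on table size and statistics.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified table; column types are assumptions.
conn.execute(
    """
    CREATE TABLE crisprs_mouse (
        id        INTEGER PRIMARY KEY,
        chr_name  TEXT    NOT NULL,
        chr_start INTEGER NOT NULL,
        pam_right INTEGER NOT NULL
    )
    """
)
# SQLite indexes are B-trees by default, so USING btree is left out here.
conn.execute(
    "CREATE INDEX idx_crisprs_mouse_loci ON crisprs_mouse (chr_name, chr_start)"
)

# Made-up rows: one CRISPR every 1000 bp on chromosome 10, plus one on 11.
rows = [("10", start, start % 2) for start in range(2990000, 3030000, 1000)]
rows.append(("11", 3005000, 1))
conn.executemany(
    "INSERT INTO crisprs_mouse (chr_name, chr_start, pam_right) VALUES (?, ?, ?)",
    rows,
)

query = (
    "SELECT * FROM crisprs_mouse "
    "WHERE chr_name = ? AND chr_start BETWEEN ? AND ?"
)
params = ("10", 3000000, 3020000)

# Rows on chromosome 10 whose start lies inside the region (inclusive).
hits = conn.execute(query, params).fetchall()

# Ask SQLite how it plans to run the query; the last column of each plan
# row is a human-readable description that names any index being used.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, params).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)

print(len(hits))
print(plan_text)
```

In PostgreSQL the equivalent check is to put EXPLAIN in front of the select statement and look for idx_crisprs_mouse_loci in the output.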