benjaminsaljooghi
/
prospector
High-performance CRISPR-Cas discovery.
BSD 3-Clause "New" or "Revised" License
issues
Feat/benchmarking parasability
#83
zachnicoll
opened
2 years ago
0
Chore/make prospector compilable
#82
zachnicoll
closed
2 years ago
0
Genome-wide performance degradation could be solved by checking for all redundant profiles and removing them, or at least by separating the profiles into a "critical" list and an "extra sensitivity" list that includes all profiles. Other performance improvements should also be considered.
#81
benjaminsaljooghi
opened
3 years ago
0
Potential content erasure offset problem
#80
benjaminsaljooghi
opened
3 years ago
0
bug: GCF_001544635.1_ASM154463v1_genomic
#79
benjaminsaljooghi
opened
3 years ago
0
Array::mutant may generate a lot of CRISPRs and large anomalous CRISPRs given the large chunk size.
#78
benjaminsaljooghi
opened
4 years ago
1
Differential codon expansion predicated on domain organization of a gene. First domain searches for start, final domain searches for stop. The "domains" themselves should not be expanded, only the overarching gene.
#77
benjaminsaljooghi
opened
4 years ago
0
CLI args
#76
benjaminsaljooghi
opened
4 years ago
0
option to prune crisprs if no associated cas were found
#75
benjaminsaljooghi
opened
4 years ago
0
reduce domain sensitivity by including mean profile length in the profiles, and requiring that the discovered domain must be less than twice the size of the profile.
#74
benjaminsaljooghi
opened
4 years ago
0
potential problem with GCA_000011125 where cas4 actually exists in multiple separate regions of the translation
#73
benjaminsaljooghi
opened
4 years ago
0
CMake
#72
benjaminsaljooghi
closed
3 years ago
1
Create LICENSE
#71
benjaminsaljooghi
closed
4 years ago
0
if the crispr is contained within the translation (it normally will be) then strip it out before computing the translation
#70
benjaminsaljooghi
opened
4 years ago
0
compute translation in parallel by first pre-computing translation positions sequentially
#69
benjaminsaljooghi
opened
4 years ago
0
Containerization
#68
benjaminsaljooghi
opened
4 years ago
0
Subtype classification
#67
benjaminsaljooghi
opened
4 years ago
0
Hash table memory efficiency improvement via standard hash protocol with linked list for collisions. Acceptable because we're on CPU instead of CUDA.
#66
benjaminsaljooghi
closed
4 years ago
0
Accuracy evaluation with more genomes (assessing CRISPRs and Cas genes)
#65
benjaminsaljooghi
closed
4 years ago
0
More Cas profiles
#64
benjaminsaljooghi
closed
3 years ago
0
Allow upstream AND downstream detection of Cas genes
#63
benjaminsaljooghi
closed
4 years ago
0
Remove duplicate S in amino encoding
#62
benjaminsaljooghi
closed
4 years ago
0
CRISPR classification based on Cas composition
#61
benjaminsaljooghi
closed
4 years ago
0
Parameterize map_size_small rather than #define
#60
benjaminsaljooghi
closed
4 years ago
0
Pin memory at beginning that is then reused between genomes. Allocate large enough pinned memory to store the genome and qmaps. Can assume all genomes will be <= 5 MB for example, either way, pinning up to like 100 MB in total isn't too much
#59
benjaminsaljooghi
opened
4 years ago
1
in qmap3000 it's not necessary to compute the redundant 16-mers. This redundancy can be eliminated by using the map_size (3000) to determine whether there will be any "overlapping" queries, and using the earliest query if there is an overlap. The issue with this is that query_a will get its 3000, but query_b (contained within the 3000) may have additional targets beyond query_a + 3000.
#58
benjaminsaljooghi
closed
4 years ago
0
Memory management - struct constructors and destructors
#57
benjaminsaljooghi
opened
4 years ago
0
multi-genome execution -> GPU runs async to CPU (e.g. compute qmaps for all genomes while CPU sequentially processes qmap results)
#56
benjaminsaljooghi
opened
4 years ago
0
CUDA Cas gene detection
#55
benjaminsaljooghi
closed
4 years ago
0
CUDA crispr collection via qmap3000 post qmap64
#54
benjaminsaljooghi
closed
4 years ago
0
Prospector rework
#53
benjaminsaljooghi
closed
4 years ago
0
left-aligned AND right-aligned spacer similarity score
#52
benjaminsaljooghi
closed
4 years ago
0
rework spacer conservation such that the total length is considered, this will optimize for 1577775 - 1578027 30 thermophilus
#51
benjaminsaljooghi
closed
4 years ago
0
2-bit encoding rather than 4-bit encoding for crispr comparison
#50
benjaminsaljooghi
closed
4 years ago
0
int-encoded comparisons for crispr discovery
#49
benjaminsaljooghi
closed
4 years ago
0
PAM annotation by building a motif of the kmers adjacent to protospacers found in the BLAST results of spacer sequences
#48
benjaminsaljooghi
opened
4 years ago
0
require new mismatch ratio of <0.8 in order to get 1577775 - 1578027 30 (thermophilus). Therefore, consider removing the entire dyad enumeration and just relying on mutant checking in the genome. This means that we save a fair bit of memory with regard to the number of dyads we send to the crispr kernel. Instead, we just end up working with the genome, with each CUDA thread writing crispr arrays to its buffer based on mutant checks. I would expect this to actually improve performance because there's no need to do the dyad checks.
#47
benjaminsaljooghi
closed
4 years ago
0
Thesis updates
#46
benjaminsaljooghi
closed
4 years ago
0
Look for Cas genes upstream AND downstream of the CRISPR
#45
benjaminsaljooghi
closed
4 years ago
0
PAM annotation by building a motif of the kmers adjacent to protospacers found in the BLAST results of spacer sequences
#44
benjaminsaljooghi
closed
4 years ago
0
Compare protein domains of Cas nucleases
#43
benjaminsaljooghi
closed
4 years ago
0
Directionality consideration involves analyzing the opposite strand of the given genome
#42
benjaminsaljooghi
closed
4 years ago
0
Locus merging problem
#41
benjaminsaljooghi
closed
4 years ago
1
Improve BLAST formatted output so that it produces a similar result to BLAST+
#40
benjaminsaljooghi
closed
4 years ago
0
Transform blast demo app into a blast lib. Compile and link from an invoking app that uses both blast and prospector as libraries
#39
benjaminsaljooghi
closed
4 years ago
1
Move NCBI dirs to a new branch
#38
benjaminsaljooghi
closed
5 years ago
0
Remove archive dir
#37
benjaminsaljooghi
closed
5 years ago
1
Complete integration of BLAST with prospector. BLAST should report a spacer similarity score for each CRISPR array returned by prospector.
#36
benjaminsaljooghi
closed
4 years ago
1
Extend basic NCBI app to invoke existing CUDA code (prospector)
#35
benjaminsaljooghi
closed
5 years ago
1
Build basic NCBI application
#34
benjaminsaljooghi
closed
5 years ago
0