Currently, our "database" of protein sequences is a single JSON file on disk. Each search request is handled naively, as a linear scan over the JSON array of protein sequences, so query time grows in proportion to the number of stored proteins.
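For reference, the naive approach looks roughly like the sketch below. The record layout (an `id` and an `embedding` key per entry), the helper names, and the use of cosine similarity as the ranking score are assumptions made for illustration; the actual backend may differ.

```python
import json
import math


def load_records(path):
    """Load the JSON array of protein records from disk (hypothetical layout)."""
    with open(path) as f:
        return json.load(f)


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat zero vectors as dissimilar to everything.
    return dot / (norm_a * norm_b)


def linear_search(records, query, k=5):
    """Score every record against the query and return the ids of the top k."""
    scored = [(cosine_similarity(query, r["embedding"]), r["id"]) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [protein_id for _, protein_id in scored[:k]]
```

Every query touches every record, which is fine for a small file but becomes the bottleneck as the collection grows.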
Annoy is a simple library for storing and efficiently searching high-dimensional vectors (such as our protein embeddings) using approximate nearest-neighbor search.
We can integrate Annoy into our backend to speed up search and to scale to a growing number of protein sequences.