rdkit-rs / cheminee

The chemistry search stack

Scaffold-based indexing needed for quicker structure search #79

Closed JJ-Pineda closed 4 months ago

JJ-Pineda commented 4 months ago

Currently, cheminee's substructure search can be sluggish on large indexes (e.g. 10+ seconds to search 400,000+ compounds). This could be improved with scaffold-based indexing, i.e. each compound gets matched against a set of scaffolds during the indexing step. When a search is conducted, tantivy filtering on those scaffold matches would then significantly reduce the search space.
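
To make the idea concrete, here is a minimal sketch of scaffold-based prefiltering. It is not cheminee's implementation, and everything in it is an assumption for illustration: molecules are modelled as toy sets of fragment labels, `has_substructure` is a subset check standing in for a real RDKit substructure match, and an in-memory `postings` map stands in for the tantivy filter. The reasoning it relies on is that any compound containing the query must also contain every scaffold the query contains, so compounds missing one of those scaffolds can be skipped without running the expensive match.

```rust
use std::collections::{BTreeSet, HashMap};

// Hypothetical toy molecule: a set of fragment labels (stand-in for an RDKit molecule).
type Mol = BTreeSet<&'static str>;

// Helper to build a toy molecule from fragment labels.
fn mol(parts: &[&'static str]) -> Mol {
    parts.iter().copied().collect()
}

// Stand-in for a real (expensive) substructure match: here just a subset check.
fn has_substructure(target: &Mol, pattern: &Mol) -> bool {
    pattern.is_subset(target)
}

// Ids of all scaffolds that are substructures of `m`.
fn matching_scaffolds(m: &Mol, scaffolds: &[Mol]) -> BTreeSet<usize> {
    scaffolds
        .iter()
        .enumerate()
        .filter(|(_, s)| has_substructure(m, s))
        .map(|(i, _)| i)
        .collect()
}

struct ScaffoldIndex {
    scaffolds: Vec<Mol>,
    compounds: Vec<Mol>,
    // scaffold id -> ids of compounds containing that scaffold
    // (plays the role of the tantivy filter in this sketch)
    postings: HashMap<usize, BTreeSet<usize>>,
}

impl ScaffoldIndex {
    // Indexing step: match every compound against every scaffold once.
    fn build(scaffolds: Vec<Mol>, compounds: Vec<Mol>) -> Self {
        let mut postings: HashMap<usize, BTreeSet<usize>> = HashMap::new();
        for (cid, compound) in compounds.iter().enumerate() {
            for sid in matching_scaffolds(compound, &scaffolds) {
                postings.entry(sid).or_default().insert(cid);
            }
        }
        ScaffoldIndex { scaffolds, compounds, postings }
    }

    // Search step: returns (ids of matching compounds, number of candidates
    // that needed the full substructure check).
    fn search(&self, query: &Mol) -> (Vec<usize>, usize) {
        let query_scaffolds = matching_scaffolds(query, &self.scaffolds);
        let empty: BTreeSet<usize> = BTreeSet::new();

        // Start from all compounds, then keep only those that contain
        // every scaffold found in the query.
        let mut candidates: BTreeSet<usize> = (0..self.compounds.len()).collect();
        for sid in &query_scaffolds {
            let posting = self.postings.get(sid).unwrap_or(&empty);
            candidates = candidates.intersection(posting).copied().collect();
        }

        let checked = candidates.len();
        let hits: Vec<usize> = candidates
            .into_iter()
            .filter(|&cid| has_substructure(&self.compounds[cid], query))
            .collect();
        (hits, checked)
    }
}
```

In this sketch, a query that contains no known scaffold falls back to checking every compound, so the speed-up depends on how well the chosen scaffold set covers typical queries.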