basf / MolPipeline

MIT License

Use `SparseBitVect` to construct `csr_matrix` #54

Open c-w-feldmann opened 1 month ago

c-w-feldmann commented 1 month ago

From this comment:
It could be worth running an experiment to check whether the explicit bit vector from `GetFingerprintAsNumpy` can be replaced by a call to `GetSparseFingerprint`, which returns a `SparseBitVect` that is much smaller than the explicit version. This way, much less temporary memory would need to be allocated. However, I am not sure whether the `csr_matrix` constructor supports `SparseBitVect`. In addition, this more memory-efficient approach is probably slower than using NumPy arrays.
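One possible way to try this out, sketched here under assumptions rather than taken from the MolPipeline code: instead of passing a `SparseBitVect` to `csr_matrix` directly, collect only the on-bit positions of each fingerprint and build the three CSR arrays (`data`, `indices`, `indptr`) from them. The helper name `on_bits_to_csr_arrays` and the example bit positions are hypothetical. The sketch assumes the on-bit positions can be obtained from each fingerprint, for example via `GetOnBits()` on the RDKit bit vector.

```python
from typing import Iterable, List, Sequence, Tuple


def on_bits_to_csr_arrays(
    on_bits_per_row: Iterable[Sequence[int]],
) -> Tuple[List[int], List[int], List[int]]:
    """Build the (data, indices, indptr) triplet of a binary CSR matrix.

    Each element of `on_bits_per_row` holds the set-bit positions of one
    fingerprint (assumed to come from something like
    `list(sparse_bit_vect.GetOnBits())`).
    """
    data: List[int] = []
    indices: List[int] = []
    indptr: List[int] = [0]
    for on_bits in on_bits_per_row:
        row = sorted(on_bits)          # keep column indices sorted within a row
        indices.extend(row)            # column index of every set bit
        data.extend([1] * len(row))    # binary fingerprint: every stored value is 1
        indptr.append(len(indices))    # offset where the next row starts
    return data, indices, indptr


# Hypothetical on-bit positions of three fingerprints (made-up values).
example_rows = [[3, 17, 1024], [], [17, 2047]]
data, indices, indptr = on_bits_to_csr_arrays(example_rows)
```

The resulting triplet can be handed to `scipy.sparse.csr_matrix((data, indices, indptr), shape=(n_rows, n_bits))`, where `n_bits` is the fingerprint length. No dense row is allocated along the way, but whether this is actually faster or slower than the NumPy route would still need to be measured.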