A new `featurizer` module was added to RDMC, inspired by my recent work with the chemprop featurizer. So far, only fingerprints supported by RDKit are included: Morgan, atom pair, topological torsion, and RDKitFP. Hopefully, molgraph and condensed graph of reaction featurizers can be added in the future when I have time.
The addition provides a single, simple API call for querying the different fingerprints, built on `Chem.rdFingerprintGenerator`.
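For illustration, here is a minimal sketch of what such a unified call could look like on top of `rdFingerprintGenerator`. The names `GENERATORS` and `get_fingerprint`, as well as the defaults, are hypothetical and not necessarily the ones used in this PR; only the RDKit generator functions themselves are real.

```python
from rdkit import Chem
from rdkit.Chem import rdFingerprintGenerator

# Hypothetical lookup table: fingerprint name -> RDKit generator factory.
GENERATORS = {
    "morgan": rdFingerprintGenerator.GetMorganGenerator,
    "atom_pair": rdFingerprintGenerator.GetAtomPairGenerator,
    "topological_torsion": rdFingerprintGenerator.GetTopologicalTorsionGenerator,
    "rdkit": rdFingerprintGenerator.GetRDKitFPGenerator,
}


def get_fingerprint(mol, fp_type="morgan", count=True, fp_size=2048, **kwargs):
    """Return a fingerprint of length fp_size as a NumPy array.

    Extra keyword arguments (e.g. radius=2 for Morgan) are forwarded
    to the underlying RDKit generator factory.
    """
    generator = GENERATORS[fp_type](fpSize=fp_size, **kwargs)
    if count:
        return generator.GetCountFingerprintAsNumPy(mol)
    return generator.GetFingerprintAsNumPy(mol)


if __name__ == "__main__":
    mol = Chem.MolFromSmiles("CCO")
    for name in GENERATORS:
        print(name, get_fingerprint(mol, name, fp_size=1024).shape)
```

Building the generator on every call keeps the sketch short; when featurizing many molecules, creating the generator once and reusing it would likely be cheaper.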
P.S. I found another popular way of implementing the fingerprint calculation, e.g.:
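
    import numpy as np
    from rdkit import DataStructs
    from rdkit.Chem import AllChem

    def get_morgan_fingerprint(mol, radius=2, n_bits=1024):
        # Hashed count-based Morgan fingerprint folded to n_bits entries
        features_vec = AllChem.GetHashedMorganFingerprint(mol, radius=radius, nBits=n_bits)
        # Placeholder array; ConvertToNumpyArray resizes it to n_bits
        features = np.zeros((1,))
        DataStructs.ConvertToNumpyArray(features_vec, features)
        return features
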
I did a quick comparison between the implementation above and the one using `rdFingerprintGenerator`; the former does not scale as well as the latter as fpSize increases. The difference is negligible at 1024, but the former takes almost 2x as long at 2048.
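A comparison along these lines can be reproduced with a short `timeit` script. This is only a sketch under assumptions: the test molecule, the repeat count, and the use of `GetCountFingerprintAsNumPy` on the generator side are my choices for illustration, not necessarily what was used for the numbers quoted above, and timings will vary with the RDKit version and hardware.

```python
import timeit

import numpy as np
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem, rdFingerprintGenerator


def legacy_morgan(mol, radius=2, n_bits=1024):
    # Older-style API: hashed count vector, then conversion to NumPy.
    vec = AllChem.GetHashedMorganFingerprint(mol, radius=radius, nBits=n_bits)
    arr = np.zeros((1,))
    DataStructs.ConvertToNumpyArray(vec, arr)
    return arr


def generator_morgan(mol, radius=2, n_bits=1024):
    # Generator-based API: count fingerprint returned directly as NumPy.
    gen = rdFingerprintGenerator.GetMorganGenerator(radius=radius, fpSize=n_bits)
    return gen.GetCountFingerprintAsNumPy(mol)


def compare(smiles="CC(=O)Oc1ccccc1C(=O)O", sizes=(1024, 2048), number=200):
    """Time both implementations for each fingerprint size.

    Returns {n_bits: (legacy_seconds, generator_seconds)}.
    """
    mol = Chem.MolFromSmiles(smiles)
    results = {}
    for n_bits in sizes:
        t_legacy = timeit.timeit(
            lambda: legacy_morgan(mol, n_bits=n_bits), number=number
        )
        t_gen = timeit.timeit(
            lambda: generator_morgan(mol, n_bits=n_bits), number=number
        )
        results[n_bits] = (t_legacy, t_gen)
    return results


if __name__ == "__main__":
    for n_bits, (t_legacy, t_gen) in compare().items():
        print(f"fpSize={n_bits}: legacy {t_legacy:.4f} s, generator {t_gen:.4f} s")
```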