Extract groups of compounds which are derivatization products of one another.

We can use SMARTS patterns to filter out every compound that has already been derivatized.

For the compounds that are not derivatized, we can predict the derivatization products. We can then use the metadata matching or subsetting tool to check whether the predicted SMILES are present in the library, which should give exact matches. This step needs to be improved by adding a SMILES harmonization or canonicalization step, because the same structure can be written as several different SMILES strings. This could possibly be implemented with Open Babel.
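A minimal sketch of why canonicalization matters for the exact-match lookup. The function names `build_canonical_index` and `find_in_library` are hypothetical, and the canonicalizer is passed in as a plain callable so that any toolkit (Open Babel, as suggested above, or another) could supply it; none of this is the existing tool's implementation.

```python
from typing import Callable, Dict, Iterable, List


def build_canonical_index(
    library_smiles: Iterable[str],
    canonicalize: Callable[[str], str],
) -> Dict[str, List[str]]:
    """Map each canonical SMILES to the original library entries that produce it."""
    index: Dict[str, List[str]] = {}
    for smiles in library_smiles:
        # Different spellings of the same structure collapse onto one key.
        index.setdefault(canonicalize(smiles), []).append(smiles)
    return index


def find_in_library(
    query_smiles: str,
    index: Dict[str, List[str]],
    canonicalize: Callable[[str], str],
) -> List[str]:
    """Return the library entries matching the query after canonicalization."""
    return index.get(canonicalize(query_smiles), [])
```

With a real canonicalizer plugged in, a predicted product written as `C(O)C` would match a library entry stored as `OCC`, which a plain string comparison would miss.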

The next step is to use fingerprint similarity to pull out very similar compounds.
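As a rough sketch of this step, assuming fingerprints are available as sets of on-bit indices and that Tanimoto similarity is the chosen measure (the issue does not name one). The function names and the `0.8` threshold are placeholders, not values from the existing tools.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similar_pairs(
    fingerprints: Dict[str, FrozenSet[int]],
    threshold: float = 0.8,  # hypothetical cut-off, to be tuned on real data
) -> List[Tuple[str, str, float]]:
    """Return all compound pairs whose similarity reaches the threshold."""
    pairs = []
    for (name_a, fp_a), (name_b, fp_b) in combinations(fingerprints.items(), 2):
        score = tanimoto(fp_a, fp_b)
        if score >= threshold:
            pairs.append((name_a, name_b, score))
    return pairs
```

The pairwise loop is quadratic in the number of compounds, which is fine for a sketch but may need a smarter search for a large library.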

RECETOX / galaxytools #449