RECETOX / galaxytools

Set of Galaxy tool wrappers developed at RECETOX
MIT License

Find a tool which enables filtering coordination complexes and mixtures from a list of SMILES #430

Closed: hechth closed this issue 10 months ago

hechth commented 11 months ago

The tool's input should be either a full metadata table produced by the matchms metadata extractor or MSMetaEnhancer, or a single column of SMILES or InChI (which can be extracted from the full table with the text manipulation tools).
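To illustrate the single-column input, here is a minimal sketch of pulling one identifier column out of a metadata table. It assumes the table is tab-separated with a header row and that the column is named `smiles`; both are assumptions that should be checked against the actual export, and `extract_column` is a hypothetical helper, not part of matchms or MSMetaEnhancer.

```python
import csv
import io


def extract_column(table_text, column="smiles", delimiter="\t"):
    """Return the non-empty values of one column from a delimited table.

    The column name and delimiter are assumptions about the export format.
    """
    reader = csv.DictReader(io.StringIO(table_text), delimiter=delimiter)
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column {column!r} not found in table header")
    values = []
    for row in reader:
        value = row[column]
        # Skip rows where the identifier is missing or blank.
        if value and value.strip():
            values.append(value.strip())
    return values


if __name__ == "__main__":
    example = "compound_name\tsmiles\nethanol\tCCO\nsalt\t[Na+].[Cl-]\n"
    for smiles in extract_column(example):
        print(smiles)
```

In Galaxy itself the same step can be done with the existing text manipulation tools; the sketch is only meant for local testing.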

The tool should then filter out all SMILES or InChI that represent a mixture or a coordination complex. An example of a coordination complex is available here. It is probably a good idea to consult the literature on SMILES in Mendeley under "ChemicalIndentifiers" to get a better idea of how these compounds are represented in SMILES, and to run some local tests using the BioTransformer and MS-Finder Galaxy tools. There is also a paper on creating MS-Ready SMILES, which should give an idea of which compounds are not suited for MS.

Mixtures are usually represented by a "." in the SMILES, which separates disconnected components, but there might be other representations.
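As a starting point for local tests, the "." rule could be sketched in plain Python as below. This is a rough heuristic under stated assumptions, not the tool that was eventually chosen: the function names are hypothetical, the metal list is a small illustrative subset rather than a complete definition of coordination complexes, and the "." check also removes simple salts. For InChI, the sketch relies on multi-component structures having a "." in the formula layer (the first layer after the version prefix).

```python
import re

# Assumed, non-exhaustive set of transition-metal symbols used as a
# crude proxy for "may be a coordination complex".
TRANSITION_METALS = {
    "Ti", "V", "Cr", "Mn", "Fe", "Co", "Ni", "Cu", "Zn",
    "Ru", "Rh", "Pd", "Ag", "Cd", "Os", "Ir", "Pt", "Au", "Hg",
}

# Element symbol at the start of a bracket atom, after an optional isotope.
BRACKET_ELEMENT = re.compile(r"\[\d*([A-Z][a-z]?)")


def smiles_is_mixture(smiles):
    """True if the SMILES has more than one disconnected component."""
    return "." in smiles


def inchi_is_mixture(inchi):
    """True if the InChI formula layer lists more than one component."""
    layers = inchi.split("/")
    return len(layers) > 1 and "." in layers[1]


def smiles_has_transition_metal(smiles):
    """True if any bracket atom is one of the assumed metal symbols."""
    return any(sym in TRANSITION_METALS for sym in BRACKET_ELEMENT.findall(smiles))


def keep_ms_suitable(smiles_list):
    """Drop entries flagged as mixtures or as containing a listed metal."""
    return [
        s for s in smiles_list
        if not smiles_is_mixture(s) and not smiles_has_transition_metal(s)
    ]


if __name__ == "__main__":
    candidates = ["CCO", "[Na+].[Cl-]", "N[Pt](N)(Cl)Cl", "c1ccccc1"]
    print(keep_ms_suitable(candidates))
```

A cheminformatics toolkit would be more robust than string matching, for example by counting fragments after parsing, so this sketch is best used only to explore how the example data behaves.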

Some of the functionality we need might already be implemented in the ChemicalToolbox. So we either need to find an existing tool that does the job, or implement a new one, which could be based on Open Babel.

An example MSP file that should be used for the filtering, and from which the metadata can be obtained using the matchms_metadata_export function, is available for download.

step1_data.zip

wverastegui commented 10 months ago

The tool produces SMILES and InChI output files when provided with the respective identifiers as input files.