Closed JHucker closed 3 years ago
Hi! By default, all file readers skip records that contain errors. SMILESRead additionally performs extra checks.
You can skip these checks by passing the ignore argument.
SMILESRead(fname, header=True, ignore=True)
However, SMILESRead supports only simple cases of metadata parsing. For your data, the following pipeline is a better fit:
import csv, io
for record in csv.DictReader(io.StringIO(example_text), delimiter='\t'):
    record['reaction'] = smiles(record['ReactionSmiles'])
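A slightly fuller sketch of the same pipeline, in case it helps: it keeps every column as metadata and collects the rows whose SMILES fail to parse instead of losing them silently. Note that `parse_rows`, `demo_parser`, and the sample text are hypothetical names made up for illustration. In real use you would pass CGRtools' `smiles` function as `parser`; the sketch assumes the parser raises an exception on bad input, which you should verify for your version.

```python
import csv
import io


def parse_rows(text, parser, column='ReactionSmiles', delimiter='\t'):
    """Parse delimited text and return (parsed_rows, failed_rows).

    `parser` is any callable that turns a SMILES string into an object
    and raises an exception on bad input (assumption, see above).
    """
    parsed, failed = [], []
    for record in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        try:
            record['reaction'] = parser(record[column])
        except Exception as exc:
            # Keep the failing row together with the reason it failed.
            record['error'] = str(exc)
            failed.append(record)
        else:
            parsed.append(record)
    return parsed, failed


def demo_parser(value):
    # Stand-in for a real SMILES parser, used only to make the demo runnable.
    if '>>' not in value:
        raise ValueError('not a reaction SMILES')
    return value.split('>>')


example_text = 'ReactionSmiles\tSource\nCCO>>CC=O\tpatent A\nbroken\tpatent B\n'
ok_rows, bad_rows = parse_rows(example_text, demo_parser)
```

Returning the failed rows separately makes it easy to see how many records were dropped and why, which is the information that goes missing when a reader skips bad records by default.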
However, your example contains SMILES that are not covered by the OpenSMILES specification:
[N+](C1C=CC=C2C=1C=CN=C2)([O-])=O.[CH3:14][C:15]1[C:24]2[C:19](=[CH:20][CH:21]=[CH:22][CH:23]=2)[CH:18]=[CH:17][N:16]=1.Br.[Cl:26][C:27]1[C:32]([OH:33])=[C:31]([Cl:34])[C:30]([Cl:35])=[C:29]([Cl:36])[C:28]=1[Cl:37]>>[Cl:26][C:27]1[C:32]([O-:33])=[C:31]([Cl:34])[C:30]([Cl:35])=[C:29]([Cl:36])[C:28]=1[Cl:37].[CH3:14][C:15]1[C:24]2[C:19](=[CH:20][CH:21]=[CH:22][CH:23]=2)[CH:18]=[CH:17][NH+:16]=1 |f:4.5|
The trailing |f:4.5| block is information about component contraction, i.e. which components belong together.
This data is not supported for now. The feature is on the to-do list.
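Until then, one possible workaround is to split the trailing `|...|` block off before parsing. This is only a sketch: `split_extension` is a hypothetical helper, not part of CGRtools, and it assumes the extension is separated from the SMILES by a single space, as in the example above. The grouping information is returned but not interpreted.

```python
def split_extension(line):
    """Split 'SMILES |ext|' into (smiles, ext); ext is '' when absent."""
    core, sep, rest = line.strip().partition(' |')
    if not sep:
        # No extension block present: return the SMILES unchanged.
        return core, ''
    # Drop the closing bar of the extension block.
    return core, rest.rstrip('|')


core, ext = split_extension('CCO.Cl>>CC=O.Cl |f:0.1|')
plain, empty = split_extension('CCO>>CC=O')
```

The first value can then be passed to the SMILES parser, while the second can be stored as metadata so the grouping is not lost.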
Both of those solutions work great, and noted regarding the extended SMILES functionality. Thanks for your assistance.
On delimited data, I am having issues with SMILESRead: upon calling read(), only a fraction of the records are returned. However, manually iterating over the same data with smiles() generally returns them all. While I can use smiles() as a workaround, it would be great to use SMILESRead so that the other columns are parsed as metadata.
See the example below (using Python 3.8.10 and CGRtools 4.1.20):