rdkit / rdkit-js

A powerful cheminformatics and molecule rendering toolbelt for JavaScript, powered by RDKit.
https://rdkitjs.com
BSD 3-Clause "New" or "Revised" License

Properties are not extracted from SDF files #489

Open nishanthmerwin opened 2 months ago

nishanthmerwin commented 2 months ago

Describe the bug

I am trying to ingest an SDF file with RDKit JS, but the properties tagged on the SDF file are missing from the resulting molecule.

To Reproduce

Consider the following code:

const molString = ` 
ww   csweb09162414372D 0   0.00000     0.00000

  9  9  0  0  0  0  0  0  0  0999 V2000
  137.0000  319.8763    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  108.1690  305.9919    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  101.0483  274.7943    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  121.0000  249.7757    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  153.0000  249.7757    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  172.9517  274.7943    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  165.8310  305.9919    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  242.0000  338.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  280.0000  298.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
  2  3  1  0  0  0  0
  3  4  1  0  0  0  0
  4  5  1  0  0  0  0
  5  6  1  0  0  0  0
  1  7  1  0  0  0  0
  6  7  1  0  0  0  0
  7  8  1  0  0  0  0
  8  9  1  0  0  0  0
M  END

> <some_property>
0

> <PUBCHEM_COMPONENT_COUNT>
1

$$$$
`
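// The SDF text is stored in molString (the original snippet referenced an undefined molSDF).
const mol = window.RDKit.get_mol("\n" + molString + "$$$$\n");
const keys: string[] = mol.get_prop_list();
const metadata: Record<string, string> = {};
// Iterate over the property names themselves (for...in would yield array indices).
// This assumes get_prop_list() returns an iterable of strings, as the annotation above implies.
for (const key of keys) {
  console.log("mol." + key + " = " + mol.get_prop(key));
  metadata[key] = mol.get_prop(key);
}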
console.log(metadata);

Expected behavior

I would expect to see the properties "some_property" and "PUBCHEM_COMPONENT_COUNT", but instead these are missing.

Screenshots

[Screenshot: output of metadata in the example]

Version

This issue was noticed in RDKit version: 2024.03.5

Additional context

This is probably medium priority for me; my current workaround is to extract the properties manually, outside of RDKit. It would be nice to update get_mol, but it would be even more useful to export the functionality of Chem.SDMolSupplier, as described here: https://www.rdkit.org/docs/GettingStartedInPython.html#reading-sets-of-molecules
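
For reference, the manual workaround mentioned above can be sketched roughly as follows. This is a minimal TypeScript sketch, not RDKit JS functionality: `parseSdfProperties` is a hypothetical helper, and it assumes a single SDF record whose data items use the `> <name>` header form, with each value terminated by a blank line.

```typescript
// Hypothetical helper (not part of RDKit JS): collects the data items that
// follow the "M  END" line of a single SDF record into a plain object.
function parseSdfProperties(record: string): Record<string, string> {
  const props: Record<string, string> = {};
  const lines = record.split(/\r?\n/);

  // Data items start after the molblock terminator; if no terminator is
  // found, findIndex returns -1 and the scan starts at the first line.
  const end = lines.findIndex((l) => l.startsWith("M  END"));
  let i = end + 1;

  while (i < lines.length) {
    const line = lines[i];
    if (line.startsWith("$$$$")) break; // end of record

    // A data header looks like "> <some_property>".
    const header = line.startsWith(">") ? line.match(/<([^>]+)>/) : null;
    i++;
    if (!header) continue;

    // The value is every following line up to the next blank line.
    const value: string[] = [];
    while (
      i < lines.length &&
      lines[i].trim() !== "" &&
      !lines[i].startsWith("$$$$")
    ) {
      value.push(lines[i]);
      i++;
    }
    props[header[1]] = value.join("\n");
  }
  return props;
}
```

Called on the SDF text from the reproduction, this should return an object with the keys `some_property` and `PUBCHEM_COMPONENT_COUNT`. Values are kept as strings; any numeric conversion is left to the caller.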

ptosco commented 2 months ago

This is not a bug. get_mol only parses the molblock up to the M  END line, as it mimics the Python function Chem.MolFromMolBlock rather than Chem.SDMolSupplier.
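
To make the distinction concrete, here is a rough sketch of how a multi-record SDF string could be split so that only the molblock part of each record is handed to get_mol. `splitSdf` and `SdfRecord` are hypothetical names, not part of RDKit JS; the sketch assumes that records are terminated by `$$$$` lines and that each molblock ends with an `M  END` line.

```typescript
// Hypothetical types and helper (not part of RDKit JS).
interface SdfRecord {
  molblock: string; // everything up to and including the "M  END" line
  data: string;     // the data-item section that follows, possibly empty
}

function splitSdf(sdf: string): SdfRecord[] {
  const records: SdfRecord[] = [];

  // Each record ends with a line that starts with "$$$$".
  for (const chunk of sdf.split(/^\$\$\$\$.*\r?\n?/m)) {
    if (chunk.trim() === "") continue; // skip the empty tail after the last record

    // Locate the molblock terminator at the start of a line.
    const m = /^M {2}END.*$/m.exec(chunk);
    if (m === null) continue; // no molblock in this chunk

    const cut = m.index + m[0].length;
    records.push({
      molblock: chunk.slice(0, cut) + "\n",
      data: chunk.slice(cut).replace(/^\r?\n/, ""),
    });
  }
  return records;
}
```

Each `molblock` could then be passed to get_mol, while `data` holds the property section that get_mol does not read.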

nishanthmerwin commented 2 months ago

@ptosco is there an alternative in rdkitjs that would allow me to get the properties?

ptosco commented 2 months ago

No, at the moment there isn’t one.