mcs07 / PubChemPy

Python wrapper for the PubChem PUG REST API.
http://pubchempy.readthedocs.io
MIT License

How to retrieve compounds' physical property? #46

Open Yaoyx opened 4 years ago

Yaoyx commented 4 years ago

Is there any function in PubChemPy that can return compounds' physical properties?

BalooRM commented 4 years ago

The PubChemPy documentation (https://pubchempy.readthedocs.io/en/latest/guide/properties.html) outlines the properties that can be retrieved. It may not cover all of the physical properties that you seek.

The get_properties function allows the retrieval of specific properties without having to deal with entire compound records. This is especially useful for retrieving the properties of a large number of compounds at once:

import pubchempy as pcp
p = pcp.get_properties('IsomericSMILES', 'CC', 'smiles', searchtype='superstructure')

Multiple properties may be specified in a list, or in a comma-separated string. The available properties are: MolecularFormula, MolecularWeight, CanonicalSMILES, IsomericSMILES, InChI, InChIKey, IUPACName, XLogP, ExactMass, MonoisotopicMass, TPSA, Complexity, Charge, HBondDonorCount, HBondAcceptorCount, RotatableBondCount, HeavyAtomCount, IsotopeAtomCount, AtomStereoCount, DefinedAtomStereoCount, UndefinedAtomStereoCount, BondStereoCount, DefinedBondStereoCount, UndefinedBondStereoCount, CovalentUnitCount, Volume3D, XStericQuadrupole3D, YStericQuadrupole3D, ZStericQuadrupole3D, FeatureCount3D, FeatureAcceptorCount3D, FeatureDonorCount3D, FeatureAnionCount3D, FeatureCationCount3D, FeatureRingCount3D, FeatureHydrophobeCount3D, ConformerModelRMSD3D, EffectiveRotorCount3D, ConformerCount3D.
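As a rough sketch of requesting several properties at once, the snippet below passes a list of property names to get_properties for a list of CIDs. The helper names fetch_properties and index_by_cid, and the default property selection, are made up for illustration and are not part of PubChemPy. It assumes each returned record is a dict carrying a 'CID' key next to the requested properties.

```python
def fetch_properties(cids, properties=("MolecularFormula", "MolecularWeight", "XLogP", "TPSA")):
    """Request several properties for one or more CIDs in a single call.

    Hypothetical wrapper around pcp.get_properties; needs network access.
    """
    # Imported here so index_by_cid below stays usable without PubChemPy installed.
    import pubchempy as pcp

    # The namespace argument defaults to 'cid', so plain CIDs can be passed directly.
    return pcp.get_properties(list(properties), cids)


def index_by_cid(records):
    """Turn the list of result dicts into a {CID: {property: value}} mapping.

    Assumes every record has a 'CID' key (hypothetical helper).
    """
    return {
        record["CID"]: {key: value for key, value in record.items() if key != "CID"}
        for record in records
    }


if __name__ == "__main__":
    # 702 is ethanol and 962 is water; this part performs a live request.
    for cid, props in index_by_cid(fetch_properties([702, 962])).items():
        print(cid, props)
```

Keying the results by CID makes it easier to join them with your own compound table. Properties that PubChem does not have for a compound may simply be missing from that compound's record.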

Yaoyx commented 4 years ago

Thank you so much for your help. What I am looking for is the odor of compounds, which PubChemPy does not seem to include. Do you know of any other way to retrieve this kind of information from the PubChem website? Once again, thanks for your help.

khoivan88 commented 4 years ago

I wrote a script to extract the pKa of compounds from PubChem. Because pKa and odor are in the same category, I think the code to get odor would be quite similar. You just need to parse the XML to pull out the odor. Here is my script: https://www.github.com/khoivan88/pka_lookup/tree/master/src%2Fpka_lookup_pubchem.py

Specifically, a GET request to this URL returns the "Odor" info (replace {} with the compound's CID): https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data/compound/{}/XML?heading=Odor

You could replace the URL on line 136 of the script with the link above and then work with the returned XML to extract the info you want.
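For anyone who wants a starting point without adapting the pKa script, here is a minimal standalone sketch of the same idea using only the standard library. The function names (odor_url, extract_strings, fetch_odor) are made up for illustration. It assumes the odor descriptions sit in elements named String somewhere in the returned XML, so check a real response before relying on it.

```python
import urllib.request
import xml.etree.ElementTree as ET

# URL from the comment above; {} is filled with the compound's CID.
PUG_VIEW_URL = "https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data/compound/{}/XML?heading=Odor"


def odor_url(cid):
    """Build the PUG View URL for the Odor heading of one compound."""
    return PUG_VIEW_URL.format(cid)


def extract_strings(xml_text):
    """Collect the text of every <String> element, ignoring XML namespaces.

    Assumption: the odor descriptions are stored in <String> elements.
    """
    root = ET.fromstring(xml_text)
    strings = []
    for elem in root.iter():
        # Tags look like '{namespace}String'; keep only the local name.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "String" and elem.text and elem.text.strip():
            strings.append(elem.text.strip())
    return strings


def fetch_odor(cid, timeout=30):
    """Download the Odor section for a CID and return its text entries.

    Needs network access; HTTP errors are not handled here.
    """
    with urllib.request.urlopen(odor_url(cid), timeout=timeout) as response:
        return extract_strings(response.read())


if __name__ == "__main__":
    # 702 is the CID of ethanol; this performs a live request.
    for description in fetch_odor(702):
        print(description)
```

Matching on the local tag name keeps the parsing independent of the exact namespace and nesting depth, at the cost of also picking up any other String elements in the response.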

Yaoyx commented 4 years ago

Thank you so much!