mcs07 / PubChemPy

Python wrapper for the PubChem PUG REST API.
http://pubchempy.readthedocs.io
MIT License

Fetch Solubility data #16

Closed by rraadd88 7 years ago

rraadd88 commented 7 years ago

I want to get Solubility data (4.2.7) for a list of compounds. get_properties fetches 4.1 Computed Properties but I am not sure how to get 4.2 Experimental Properties.

Please let me know what would be the best way to fetch this field. Thanks in advance.

mcs07 commented 7 years ago

I don't believe solubility data is available via the PUG REST API: https://pubchem.ncbi.nlm.nih.gov/pug_rest/PUG_REST.html

There is an undocumented API that is used to assemble the PubChem compound web pages: https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data/compound/2244/JSON

This looks like it has the info you want. You could do something like:

import requests

# PUG View record for CID 2244; sections nest under each other by TOCHeading.
r = requests.get('https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data/compound/2244/JSON')
r.raise_for_status()  # stop early on an HTTP error
for section in r.json()['Record']['Section']:
    if section.get('TOCHeading') == 'Chemical and Physical Properties':
        for subsection in section.get('Section', []):
            if subsection.get('TOCHeading') == 'Experimental Properties':
                for subsubsection in subsection.get('Section', []):
                    if subsubsection.get('TOCHeading') == 'Solubility':
                        print(subsubsection)
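
The nested loops above all do the same thing: pick the child section with a given TOCHeading and descend into it. A minimal sketch of that idea as a reusable helper follows. The name `find_section` and the `sample` data are hypothetical, made up for illustration; the sketch assumes only the `TOCHeading` / `Section` nesting used in the snippet above.

```python
def find_section(sections, path):
    """Follow a list of TOCHeading names through nested 'Section' lists.

    Returns the matching section dict, or None if any heading is missing.
    """
    current = None
    for heading in path:
        # First child whose TOCHeading matches, or None if there is none.
        current = next(
            (s for s in sections if s.get('TOCHeading') == heading), None
        )
        if current is None:
            return None
        sections = current.get('Section', [])
    return current


# Hand-made stand-in for r.json()['Record']['Section'], not real PubChem output.
sample = [
    {'TOCHeading': 'Chemical and Physical Properties', 'Section': [
        {'TOCHeading': 'Experimental Properties', 'Section': [
            {'TOCHeading': 'Solubility', 'Information': []},
        ]},
    ]},
]
path = ['Chemical and Physical Properties', 'Experimental Properties', 'Solubility']
print(find_section(sample, path))
```

With a real response, `find_section(r.json()['Record']['Section'], path)` would stand in for the three nested loops, and a missing heading gives None rather than a KeyError.
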
rraadd88 commented 7 years ago

Works great! Thanks a lot.

import requests

def fetch_water_solubility(CID):
    """Return (value, unit) from the 'Water Solubility' entry for a CID, or None if there is none."""
    r = requests.get('https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data/compound/%s/JSON' % CID)
    r.raise_for_status()  # stop early on an HTTP error
    for section in r.json()['Record']['Section']:
        if section.get('TOCHeading') == 'Chemical and Physical Properties':
            for subsection in section.get('Section', []):
                if subsection.get('TOCHeading') == 'Experimental Properties':
                    for subsubsection in subsection.get('Section', []):
                        if subsubsection.get('TOCHeading') == 'Solubility':
                            for s3section in subsubsection.get('Information', []):
                                if s3section.get('Name') == 'Water Solubility':
                                    return s3section['NumValue'], s3section['ValueUnit']
    return None  # no 'Water Solubility' entry found
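
Since the original question was about a list of compounds, here is a rough sketch of looping a per-CID lookup over several CIDs. The helper name `collect_solubility`, the `fetch` parameter, and the `delay` default are assumptions made for illustration, not part of PubChemPy or PubChem. Any function that takes a CID and returns a result (such as `fetch_water_solubility` above) can be passed as `fetch`.

```python
import time


def collect_solubility(cids, fetch, delay=0.2):
    """Map each CID to fetch(cid), or to None if the lookup fails.

    `fetch` is any callable taking a CID; `delay` is a pause in seconds
    between calls (an arbitrary default chosen for this sketch).
    """
    results = {}
    for i, cid in enumerate(cids):
        if i and delay:
            time.sleep(delay)  # pause between requests, not before the first
        try:
            results[cid] = fetch(cid)
        except (KeyError, ValueError, OSError):
            # Missing field, undecodable JSON, or a network/HTTP error.
            results[cid] = None
    return results
```

Usage would look like `collect_solubility([2244, 2519], fetch_water_solubility)`. Compounds without a matching entry, or whose request fails, map to None instead of aborting the whole run.
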