IAMconsortium / pyam

Analysis & visualization of energy & climate scenarios
https://pyam-iamc.readthedocs.io/
Apache License 2.0

download a whole database snapshot #271

Closed byersiiasa closed 4 years ago

byersiiasa commented 4 years ago

Currently I can't find a way to pull a whole database from the server.

If all models reported the same complete set of variables, one could iterate through a list of models. But where different models have reported different or incomplete sets, this isn't possible. Perhaps a whole new function would be best, like conn.query(model='*', variable='*', region='*').
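
To make the proposal concrete, here is a minimal sketch of how wildcard filters along these lines might behave. It is not pyam's implementation: the `RECORDS` list and this standalone `query` function are hypothetical stand-ins for the server data and for `conn.query`, and the matching relies on the standard-library `fnmatch` module.

```python
from fnmatch import fnmatchcase

# Hypothetical records standing in for rows held on the server.
RECORDS = [
    {"model": "model_a", "variable": "Emissions|CO2", "region": "World"},
    {"model": "model_a", "variable": "Primary Energy", "region": "World"},
    {"model": "model_b", "variable": "Emissions|CO2", "region": "World"},
]


def query(records, model="*", variable="*", region="*"):
    """Return the records whose fields match the given wildcard patterns."""
    patterns = {"model": model, "variable": variable, "region": region}
    return [
        rec
        for rec in records
        if all(fnmatchcase(rec[key], pat) for key, pat in patterns.items())
    ]


# With all defaults, every record is returned, whatever each model reported.
everything = query(RECORDS)

# A narrower pattern still works across models with different variable sets.
co2_only = query(RECORDS, variable="Emissions|*")
```

Because the defaults are all `'*'`, a call with no arguments returns the whole set, which is the "snapshot" behaviour asked for here.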

A little similar to #137

byersiiasa commented 4 years ago

I will try a PR to edit the read_iiasa function, but for non-public databases this requires adding credentials.

byersiiasa commented 4 years ago

Can someone (perhaps @zikolach or Peter Kolp) please check my branch? I've added the necessary fix to the read_iiasa function, but it is failing on something to do with sub-annual data.

Lines 344 and 361 of https://github.com/byersiiasa/pyam/blob/fix_read_iiasa/pyam/iiasa.py

df = pyam.read_iiasa('IXSE_AR6', user='byers', pw='*******')
INFO:root:You are connected to the IXSE_AR6 scenario explorer hosted by IIASA. If you use this data in any published format, please cite the data as provided in the explorer guidelines: https://data.ene.iiasa.ac.at/ar6-scenario-submission/#/about.
Traceback (most recent call last):

  File "<ipython-input-2-01a089b1c36c>", line 1, in <module>
    df = pyam.read_iiasa('IXSE_AR6', user='byers', pw='****')

  File "c:\users\byers\src\pyam\pyam\iiasa.py", line 358, in read_iiasa
    df = conn.query(**kwargs)

  File "c:\users\byers\src\pyam\pyam\iiasa.py", line 328, in query
    if pd.Series([i in [-1, 'year'] for i in df.subannual]).all():

  File "C:\Users\byers\Continuum\Anaconda3\envs\pyamTV\lib\site-packages\pandas\core\generic.py", line 5180, in __getattr__
    return object.__getattribute__(self, name)

AttributeError: 'DataFrame' object has no attribute 'subannual'
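
The check that fails above assumes the DataFrame always has a `subannual` column. A minimal sketch of a guarded version follows. The helper name `is_yearly` is hypothetical, and treating a missing column as "yearly data" is an assumption, not confirmed behaviour of pyam.

```python
import pandas as pd


def is_yearly(df):
    """Return True if `df` carries no sub-annual resolution.

    Assumption: a missing 'subannual' column is treated as yearly data.
    """
    if "subannual" not in df.columns:
        return True
    # Same test as in the traceback, expressed with Series.isin
    return bool(df["subannual"].isin([-1, "year"]).all())


# Hypothetical frames covering the three cases
without_column = pd.DataFrame({"year": [2010, 2020], "value": [1.0, 2.0]})
yearly = pd.DataFrame({"subannual": ["year", -1], "value": [1.0, 2.0]})
monthly = pd.DataFrame({"subannual": ["January", "year"], "value": [1.0, 2.0]})
```

Using `df["subannual"]` with an explicit column check avoids the attribute lookup that raised the `AttributeError`.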

byersiiasa commented 4 years ago

BTW, the same occurs for the SR15 SE. Sometimes, before this error, it also throws a similar error from the query() function. It can be reproduced by:

import pyam
df = pyam.read_iiasa('IXSE_AR6', user='byers', pw='****')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\GitHub\pyam\pyam\iiasa.py", line 363, in read_iiasa
    df = conn.query(**kwargs)
  File "C:\GitHub\pyam\pyam\iiasa.py", line 325, in query
    .drop(columns='runId')
  File "C:\Users\byers\AppData\Local\Continuum\anaconda3\envs\pyam_dev\lib\site-packages\pandas\core\frame.py", line 3940, in drop
    errors=errors)
  File "C:\Users\byers\AppData\Local\Continuum\anaconda3\envs\pyam_dev\lib\site-packages\pandas\core\generic.py", line 3780, in drop
    obj = obj._drop_axis(labels, axis, level=level, errors=errors)
  File "C:\Users\byers\AppData\Local\Continuum\anaconda3\envs\pyam_dev\lib\site-packages\pandas\core\generic.py", line 3812, in _drop_axis
    new_axis = axis.drop(labels, errors=errors)
  File "C:\Users\byers\AppData\Local\Continuum\anaconda3\envs\pyam_dev\lib\site-packages\pandas\core\indexes\base.py", line 4965, in drop
    '{} not found in axis'.format(labels[mask]))
KeyError: "['runId'] not found in axis"

This can be avoided by adding an if statement around line 323 that skips the drop when runId is not present (not sure whether that is a good thing or not...)

e.g.

df = pd.read_json(r.content, orient='records')
# Drop 'runId' only if the response actually contains it
if 'runId' in df.columns:
    df.drop(columns='runId', inplace=True)
df.rename(columns={'time': 'subannual'}, inplace=True)
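
An alternative to the explicit `if` is pandas' own `errors='ignore'` option on `DataFrame.drop`. Below is a small self-contained sketch. The two frames are hypothetical stand-ins for the parsed server response, and the helper name `clean` is made up for illustration.

```python
import pandas as pd


def clean(df):
    """Drop 'runId' if present and rename 'time' to 'subannual'."""
    # errors='ignore' makes drop a no-op when the column is missing
    return df.drop(columns="runId", errors="ignore").rename(
        columns={"time": "subannual"}
    )


# Hypothetical responses: one with a 'runId' column, one without
with_run_id = pd.DataFrame({"runId": [1, 2], "time": ["year", "year"]})
without_run_id = pd.DataFrame({"time": ["year", "year"]})

cleaned_a = clean(with_run_id)
cleaned_b = clean(without_run_id)
```

Both calls return a frame with only a `subannual` column. Like the `if` version, this does not explain why `runId` is missing in the first place.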

danielhuppmann commented 4 years ago

@byersiiasa @zikolach in my opinion, this issue can be closed - correct?

zikolach commented 4 years ago

I think so. @byersiiasa, could you please check if everything works for you?