IAMconsortium / pyam

Analysis & visualization of energy & climate scenarios
https://pyam-iamc.readthedocs.io/
Apache License 2.0

Hotfix AR6 database reading #649

Closed znicholls closed 2 years ago

znicholls commented 2 years ago

Description of PR

The downloader doesn't appear happy with the AR6 database; see the example and traceback below. This PR first suggests a fix and then, hopefully, implements it.

import pyam

# Read all CO2 emissions variables for the World region from the AR6 database
df = pyam.read_iiasa(
    "ar6-public",
    model="*",
    variable=["*Emissions|CO2*"],
    region="World",
    # meta=['category']
)

>>> ValueError                                Traceback (most recent call last)
...
venv/lib/python3.7/site-packages/fsspec/registry.py in get_filesystem_class(protocol)
    214     if protocol not in registry:
    215         if protocol not in known_implementations:
--> 216             raise ValueError("Protocol not known: %s" % protocol)
    217         bit = known_implementations[protocol]
    218         try:

ValueError: Protocol not known: [{"model":"AIM/CGE 2.0","scenario":"ADVANCE_2020_Med2C","scheme":null,"annotation":"import scenario data via backdoor import","metadata":{"Regional_scope":"global","Time horizon":2.1E+3,"Median peak warming (FaIRv1.6.2)":1.7729750809289970270299363619415089488,"Year of peak CO2 Emissions (Harm-Infilled)":2.02E+3,"CH4 emissions reductions 2019-2050 % modelled Harmonized-Infilled":50.97872611551429145038127899169921875,"Median year of peak warming (FaIRv1.6.2)":2.1E+3,"Cumulative net-negative CO2 (post net-zero, Gt CO2) (Harm-Infilled)":0,"Median peak warming (MAGICCv7.5.3)":1.84227485889352804449003997433464974164,"Ssp_family":2,"Scenario_scope":"Global integrated scenario","GHG emissions 2030 Gt CO2-equiv/yr (Harmonized-Infilled)":30.86722290191544004755996866151690483093,"Cumulative net CO2 (2020-2100, Gt CO2) (Harm-Infilled)":1280.2565473162289890751708298921585083,"Category_FaIRv1.6.2":"C3","GHG emissions 2050 Gt CO2-equiv/yr (Harmonized-Infilled)":25.17110554908926900452570407651364803314,"Category_Vetting_historical":"failed_Vetting_historical_C4","Vetting_future":"Warning","Sectoral_scope":"integrated/energy-emissions","Category_subset":"C4","GDP|MER-per-capita-in-2100_bin":"Low","CO2 emissions 2100 Gt CO2/yr":7.75843819999999961822823024704121053218,"Population-in-2100":8990.633099999999103602021932601928710937,"Policy_category":"P2a","Exceedance Probability 2.0C (MAGICCv7.5.3)":0.3433333333333333237114004532486433163285,"Peak Emissions|CO2":43.94327670000001262451405636966228485107,"Exceedance Probability 2.0C (FaIRv1.6.2)":0.260169870362092103821538557895109988749,"Population-in-2100_bin":"Medium","GDP|MER-per-capita-in-2100":34800.374625453237968031316995620727539,"Project_study":"ADVANCE","Exceedance Probability 1.5C (FaIRv1.6.2)":0.8207420652659812576601439104706514626741,"CO2 emissions 2030 Gt CO2/yr":23.87238299999999924239091342315077781677,"Peak 
Emissions|GHGs":57.42237498583359212034338270314037799835,"Technology_category_name":"T0: Standard technology assumptions","Literature Reference (if applicable)":"https
znicholls commented 2 years ago

If I could get a hand adding a test that checks that reading from the AR6 database behaves as intended, that would be great.

znicholls commented 2 years ago

iam-units is waiting on https://github.com/IAMconsortium/units/pull/37; tests won't pass until that happens.

znicholls commented 2 years ago

I think the error is due to a change in pandas (https://stackoverflow.com/questions/63553845/pandas-read-json-valueerror-protocol-not-known), although I'm a bit puzzled why there is no issue with reading from the SR1.5 database, as I thought the same error would occur... You could first try the cast-to-string solutions from that thread, or you could just use r.json() rather than r.text as suggested here.
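A minimal sketch of the two workarounds mentioned above, assuming the response body is a JSON array of records. The payload is a shortened, hypothetical stand-in for the server response and no request is made; with requests, `r.json()` would return the same parsed list as `json.loads(r.text)`.

```python
import io
import json

import pandas as pd

# Hypothetical, shortened stand-in for the text of the HTTP response (r.text).
text = json.dumps(
    [
        {"model": "AIM/CGE 2.0", "scenario": "ADVANCE_2020_Med2C"},
        {"model": "model_a", "scenario": "scenario_a"},
    ]
)

# Option 1: parse the JSON first (what r.json() does), then build the frame.
records = json.loads(text)
df_from_records = pd.DataFrame(records)

# Option 2: wrap the string in a file-like object so that pandas cannot
# mistake it for a path or URL.
df_from_buffer = pd.read_json(io.StringIO(text), orient="records")
```

Either option avoids handing a raw JSON string to `pd.read_json`, which is the call that appears to end up in the fsspec protocol lookup shown in the traceback.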

gidden commented 2 years ago

cc @danielhuppmann @phackstock and @byersiiasa

coroa commented 2 years ago

> I think the error is due to a change in pandas (https://stackoverflow.com/questions/63553845/pandas-read-json-valueerror-protocol-not-known), although I'm a bit puzzled why there is no issue with reading from the SR1.5 database, as I thought the same error would occur... You could first try the cast-to-string solutions from that thread, or you could just use r.json() rather than r.text as suggested here.

I think the change you are referring to has been addressed in #621, but I am not sure: the content attribute is a bytes string, whereas text has already been decoded. I'll take a quick look to see whether I can figure out what I think the problem is instead.
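To illustrate the bytes-versus-decoded distinction, here is a small standard-library sketch. The payload is hypothetical and UTF-8 is assumed; with requests, `Response.content` is the raw bytes and `Response.text` is the decoded string.

```python
import json

# Hypothetical raw body, as bytes (what requests exposes as Response.content).
content = b'[{"model": "model_a", "scenario": "scenario_a"}]'

# Decoding yields a str (what requests exposes as Response.text; requests
# chooses the encoding itself, UTF-8 is assumed here).
text = content.decode("utf-8")

# json.loads accepts both str and UTF-8 encoded bytes (Python 3.6+).
parsed_from_bytes = json.loads(content)
parsed_from_text = json.loads(text)
```

Both calls give the same parsed list, so the difference only matters for code that inspects the type of its input before parsing.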

coroa commented 2 years ago

I don't encounter any error with the current main branch. Are you sure it has not been fixed by #621? @znicholls

coroa commented 2 years ago

Had a second look. I am getting the same problem you are reporting with Python 3.7. Starting from Python 3.8, it works fine. I guess using the json method from requests would get around that, as you propose here.

znicholls commented 2 years ago

It's not a big issue, I'll just use a Python 3.9 environment. Thanks for taking a look!