IAMconsortium / pyam

Analysis & visualization of energy & climate scenarios
https://pyam-iamc.readthedocs.io/
Apache License 2.0

read_iiasa() - filter also by meta information #789

Open bs538 opened 8 months ago

bs538 commented 8 months ago

It would be helpful if read_iiasa() could also filter by meta information such as the climate assessment category. Use case: I tried to download data for all C1 scenarios from the AR6-DB. This works with the code below, but when the query is extended beyond the World region and a few selected variables, the memory requirements become very large. Allowing meta information in the query (as in the commented-out line) would remove the need to first download a large dataset and hold it in memory.

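import pyam

df = pyam.read_iiasa(
    "ar6-public",
    region="World",
    variable=["Emissions|CO2", "Primary Energy|Coal"],
    # Category="C1",  # desired filter, currently not supported
    meta=["Category", "IMP_marker"],
)

# Workaround: filter by category after the download
df.filter(Category="C1", inplace=True)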

(With thanks to @jkikstra for confirming that this is currently not possible.)

danielhuppmann commented 8 months ago

Indeed, this was possible (I believe) but was lost a while back due to a regression in the RestAPI of the ixmp package.

We will implement this feature as part of the migration to the ixmp4 package for hosting IIASA scenario databases, see http://docs.ece.iiasa.ac.at/ixmp4

rongqizhu commented 2 months ago

I have the same need. Following your example, I tried to filter by category, as shown in the screenshot. But I have two questions:

  1. Why can the filtered IamDataFrame not be assigned to a new variable? It seems to be an empty table.
  2. How can I filter for multiple categories at once, like df.filter(Category=["C1","C2","C3"], inplace=True)?

@danielhuppmann, @jkikstra, do you have any ideas on how to address this? Thank you!

[image]

danielhuppmann commented 2 months ago

This question is unrelated to the initial issue; it is purely a mistake in your Python code. You use the argument inplace=True, so the filter operation happens directly on ar6_gas (which, as you can see in the following cell, now only has the C1 category).

When you call ar6_gas.filter(Category="C1", inplace=True), the method does not return any object, hence ar6_gas_c1 is None - see the error message in the last cell of the screenshot.

This behavior directly follows the equivalent pandas behavior for inplace, so please read the docs there.
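
For readers who have not run into this before, here is a minimal sketch of the pandas behavior referred to above. It uses a plain pandas DataFrame as a stand-in rather than a pyam IamDataFrame, and the scenario names and categories are made up for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for scenario meta data; the values are made up.
meta = pd.DataFrame(
    {
        "scenario": ["scen_a", "scen_b", "scen_c"],
        "Category": ["C1", "C2", "C1"],
    }
)

# Default (inplace=False): the original is untouched and a new object is returned.
meta_c1 = meta.query("Category == 'C1'")

# inplace=True: the object itself is modified and the call returns None,
# so assigning the return value to a new variable yields None.
working_copy = meta.copy()
result = working_copy.query("Category == 'C1'", inplace=True)

print(len(meta), len(meta_c1), len(working_copy), result)
```

In short: either assign the return value and leave out inplace=True, or use inplace=True and keep working with the original variable, but not both.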

rongqizhu commented 2 months ago

Thank you for your reply. That was an oversight on my part, and I have already solved the problem.