IAMconsortium / pyam

Analysis & visualization of energy & climate scenarios
https://pyam-iamc.readthedocs.io/
Apache License 2.0

What should the behavior of the `variables` function be when filtering with the `level` argument? #4

Closed gidden closed 6 years ago

gidden commented 6 years ago

See entries 10 and 11 in the IAMC presentation, reproduced below:

df.variables(filters={'variable': 'Emissions|*', 'level': 2})

['Emissions|CO2',
 'Emissions|CO2|Fossil Fuels and Industry',
 'Emissions|CO2|Fossil Fuels and Industry|Energy Supply']

and

df.variables(filters={'level': 1})

['Emissions|CO2', 'Price|Carbon', 'Primary Energy', 'Primary Energy|Coal']

gidden commented 6 years ago

The current behavior returns all entries between the search regex and the specified level, i.e., everything up to and including that level. My initial thought is that it should return only the entries exactly at the specified level below the search regex. That is, the following should hold:

df.variables(filters={'variable': 'Emissions|*', 'level': 2})

['Emissions|CO2|Fossil Fuels and Industry|Energy Supply']

and

df.variables(filters={'level': 1})

['Emissions|CO2', 'Price|Carbon', 'Primary Energy|Coal']
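
The difference between the two behaviors can be sketched on a plain list of variable names. This is only an illustration, not pyam's implementation: the helper names (`depth`, `filter_up_to_level`, `filter_exact_level`) and the six-entry sample list are assumptions, chosen so that the results reproduce the outputs quoted in this thread. Depth is taken here to mean the number of `|` separators after the fixed part of the search pattern.

```python
import operator

# Hypothetical sample data, chosen to reproduce the outputs quoted in this thread.
VARIABLES = [
    "Emissions|CO2",
    "Emissions|CO2|Fossil Fuels and Industry",
    "Emissions|CO2|Fossil Fuels and Industry|Energy Supply",
    "Price|Carbon",
    "Primary Energy",
    "Primary Energy|Coal",
]


def depth(name, pattern="*"):
    """Count '|' separators in the part of `name` after the pattern's fixed prefix.

    Simplification: only the text before the first '*' is used as the prefix.
    Returns None if `name` does not start with that prefix.
    """
    prefix = pattern.split("*", 1)[0]
    if not name.startswith(prefix):
        return None
    return name[len(prefix):].count("|")


def _select(variables, level, pattern, compare):
    """Keep names whose depth satisfies compare(depth, level)."""
    selected = []
    for name in variables:
        d = depth(name, pattern)
        if d is not None and compare(d, level):
            selected.append(name)
    return selected


def filter_up_to_level(variables, level, pattern="*"):
    """Behavior described as current: everything up to and including `level`."""
    return _select(variables, level, pattern, operator.le)


def filter_exact_level(variables, level, pattern="*"):
    """Behavior proposed in this comment: only names exactly at `level`."""
    return _select(variables, level, pattern, operator.eq)


print(filter_up_to_level(VARIABLES, 2, "Emissions|*"))
print(filter_exact_level(VARIABLES, 2, "Emissions|*"))
print(filter_up_to_level(VARIABLES, 1))
print(filter_exact_level(VARIABLES, 1))
```

With the sample list, the two `filter_up_to_level` calls return the lists from the presentation, and the two `filter_exact_level` calls return the lists proposed here.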

Let's put this on ice and revisit it later.

gidden commented 6 years ago

Although it is perhaps an opportune time to discuss this issue @danielhuppmann.

My thought is that you want output that is as precise as possible from your function. Here, if you filter for level=2, then you should only get variables like foo|bar|baz, and not foo|bar. We could perhaps add some additional logic like level='2-' and level='2+', meaning "two or less" and "two or more" respectively.
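
A minimal sketch of how such level specifications might be interpreted, assuming a level is either an integer (exact match) or a string ending in `-` or `+`. The function names `level_predicate`, `pipe_depth`, and `filter_by_level` are hypothetical and not part of pyam; depth is counted as the number of `|` separators in the full variable name.

```python
def level_predicate(level):
    """Turn a level spec into a function that tests a depth.

    2    -> exactly two
    '2-' -> two or less
    '2+' -> two or more
    """
    if isinstance(level, int):
        return lambda d: d == level
    text = str(level).strip()
    if text.endswith("+"):
        bound = int(text[:-1])
        return lambda d: d >= bound
    if text.endswith("-"):
        bound = int(text[:-1])
        return lambda d: d <= bound
    exact = int(text)  # raises ValueError for anything unparseable
    return lambda d: d == exact


def pipe_depth(name):
    """Number of '|' separators in a variable name."""
    return name.count("|")


def filter_by_level(variables, level):
    """Keep the names whose depth satisfies the level spec."""
    keep = level_predicate(level)
    return [name for name in variables if keep(pipe_depth(name))]


names = ["foo", "foo|bar", "foo|bar|baz", "foo|bar|baz|qux"]
print(filter_by_level(names, 2))     # exact level
print(filter_by_level(names, "2-"))  # two or less
print(filter_by_level(names, "2+"))  # two or more
```

Keeping the integer form as an exact match means the default stays precise, and the broader selections have to be requested explicitly.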

Maybe it would be best to hear the use cases you've found yourself in most frequently to figure out what (if anything) to do.

danielhuppmann commented 6 years ago

The main use case for me was exploring which variables existed in the snapshot while preventing the list from becoming too long, as it does when simply filtering for foo*. I would first do foo, level=1, then foo|bar, level=1 to iteratively explore the variable tree, and then copy-paste the relevant variables into a list (don't judge)...

Given that no clear hierarchical structure is enforced in IAMC-style variables in general, fancier options, such as aggregating the total of foo|bar by writing foo|bar|*, level=0, are not going to work, I fear...
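
The iterative exploration described in this comment can be sketched as a helper that lists only the nodes directly below a given parent. This is an illustration under stated assumptions, not pyam functionality: the function name `children` and the sample names are hypothetical. Because no hierarchy is enforced, the sketch derives nodes from name prefixes, so it may return an intermediate node (such as `foo|qux` below) that does not itself exist as a variable.

```python
def children(variables, parent=None):
    """List the distinct nodes one step below `parent` in the '|' tree.

    With parent=None, the top-level nodes are returned. Nodes are derived
    from name prefixes, so a returned node need not be a variable itself.
    """
    prefix = "" if parent is None else parent + "|"
    found = set()
    for name in variables:
        if name.startswith(prefix) and len(name) > len(prefix):
            # Keep only the next segment after the prefix.
            next_segment = name[len(prefix):].split("|", 1)[0]
            found.add(prefix + next_segment)
    return sorted(found)


# Hypothetical sample; 'foo|qux' is deliberately missing as a variable.
names = ["foo", "foo|bar", "foo|bar|baz", "foo|qux|a", "other"]

print(children(names))             # top-level nodes
print(children(names, "foo"))      # one step below foo
print(children(names, "foo|bar"))  # one step below foo|bar
```

Each call keeps the list short, which matches the stated goal of walking the tree step by step instead of listing everything that matches foo*.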

gidden commented 6 years ago

OK, I think the PR may still meet your needs. Let me know what you think. It is probably best to look at the notebook.

On Thu, Feb 22, 2018, 17:48 Daniel Huppmann notifications@github.com wrote:

The main use case for me was exploring which variables existed in the snapshot while preventing the list from becoming too long, as it does when simply filtering for foo*. I would first do foo, level=1, then foo|bar, level=1 to iteratively explore the variable tree, and then copy-paste the relevant variables into a list (don't judge)...

Given that no clear hierarchical structure is enforced in IAMC-style variables in general, fancier options, such as aggregating the total of foo|bar by writing foo|bar|*, level=0, are not going to work, I fear...
