IAMconsortium / pyam

Analysis & visualization of energy & climate scenarios
https://pyam-iamc.readthedocs.io/
Apache License 2.0
227 stars 119 forks

Metadata attribute #126

Open znicholls opened 5 years ago

znicholls commented 5 years ago

In the climate community, we often read data from netCDF files. These files contain a large amount of metadata (when the file was written, by whom, how, etc.). For SCMs, we often read these files in, take some average and then use the result for whatever we need.

Given the usefulness of the IamDataFrame, we would want to read them into that format. However, in such a dataframe, there is nowhere obvious to store all the metadata.

Would we consider adding a metadata attribute where we could store whatever we wanted (a dict of metadata, strings)? We wouldn't check it in any way; it would simply be a catch-all for users who need somewhere to store extra information. I don't think it belongs in meta, because 'who wrote the file' isn't something we want in such a meta dataframe.
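As a rough illustration of the proposal, here is a minimal sketch of a container that carries an unchecked, free-form metadata dict next to the timeseries data. The class name `FrameWithMetadata`, the `metadata` attribute and all values are hypothetical; this is not pyam's API, only one way such a catch-all could look.

```python
import pandas as pd


class FrameWithMetadata:
    """Hypothetical container: tabular data plus an unchecked metadata dict."""

    def __init__(self, data, metadata=None):
        self.data = pd.DataFrame(data)
        # Free-form and never validated: file history, author, model parameters, ...
        self.metadata = dict(metadata or {})


# Illustrative values only; none of these come from a real file.
frame = FrameWithMetadata(
    data={
        "model": ["model_a", "model_a"],
        "scenario": ["scen_1", "scen_1"],
        "year": [2020, 2030],
        "value": [1.0, 1.2],
    },
    metadata={"written_by": "example-user", "created": "2018-10-30"},
)
```

Because the dict lives outside the data table, hundreds of entries would not add any columns to the data or to a meta table.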

danielhuppmann commented 5 years ago

I don't see why adding such attributes to the meta table would be problematic. In particular, if you think about reading data from multiple sources (files) using append(), it would make sense to me to be able to keep track of the source there. That would immediately offer the option to filter a merged IamDataFrame back to its components using df.filter(source=file_name) or similar (assuming that the filename is added as a meta column during import).
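The idea of tagging each import with its source and filtering on it later can be sketched with plain pandas. This is only an analogy for the meta table, not pyam's implementation; the helper `load_with_source` and the file names are made up for illustration.

```python
import pandas as pd


def load_with_source(rows, source):
    """Build a tiny meta-like table and tag every row with its source file."""
    meta = pd.DataFrame(rows).set_index(["model", "scenario"])
    meta["source"] = source  # hypothetical meta column added during import
    return meta


meta_a = load_with_source([{"model": "model_a", "scenario": "scen_1"}], "file_a.nc")
meta_b = load_with_source([{"model": "model_b", "scenario": "scen_2"}], "file_b.nc")

# Analogous to appending one frame to another.
merged = pd.concat([meta_a, meta_b])

# Analogous to filtering the merged frame back to one of its components.
only_a = merged[merged["source"] == "file_a.nc"]
```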

znicholls commented 5 years ago

If you think it's possible, that would be great. I'm just concerned about carrying around 50 columns of metadata (or more: MAGICC has 200 parameters that we would ideally store in this attribute), but maybe that concern is unfounded?

znicholls commented 5 years ago

@gidden @danielhuppmann a little nudge on this one

gidden commented 5 years ago

Hey @znicholls, @rgieseke and I were discussing this during IAMC and had a thought. Given that there are massive metadata requirements (and perhaps also gridded data!), would it make more sense to develop a class for OpenSCM data that is based on an xarray.Dataset? We could develop an interface similar to pyam.IamDataFrame and/or just provide to/from utilities to go back and forth.
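To make the suggestion concrete, here is a minimal sketch of an xarray.Dataset that carries free-form metadata in its `attrs`, with two small conversion helpers to and from a long-format pandas table. The helper names `to_long_frame` and `from_long_frame`, the variable name and all values are assumptions for illustration. The attributes are passed across explicitly because a plain DataFrame has nowhere obvious to keep them.

```python
import numpy as np
import xarray as xr

# Illustrative dataset: values and attributes are invented.
ds = xr.Dataset(
    data_vars={
        "temperature": (("scenario", "year"), np.array([[1.0, 1.2], [1.1, 1.5]])),
    },
    coords={"scenario": ["scen_1", "scen_2"], "year": [2020, 2030]},
    attrs={"written_by": "example-user", "history": "illustrative values only"},
)


def to_long_frame(dataset):
    """Flatten the dataset into a long-format table (one row per data point)."""
    return dataset.to_dataframe().reset_index()


def from_long_frame(frame, attrs):
    """Rebuild a dataset from the long table and re-attach the metadata."""
    out = xr.Dataset.from_dataframe(frame.set_index(["scenario", "year"]))
    out.attrs.update(attrs)
    return out


frame = to_long_frame(ds)
roundtrip = from_long_frame(frame, ds.attrs)
```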

gidden commented 5 years ago

@rgieseke and I also thought it would be a good idea to have a short monthly call to discuss efforts in slightly more detail. Maybe this would be a good option for the first one?

znicholls commented 5 years ago

I think so. That would basically settle the question of creating a superclass of IamDataFrame which OpenSCMDataFrame could then subclass, right? If yes, I'll put tidying up those MRs on the list, to start making room for the changes and to give us a set of tests that help us keep track of where things are as we switch the backend.

znicholls commented 5 years ago

Yep sounds good. I'm assuming we want to try and get one in before Christmas?

gidden commented 5 years ago

Would be happy to. I can try to make a doodle poll, but if I dally for too long, feel free to jump in =)
