IAMconsortium / pyam

Analysis & visualization of energy & climate scenarios
https://pyam-iamc.readthedocs.io/
Apache License 2.0
227 stars 118 forks source link

Next iteration at a fast format_data #727

Closed coroa closed 1 year ago

coroa commented 1 year ago

Sits on top of #726. I, at first, added it against @gidden's repo (https://github.com/gidden/pyam/pull/9), but that is not visible enough i think. I am fine with closing that.

Refactoring of the format_data function with internal fast-path.

Basically, if one passes a dataframe or series with a multi-index which includes index and all required columns it will avoid reset_indexing it.

@gidden Not sure, how to create those profile pictures. Would be nice to include this variant!

Please confirm that this PR has done the following:

Adding to RELEASE_NOTES.md (remove section after adding to RELEASE_NOTES.md)

Please add a single line in the release notes similar to the following:

- (#XX)[http://link-to-pr.com] Added feature which does something

Description of PR

Please describe the changes introduced by this PR.

coroa commented 1 year ago

CI failures are that plot dimensions seem to be a few pixels off between the 3.9 runs and the now 3.8 runs. (I suppose, we should revert the switch to 3.8).

danielhuppmann commented 1 year ago

Thanks @coroa & @gidden for taking this forward.

Two comments:

  1. There is a reasonably well documented profiling-module in pyam which includes cpu time and memory usage - I think it would be much more efficient to build on top of that rather than reinventing the wheel. See the ReadMe.
  2. I'm not convinced that all the extra effort for a fast-pass (also maintenance going forward) is worth the effort if we get a threefold performance improvement simply by switching to stack in the current functions and not converting Series to Frames. I'd feel more comfortable tackling that in two separate PRs (and I'm happy to take the lead in the first one).
coroa commented 1 year ago

Closing in favour of #729 together with #730 and #731.