IAMconsortium / pyam

Analysis & visualization of energy & climate scenarios
https://pyam-iamc.readthedocs.io/
Apache License 2.0

Split core from full package? #748

Open znicholls opened 1 year ago

znicholls commented 1 year ago

Pyam is great, but it is super restrictive with its supported versions of dependencies. For users, this is great. For anyone who wants to use it as a dependency of another package, it's really painful because upgrading pyam causes all sorts of other things to update which results in weird and wonderful bugs (see e.g. our fun in https://github.com/iiasa/climate-assessment/pull/36).

One way out of this would be to split out a pyam-core package (call it whatever) that has the key bits of functionality, without all the package pinning (and logging mangling). Pyam would continue to exist and support direct users. However, the split would allow pyam-core to be used as a dependency without the dependency headaches that the current approach leads to.

@phackstock would be interested in your thoughts on this given the pain you've gone through with climate-assessment. cc @lewisjared and @jkikstra

danielhuppmann commented 1 year ago

Three questions:

znicholls commented 1 year ago

> which dependencies specifically are overly restrictive?

It varies over time. Of the current list, matplotlib < 3.7.1 and numpy < 1.24 could bite if you're in the wrong project. Python < 3.12 is also painful if you want to explore in the latest Python version. Previously pint and pandas have been pinned which has caused headaches downstream.
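To make the conflict concrete, here is a minimal sketch of how an upper bound such as numpy < 1.24 can leave no installable version once another package needs something newer. The helper names (`parse`, `allowed`), the downstream requirement numpy >= 1.24 and the candidate list are all assumptions for illustration; real resolvers such as pip handle far more cases than this simple tuple comparison.

```python
def parse(version):
    """Turn '1.24.0' into (1, 24, 0) so versions compare as tuples."""
    return tuple(int(part) for part in version.split("."))


def allowed(version, lower=None, upper=None):
    """Check lower <= version < upper; a bound of None is not enforced."""
    v = parse(version)
    if lower is not None and v < parse(lower):
        return False
    if upper is not None and v >= parse(upper):
        return False
    return True


# Upper bound quoted in this thread for pyam: numpy < 1.24
PYAM_NUMPY = {"upper": "1.24"}
# Hypothetical downstream package that needs numpy >= 1.24
OTHER_NUMPY = {"lower": "1.24"}

# Illustrative candidate versions, not an exhaustive release list
candidates = ["1.22.4", "1.23.5", "1.24.0", "1.26.4"]

compatible = [
    v for v in candidates
    if allowed(v, **PYAM_NUMPY) and allowed(v, **OTHER_NUMPY)
]
print(compatible)  # empty list: no candidate satisfies both constraints
```

With an unpinned core, only the second constraint would apply and the newer candidates would remain installable.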

> which parts would be part of the core vs. the full package?

As a start: core: IamDataFrame and everything it depends on. Full package (including logging hacks etc.): everything else
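As a rough sketch of what "the core and everything it depends on" could mean in practice, the core could import heavier libraries only when a feature actually needs them, so installing the core would not pull them in. This is an assumption for illustration, not something pyam is confirmed to do; `require_optional` and `plot_line` are hypothetical names.

```python
import importlib


def require_optional(module_name, feature):
    """Import an optional dependency, or explain which feature needs it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for {feature}; "
            "install the full package to use it"
        ) from exc


def plot_line(values):
    """Hypothetical plotting helper that lives outside the core."""
    # matplotlib is imported only when plotting is actually requested,
    # so the core itself never needs it at import time
    plt = require_optional("matplotlib.pyplot", "plotting")
    fig, ax = plt.subplots()
    ax.plot(values)
    return ax


# The guard is transparent for modules that are available
json_module = require_optional("json", "demo")
print(json_module.dumps([1, 2, 3]))
```

The trade-off is that a missing dependency shows up at call time rather than at install time, which is the kind of behaviour the "full" package would avoid by declaring and pinning everything.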

> who would have to shoulder the additional maintenance effort after the split?

The pyam maintainers. That's obviously a downside, but I think the upsides are:

  1. You reduce the headaches of downstream users, who often either a) come back to you asking for help, fixes or new releases so they can just try this other thing or whatever, or b) are you yourselves (e.g. Philip's pain over the last few weeks)
  2. You allow greater extensibility of pyam for downstream users because they just get the key features and algorithms without the stuff they then have to hack around themselves (e.g. package pinning and logging hacks)
  3. Maintenance of each package becomes simpler because their scopes are more obvious. If you maintain the core, you don't need to worry about user-facing helper features like the logging hacks. If you maintain the user package, you don't need to worry about the core because it 'just works'. You can also tell users 'the core is tested like this, but we deliberately don't pin, hence installation is not guaranteed, unlike the user package, where we pin everything to avoid headaches for users', so hopefully there are fewer installation explosions for you to help people through.