Closed · maxschulze closed this 1 year ago
I see some connections as well with the proposal from Adrian, and it makes sense. The question then is: what do we do meanwhile? Nothing, or still look for workarounds?
@plnoel You raise a fair point that by not building tools/workarounds you also stop contributing to the mission of increasing transparency.
For some people it might not be an option, or within their realm of possibilities, to increase pressure on cloud providers to make them publish real numbers. So working on some workarounds is better than nothing.
Still, @maxschulze raises a fair point, and from an outside-of-the-GSF point of view I have to admit that I find it debatable whether the GSF should focus energy on yet another abstraction project and thus channel developer and organizational energy into it.
The most valuable point I see in creating tools as workarounds is that they show developers and cloud providers that more sustainable software could be achieved if only the tooling were easier or the data were better. I would argue that for the former point we already have that: CloudCarbonFootprint, the Teads calculator, and estimation models like the SDIA's or the one we are building (https://www.green-coding.berlin/projects/cloud-energy/) already give you very easy tooling and show what is possible. The only thing that is missing is proper data.
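To illustrate how simple this class of tooling is (a hypothetical sketch, not the actual SDIA or cloud-energy model; the coefficients and the linear shape are placeholder assumptions, whereas real models fit curves to measurements such as SPECpower data):

```python
# Minimal utilization-based power estimate: linear interpolation between
# a machine's idle and full-load power draw. Both wattage figures here
# are made-up defaults, not real spec-sheet values.

def estimate_watts(cpu_utilization: float,
                   idle_watts: float = 50.0,
                   max_watts: float = 180.0) -> float:
    """Estimate instantaneous power draw from average CPU utilization."""
    if not 0.0 <= cpu_utilization <= 1.0:
        raise ValueError("utilization must be in [0, 1]")
    return idle_watts + (max_watts - idle_watts) * cpu_utilization

# Energy over one hour at 30% average utilization, in kWh:
kwh = estimate_watts(0.30) * 1.0 / 1000.0
```

The model itself is trivial; the hard part, as argued above, is getting trustworthy `idle_watts`/`max_watts` figures for the hardware your VM actually runs on.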
If you now put an abstraction layer on top of the already spotty and patchy data, you make the energy predictions worse and worse. In particular, the granularity imagined by @jawache (https://github.com/Green-Software-Foundation/carbon-ql/discussions/26) assumes that we have good data for every one of these services ... but we don't.
What we see in our measurements and the tools we develop is that estimating a machine's consumption from rough data about a "similar" machine just gives you bullshit data that is too far from reality. Making assumptions about the configurations the cloud providers actually run is yet another can of worms.
If you now glue that into an abstraction like carbon-ql, I would argue it is rather detrimental than helpful, because the data is so un-actionable.
This project is now evolving (see https://github.com/Green-Software-Foundation/carbon-ql/discussions/26) into something closer to a software framework to help people model, measure and then monitor the carbon emissions of software applications, end to end (cloud is just one component). It's not going to be an abstraction over other existing models, but instead a framework that provides a standard way to call and glue together data from different models, and a common set of libraries/services/components that work against that standard function signature.
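A standard function signature like the one described could be sketched roughly as follows (purely hypothetical shapes and names, not the actual carbon-ql design; the model classes and coefficients are toy placeholders):

```python
# Sketch: every model exposes the same calculate() signature, taking a
# list of observation dicts and returning enriched copies, so models
# from different sources can be chained into one pipeline.
from typing import Protocol


class ImpactModel(Protocol):
    def calculate(self, observations: list[dict]) -> list[dict]: ...


class CpuEnergyModel:
    """Toy model: derives energy-kwh from cpu-util (placeholder math)."""
    def calculate(self, observations):
        return [
            {**o, "energy-kwh": o["cpu-util"] * 0.001 * o["duration-h"]}
            for o in observations
        ]


class CarbonIntensityModel:
    """Toy model: multiplies energy by a fixed 450 gCO2/kWh intensity."""
    def calculate(self, observations):
        return [{**o, "carbon-g": o["energy-kwh"] * 450.0} for o in observations]


def pipeline(models: list, observations: list[dict]) -> list[dict]:
    """Glue: feed each model's output into the next one."""
    for model in models:
        observations = model.calculate(observations)
    return observations


result = pipeline(
    [CpuEnergyModel(), CarbonIntensityModel()],
    [{"cpu-util": 0.5, "duration-h": 1.0}],
)
```

The design choice here is that models never know about each other, only about the shared observation shape, which is what lets a framework swap data sources in and out.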
The main pain point/problem we are trying to resolve is the challenge people/orgs have in calculating carbon emissions for a specific application, end to end (so beyond just cloud). This is a big enough problem that multiple people from multiple orgs have come together to incubate a volunteer project. That's the beauty of open source and it's how the foundation works: if enough people care about the same problem, we incubate a project and give them space to work on solutions.
@maxschulze I think what you are proposing is some form of policy project to encourage more transparent data from cloud providers? I'm not sure what shape that would take, but I would be happy to ideate and help see if there is interest from others in getting involved in whatever you're thinking about. It's definitely not an either/or question, and it's a common concern.
@ArneTR I agree we have lots of gaps; a project like this will help surface them. If the data is bad/poor, then making it easier to surface and compare will drive better data quality.
We've developed an API endpoint for this at Climatiq, which allows you to:
We'd be keen to collaborate and develop this further. Base methodology is CCF + some Boavizta data.
@rogallp Let's talk more about how we can collaborate.
This is an attempt to abstract yet another time (where Scaphandre is already an abstraction of Intel RAPL, or SCI is an abstraction of WattTime + RAPL).
Should we really invest time again, or rather focus on building the APIs that should really exist?
Everything we are doing so far is a workaround for the fact that neither cloud providers, nor virtualization systems, nor OEM providers offer the APIs and transparency needed to determine the footprint of software.
Wouldn’t it make more sense to focus on getting those companies to act rather than building more software / more abstraction / more workarounds?
Maybe sometimes the answer is not “let’s build software to fix this” but “let’s talk to people, let’s engage” to get stuff done.
All it would take to fix this is a law that says (along the lines of) "any provider of virtual or physical computing devices must expose machine-readable APIs for customers to determine the life cycle impact of the device as well as its energy consumption" - why not focus on this?