thegreenwebfoundation / co2.js

An npm module for accessing the green web API, and estimating the carbon emissions from using digital services

The use of data as a proxy in the Sustainable Web Design model #138

Open mgifford opened 1 year ago

mgifford commented 1 year ago

Looking at: https://sustainablewebdesign.org/calculating-digital-emissions/ https://developers.thegreenwebfoundation.org/co2js/explainer/methodologies-for-calculating-website-carbon/#the-sustainable-web-design-model

It is clear that the model uses "data transfer as a proxy indicator for total resource usage", or, put a different way, "kWh/GB as the key metric on which to calculate the carbon footprint".

I am getting pushback on this. Is it more than page weight x kWh/GB?

Is it possible that Google Lighthouse provides a better tool, as load time clearly impacts energy consumption?

Are there other elements that could be calculated, such as heavy usage of JS? Static page evaluations are a challenge with pages that are dynamically updated, but I'm not sure how this could be evaluated.

I assume that there is no traceroute information being calculated here. It would be interesting to know how many hops a packet has likely taken, or indeed how many packets were shipped.

Is CPU usage evaluated?

fershad commented 1 year ago

@mgifford I feel the most important thing to get clear here is what CO2.js actually is. That will guide the rest of the conversation about this issue.

When it comes to generating carbon estimates, CO2.js is simply a wrapper around existing accepted models. There's no "secret sauce" in CO2.js, and no additional functionality that CO2.js adds around these models. That is to say, if a model used data transfer, time on site, and CPU usage as the inputs for calculating carbon emissions, then those are the inputs CO2.js would take. Currently, there's no such model - or at least no such model implemented in CO2.js.
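To make that concrete, here's a minimal sketch of the kind of wrapper CO2.js provides, assuming the current `@tgwf/co2` API (data transfer in bytes is the only input the implemented models take):

```js
// Minimal sketch, assuming the current @tgwf/co2 API.
import { co2 } from "@tgwf/co2";

const estimator = new co2(); // recent versions default to the Sustainable Web Design model
const bytes = 2 * 1024 * 1024; // a hypothetical 2 MB page
const grams = estimator.perByte(bytes); // estimated grams of CO2e for that transfer

console.log(grams);
```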

We would love to implement more methodologies into CO2.js to account for different content types, modern practices etc. For that, we (Green Web Foundation) rely on community contributions (issues, initial PRs, guidance) and/or additional funding which would allow us to allocate dedicated time to the CO2.js project.

With that outlined, here are some comments on the questions you've raised above.

Is it more than page weight x kWh/GB?

I think as an industry we are coming to that conclusion, although there's still a lot of research being done to paint a clearer picture.
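For reference, the naive calculation the question describes is just a multiplication. Here is a back-of-the-envelope sketch using the headline Sustainable Web Design figures of 0.81 kWh/GB and a global average grid intensity of roughly 442 g CO2e/kWh (the full model additionally apportions energy across data centres, networks, and user devices):

```js
// "Page weight x kWh/GB" back-of-the-envelope estimate.
// 0.81 and 442 are headline SWD figures; treat them as illustrative.
const KWH_PER_GB = 0.81; // energy per gigabyte transferred
const GRID_INTENSITY = 442; // grams of CO2e per kWh, global average

const pageWeightGB = 2 / 1024; // a 2 MB page, expressed in GB
const gramsCO2e = pageWeightGB * KWH_PER_GB * GRID_INTENSITY;

console.log(gramsCO2e.toFixed(2)); // ~0.70 g CO2e per page view
```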

The W3C Susty Web Working Group also has a Metrics, Analytics and Reporting sub-group that is looking into this. Their analysis, feedback, and recommendations might well feed into revisions to the Sustainable Web Design model, or lead to the creation of a new standard for calculating digital carbon emissions (no pressure team 😅).

Is it possible that Google Lighthouse provides a better tool, as load time clearly impacts energy consumption? Are there other elements that could be calculated, such as heavy usage of JS? I assume that there is no traceroute information being calculated here. Is CPU usage evaluated?

All these changes would need to be made at the model/methodology level before they could be implemented in CO2.js.

There are other tools that are better placed at present to generate carbon emissions estimates based on energy usage and other factors.

mgifford commented 1 year ago

The Greenframe model looks pretty close to what we were looking for: https://github.com/marmelab/greenframe-cli#which-factors-influence-the-carbon-footprint

the carbon footprint of a web page depends on:

Of which the page size might be a reasonable proxy for most sites. It's just a matter of either demonstrating this or highlighting the limitations.

Perhaps something like Greenframe is more accurate, but it takes a lot more resources to run. There are sometimes good reasons to run less accurate tools.

ldevernay commented 1 year ago

Even if the Greenframe model might be an improvement on the "data => CO2" approach, there are still limitations to this, as stated here: https://greenspector.com/en/dom-as-a-metric-for-monitoring-web-sobriety/

Counting elements on a page is not necessarily relevant, since it doesn't take into account the relative impact for each of these.

As stated by @mgifford, CPU should also be taken into account, for instance. And this is indeed one of the objectives for the W3C Susty web group focused on metrics. I agree that it's sometimes necessary to use less accurate tools but, in the long run, web professionals should reach a consensus on metrics and methodology.

BTW, I also have some thoughts on limiting ourselves to carbon (since it elides other impacts and might degrade other indicators, such as water consumption). I'm working on an article on this.

mgifford commented 1 year ago

Coming from the accessibility world, I keep coming back to "Progress over Perfection". 

The data model was progress. It was one measure we could build on. 

But like all models, they don't reflect reality. They never will. Improved models will just be a better reflection of reality. The map is not the terrain. 

Something like this might help: firefox-devtools/profiler#4599

For Intel-based chips, solutions using Intel's Power Gadget could be useful (on Windows and older Macs).

Ultimately I think the next step is to provide a weighting by mime type. 

In our studies, we're finding JS (and actually sites with large JSON files) to be particularly energy intensive. 
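To illustrate the shape of that idea (every weight below is invented purely for illustration; real values would have to come out of the kind of studies described here):

```js
// Hypothetical sketch: weight transferred bytes by MIME type before
// feeding them to an emissions model. All weights are invented.
const HYPOTHETICAL_WEIGHTS = {
  "application/javascript": 1.5, // JS/JSON observed as more energy intensive
  "application/json": 1.4,
  "image/": 0.9,
  "video/": 0.8,
};

// resources: [{ mimeType: "image/png", bytes: 12345 }, ...]
function weightedBytes(resources) {
  return resources.reduce((total, { mimeType, bytes }) => {
    const match = Object.keys(HYPOTHETICAL_WEIGHTS).find((prefix) =>
      mimeType.startsWith(prefix)
    );
    return total + bytes * (match ? HYPOTHETICAL_WEIGHTS[match] : 1);
  }, 0);
}
```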

We need to set this up, though, so that we're not evaluating hundreds of sites, but thousands, if not hundreds of thousands. And we should probably be doing this every 6 months, to make sure that as browsers and operating systems change we have new metrics to apply.

CO2.js will only ever be an approximation. We also likely don't want to substantially increase the CO2 emitted by using CO2.js by making the algorithm much more complicated.

ldevernay commented 1 year ago

Thanks @mgifford, these are truly great insights and perspectives! Let me know if I can be of any help, this is close to some considerations we have in Greenspector.

fershad commented 1 year ago

@mgifford love this chat.

CO2.js will only ever be an approximation. We also likely don't want to substantially increase the CO2 emitted by using CO2.js by making the algorithm much more complicated.

I just want to be clear about how we talk about, and think about, CO2.js.

CO2.js is not a methodology. It is not an algorithm or calculation in its own right. CO2.js is a library through which we want to bring together different digital carbon estimation methodologies, in a package that is easier for developers to adopt in their projects. Right now, CO2.js allows developers to use the Sustainable Web Design model or the OneByte model to calculate emissions. Adding other methodologies is possible, and if you have candidates to consider then please do raise issues to start that conversation. See this one from Chris as an example: #141.
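Choosing between the two currently implemented models looks like this (again, a sketch assuming the current `@tgwf/co2` API):

```js
import { co2 } from "@tgwf/co2";

// Pick the methodology when constructing the estimator.
const swd = new co2({ model: "swd" }); // Sustainable Web Design
const oneByte = new co2({ model: "1byte" }); // OneByte

const bytes = 500 * 1024; // 500 kB transferred
console.log(swd.perByte(bytes), oneByte.perByte(bytes));
```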

As mentioned previously:

We would love to implement more methodologies into CO2.js to account for different content types, modern practices etc. For that, we (Green Web Foundation) rely on community contributions (issues, initial PRs, guidance) and/or additional funding which would allow us to allocate dedicated time to the CO2.js project.

soulgalore commented 1 year ago

Hi @mgifford, the Mozilla devs have implemented a way to measure power consumption in Browsertime when you run tests on Android phones: https://github.com/sitespeedio/browsertime/blob/ef9b6ce2c916d79a54984751af2d16ef5b16ba02/lib/android/index.js#L465-L468 - though I haven't spent any time looking at the metrics from a web page perspective.

mrchrisadams commented 1 year ago

oh wow, @soulgalore - I had no idea it could do that now!

We did some work with Firefox before, working directly with the power usage figures it started exposing towards the end of last year, largely so the browser itself can self-report some of these numbers. You can see a little more below:

https://blog.nightly.mozilla.org/2023/01/24/new-year-new-updates-to-firefox-these-weeks-in-firefox-issue-130/

I had been wondering if these can be exposed to Browsertime somehow, but I'm a little less confident about how to do that, as it relies on some features in the Firefox profiler.

I'd be happy to discuss how to do this in future though, as directly reported figures, rather than modeled numbers, would be great to work with if possible.

soulgalore commented 1 year ago

@mrchrisadams ah cool, I didn't know that. It's easy to enable the Geckoprofiler log for Browsertime, but I'm not sure if you need some special settings for the work you've done? Today we do not post-process the profiler log, but if you or someone else helps me understand the log (how to get the data), it should be trivial for me to add that, so Browsertime can get those numbers automatically as long as the profiler is enabled.

mrchrisadams commented 1 year ago

hi Peter!

I think the work we do relies on the energy figures that are exposed by some lower-level C++ code that is part of the Geckoprofiler. However, I think the profiler format is slightly different. The docs say as much:

Profile data visualized in the Firefox Profiler is obtained from the Gecko Profiler, a C++ component inside of Gecko. The profiler.firefox.com web app assumes that the Gecko Profiler add-on will activate the Gecko Profiler, gather the data, and then provide it to the client. This document talks about the Gecko profile format, which is distinct from the format that profiler.firefox.com uses. The plan is to migrate the Gecko profile format closer and closer to profiler.firefox.com's desired processed profile format.

More below: https://github.com/firefox-devtools/profiler/blob/main/docs-developer/gecko-profile-format.md

I'll see who might know at Mozilla and can chime in, because assuming this can be exposed, you could end up with much more accurate numbers in automated tools like Sitespeed, WPT, and so on.

To my knowledge, you can't easily get to these numbers using other browsers like Edge / Chromium, etc. but I'd be happy to learn if there are options in future.

CarolineClr commented 1 year ago

Thanks for bringing Greenframe.io into the conversation! @fershad @mgifford 😃🙏🏻

We worked hard together with French research teams at CNRS to develop a scientifically sound and robust model. At the same time, since there is no scientific consensus on this topic, we are aware that our model is not perfect (like any other model at the moment😉).

We decided to make our model free and open source to ensure transparency and to allow everyone to adapt it to their needs.

CarolineClr commented 1 year ago

@mgifford We definitely share your view on "Progress over Perfection".👏 Instead of waiting for the "perfect" model with 100% correct metrics, we should start implementing green coding practices now and work with what we've got today.

mgifford commented 1 year ago

So from my understanding we need to be weighting JavaScript/JSON more, and maybe images/video less. All subject to change, but I think there's been a lot of performance work on big images/videos. You can't do that as easily for custom JavaScript.

It's more nuanced than that, but @CarolineClr do you see that as being the next step?

fzaninotto commented 1 year ago

Hi, GreenFrame sponsor here.

Even if the Greenframe model might be an improvement on the "data => CO2" approach, there are still limitations to this, as stated here: https://greenspector.com/en/dom-as-a-metric-for-monitoring-web-sobriety/

Counting elements on a page is not necessarily relevant, since it doesn't take into account the relative impact for each of these.

GreenFrame doesn't count DOM elements in a page. It counts the CPU, memory, network and disk I/O consumed by a browser when navigating a website. GreenFrame doesn't work at the browser level, but at the system level. So it doesn't depend on unproven correlations like DOM elements => CO2 emissions, but only on proven correlations (see the list of scientific papers used in the GreenFrame model).
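In general, that system-level approach amounts to multiplying each measured resource by an energy coefficient drawn from the literature. A hypothetical sketch of that shape (the coefficients below are invented placeholders, not GreenFrame's actual values):

```js
// Hypothetical resource-to-energy model: the general shape of a
// system-level estimator. All coefficients are placeholders.
const COEFFICIENTS = {
  cpuSeconds: 0.9, // Wh per CPU-second (invented)
  memoryGBSeconds: 0.001, // Wh per GB-second of memory held (invented)
  networkGB: 0.1, // Wh per GB transferred (invented)
  diskGB: 0.05, // Wh per GB of disk I/O (invented)
};

// usage: { cpuSeconds, memoryGBSeconds, networkGB, diskGB }
function energyWh(usage) {
  return Object.entries(COEFFICIENTS).reduce(
    (wh, [metric, coeff]) => wh + (usage[metric] || 0) * coeff,
    0
  );
}
```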

fershad commented 1 year ago

@CarolineClr and @fzaninotto thank you both for your input.

We would like CO2.js to grow to include more estimation models if possible. If you would like us to explore if Greenframe can fit into this, then please do raise an issue where that can be tracked. https://github.com/thegreenwebfoundation/co2.js/issues/new/choose

fzaninotto commented 1 year ago

If you would like us to explore if Greenframe can fit into this, then please do raise an issue where that can be tracked.

Done: https://github.com/thegreenwebfoundation/co2.js/issues/145

ceddlyburge commented 9 months ago

I've got good information by running the server and browser (via Cypress) in the Green Metrics Tool. They run everything on actual hardware and measure the energy, which is amazing, even if still not absolutely perfect (no accounting for network distance travelled, PUE, energy proportionality, etc.).

If you already have Cypress/Playwright tests it's really easy to do; it's basically the same as running everything in Docker Compose.

thibaudcolas commented 5 months ago

I thought I'd flag that the HTTP Archive's work on reviewing the sustainability of the web at large is due to resume soon, with the Sustainability chapter in the 2024 Web Almanac. It's a solid opportunity to trial different ways to measure website emissions at scale.

The HTTP Archive captures data from millions of pages, and they're looking for analysts as part of their Web Almanac report; there's an opportunity for analysts to capture additional custom metrics to support their reporting.

I'm not sure if it's possible to capture power measurements with their setup, but it's certainly worth exploring. And even failing that, if there's any other browser/page metric worth capturing, this is the right project to do so, as they have the infrastructure to run it over millions of sites on a monthly basis and make the data easily available (in BigQuery) for future research.
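As one example of the kind of in-page custom metric that infrastructure could run (a sketch using the standard Resource Timing API, not an existing Web Almanac metric):

```js
// Sketch of a browser-side custom metric: transferred bytes per
// resource type, collected via the standard Resource Timing API.
const bytesByType = {};
for (const entry of performance.getEntriesByType("resource")) {
  const type = entry.initiatorType || "other";
  bytesByType[type] = (bytesByType[type] || 0) + (entry.transferSize || 0);
}
console.log(JSON.stringify(bytesByType)); // e.g. {"script":512345,"img":204800,...}
```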

fershad commented 8 hours ago

@mgifford with the updated Sustainable Web Design Model now published, and the explainer page addressing this concern, are you okay for this issue to be closed in this repository? I think other places, like the Sustainable Web Design Model contact form or the W3C mailing list, might be better venues for these discussions, as they relate to specific models.