[Project] Leveraging Large Language Models for Emissions Estimation

srini1978 commented 1 month ago

Working Title: Leveraging Large Language Models for Emissions Estimation

Related issues or discussions:

https://github.com/Green-Software-Foundation/greenAI/issues/3
https://github.com/Green-Software-Foundation/gaic/issues/4

Tagline: A very short description of the project MAX 8 WORDS

Abstract: Today, emissions from software are typically calculated after deployment, using tools such as Cloud Carbon Footprint and Impact Framework. However, design choices made during the requirements-gathering and software design phases greatly affect carbon outcomes. These choices are made upstream in the software development lifecycle, long before any code is written, and they are already captured in design specification documents and architecture artifacts. This project proposes using Large Language Models (LLMs) to proactively predict emissions from these design choices, enabling greener software design.

Quote: Include made up quote from a business leader.

Audience: The primary audience is system designers and architects. When designing software systems, they weigh several technical design decisions against alternative options and evaluate the pros and cons of each. Giving them carbon emission estimates for those choices equips them to make informed decisions.

ToC: How will this project support one or more of our ToC pillars e.g. It will improve tooling by “xxxxx”.

Governance:

[ ] Community
[x] Open Source
[ ] Policy
[ ] Standards

Operating Model: Will this project operate based on:

[ ] Consensus - The goal is for everyone to agree to every change, so that we speak with one voice when the deliverable is released.
[x] Maintainers - The Project Leads listen to feedback and incorporate it back into the project if they see fit.

Problem: 1) Today there is no explicit guidance for evaluating the carbon impact of software design options. This is a missed opportunity, since design is perhaps the most critical stage: the choices architects make here influence the system's overall operational carbon emissions. Green Software patterns and principles do exist, but they are mostly consulted manually.
2) Once the system is operational, it is rarely possible to go back and change the design, and the cost of such changes is high.

Solution: 1) Large Language Models could be used to draw on multiple sources of information.
2) The models can be trained on a large body of design-related data, so that they can understand and generate text about design concepts, specifications, and requirements.
3) The models could also be connected to a repository of green software patterns and practices, so that they can infer green system design practices.
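
To make point 3 concrete, here is a minimal sketch of how a design excerpt could be matched against a pattern repository before anything is sent to a model. Everything in it is an assumption made for illustration: the `PATTERNS` entries are placeholders and are not quoted from the Green Software Patterns catalog, `top_patterns` and `build_prompt` are hypothetical helpers, and plain word overlap stands in for whatever retrieval method the project eventually chooses. No model is called; the sketch only assembles the prompt text.

```python
import re

# Placeholder pattern entries, invented for illustration only.
PATTERNS = {
    "cache-static-data": "Cache static data close to the user to avoid repeated network transfer",
    "batch-background-jobs": "Batch background jobs and schedule them when grid carbon intensity is low",
    "right-size-compute": "Right size compute instances to match the expected load",
}


def tokens(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def top_patterns(design_text, patterns, k=2):
    """Return the names of the k patterns sharing the most words with the design text."""
    design = tokens(design_text)
    scored = sorted(
        patterns.items(),
        key=lambda item: len(design & tokens(item[1])),
        reverse=True,
    )
    return [name for name, _ in scored[:k]]


def build_prompt(design_text, patterns, k=2):
    """Assemble a prompt that pairs the design excerpt with the retrieved patterns."""
    lines = ["Design excerpt:", design_text, "", "Possibly relevant green patterns:"]
    for name in top_patterns(design_text, patterns, k):
        lines.append(f"- {name}: {patterns[name]}")
    lines.append("")
    lines.append("Which design choices above are likely to raise emissions, and why?")
    return "\n".join(lines)


design = (
    "Nightly background jobs recompute reports; "
    "static assets are served from the origin on every request."
)
prompt = build_prompt(design, PATTERNS)
print(prompt)
```

Word overlap is crude (common words such as "the" count as matches), so a real implementation would likely need stop-word filtering or embedding-based retrieval; the point here is only the shape of the pipeline: design text in, relevant patterns retrieved, prompt out.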

Potential work: model identification, model training, specifications for LLM usage, integration with the Software Carbon Intensity (SCI) specification, integration with Impact Framework
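
As a rough illustration of what the SCI integration could look like, the sketch below compares two design options using the SCI formula, SCI = ((E × I) + M) per R, where E is energy consumed, I is grid carbon intensity, M is embodied emissions, and R is the functional unit. The option names and every number are made-up assumptions, and in this project the energy figures would come from a model's estimate rather than being hard-coded.

```python
from dataclasses import dataclass


@dataclass
class DesignOption:
    name: str
    energy_kwh: float        # E: estimated energy over the period (assumed value)
    grid_intensity: float    # I: gCO2e per kWh (assumed value)
    embodied_g: float        # M: embodied emissions allocated to the period, in gCO2e
    functional_units: float  # R: e.g. number of requests served in the period


def sci(option):
    """Return SCI in gCO2e per functional unit: ((E * I) + M) / R."""
    operational = option.energy_kwh * option.grid_intensity
    return (operational + option.embodied_g) / option.functional_units


# Two hypothetical design alternatives for the same workload.
options = [
    DesignOption("sync-polling", 12.0, 400.0, 1200.0, 1_000_000),
    DesignOption("event-driven", 7.5, 400.0, 1200.0, 1_000_000),
]

# Lowest SCI first, so the greener option is at the top.
ranked = sorted(options, key=sci)
for option in ranked:
    print(f"{option.name}: {sci(option):.4f} gCO2e per request")
```

Scoring every option with the same functional unit is what makes the comparison meaningful; an LLM-based estimator would only need to supply the per-option inputs.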

Closure: How do we know that the project succeeded? This has to be measurable if possible. Make references to successor projects, if any.

FAQ: Add anything here that doesn't fit into the above sections. It can be blank, to begin with; as questions are asked and points clarified use this section to document those clarifications.

srini1978 commented 1 month ago

@Sophietn @navveenb FYI

jawache commented 2 weeks ago

We had a chat about this in today's OSWG call and discussed perhaps structuring this as a paper that describes HOW to create such a model. That kind of foundational work could then inform others in this space and be implemented.

srini1978 commented 2 weeks ago

@navveenb FYI. Let us discuss this and include folks who would be interested in contributing to this paper.

Green-Software-Foundation / opensource-wg

[Project] Leveraging Large Language Models for Emissions Estimation #126