Green-Software-Foundation / hack

Carbon Hack 24 - The annual hackathon from the Green Software Foundation
https://grnsft.org/hack/github

Synapse-Carbon-Extractor #87

Open krishnabnv opened 3 months ago

krishnabnv commented 3 months ago

Prize category

Best Plugin

Overview

The goal of the Synapse-Carbon-Extractor project is to bring utilization metrics from Azure Synapse Spark clusters, SQL databases, and data processing pipelines into the Impact Framework so that they can be used to calculate carbon intensity metrics. By measuring the energy consumption associated with data processing tasks, the project aims to provide insight into the environmental impact of cloud-based data processing and help organizations optimize their workflows for sustainability.

Questions to be answered

No response

Have you got a project team yet?

Yes and we aren't recruiting

Project team

No response

Terms of Participation

Kiran-G1 commented 2 months ago

Azure Synapse Carbon Extractor

Summary

Our project aims to quantify the carbon emissions generated by big data processing workloads running on the Azure Synapse Analytics platform. The solution retrieves infrastructure usage metrics from Synapse and uses them to compute carbon emissions. It is packaged as a plugin for the Green Software Foundation's (GSF) Impact Framework, which lets users calculate the total energy consumed by the software system (E) as part of their sustainability efforts.
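For context, E here appears to refer to the energy term in the GSF's Software Carbon Intensity (SCI) specification. The formula below is quoted from that specification, not from this submission, and is included only to show where the plugin's output would fit:

```latex
\mathrm{SCI} = \big( (E \times I) + M \big) \ \text{per} \ R
```

In the specification, E is the energy consumed by the software system (kWh), I is the location-based marginal carbon intensity (gCO2eq/kWh), M is the embodied emissions of the hardware, and R is the functional unit. Under this reading, the plugin supplies the usage data from which E is derived for Synapse workloads.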

Problems

Not being able to measure the carbon footprint of big data workloads creates several challenges. Firstly, it leads to inefficiencies in resource allocation and energy usage, increasing operational costs. Secondly, organizations may struggle to meet sustainability targets without clear metrics, risking reputational damage and loss of stakeholder trust. Additionally, a lack of awareness of environmental impact hampers proactive mitigation efforts, exacerbating environmental harm. Overall, the absence of carbon footprint measurement for big data workloads inhibits cost savings, sustainability efforts, regulatory adherence, and environmental responsibility, posing significant risks to organizations.

There are also notable gaps in existing approaches: no tools exist to collect Synapse usage metrics and integrate them into comprehensive assessments of energy consumption for big data workloads. As a result, gathering and analyzing Synapse metrics is a manual and disjointed process, which hampers organizations' ability to make data-driven decisions for sustainability.

Application

Our solution brings Synapse metrics from Azure Log Analytics into the Impact Framework so that the environmental impact of software systems can be quantified. The SynapseCarbonExtractor class is the central component: it queries Synapse metrics and feeds them into the energy consumption calculations within the Impact Framework. The queries are issued through the Azure SDKs, which we rely on for query performance and scalability.
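The flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the team's actual synapsecarbonextractor.ts: the type definitions are local stand-ins for the Impact Framework plugin types, and `UsageRow`, `MetricsQuery`, `mergeUsage`, `SynapseCarbonExtractorSketch`, and the `cpu/utilization` output key are all hypothetical names chosen for the example. The Log Analytics query is injected as a function so the sketch runs without Azure credentials.

```typescript
// Sketch only: every name here is hypothetical and not taken from the
// team's plugin source.

// Local stand-ins for Impact Framework plugin types (assumed shape).
type PluginParams = Record<string, unknown>;

interface PluginInterface {
  metadata: { kind: string };
  execute(inputs: PluginParams[]): Promise<PluginParams[]>;
}

// One row of Synapse usage data as it might come back from Log Analytics.
interface UsageRow {
  timestamp: string; // ISO 8601 start of the sample window
  cpuUtilization: number; // percent, 0-100 (assumed metric)
}

// Injected query function. A real implementation could wrap an Azure
// Log Analytics query client here; that wiring is left out on purpose.
type MetricsQuery = (
  start: string,
  durationSeconds: number
) => Promise<UsageRow[]>;

// Pure step: average the rows for one observation window and attach the
// result to a copy of the input. With no rows, the input is copied as-is.
function mergeUsage(input: PluginParams, rows: UsageRow[]): PluginParams {
  if (rows.length === 0) {
    return { ...input };
  }
  const total = rows.reduce((sum, row) => sum + row.cpuUtilization, 0);
  return { ...input, 'cpu/utilization': total / rows.length };
}

// Plugin factory: for each input, query usage for its window and merge it.
function SynapseCarbonExtractorSketch(query: MetricsQuery): PluginInterface {
  return {
    metadata: { kind: 'execute' },
    async execute(inputs: PluginParams[]): Promise<PluginParams[]> {
      const outputs: PluginParams[] = [];
      for (const input of inputs) {
        const rows = await query(
          String(input['timestamp']),
          Number(input['duration'])
        );
        outputs.push(mergeUsage(input, rows));
      }
      return outputs;
    },
  };
}
```

Keeping the merge step as a pure function separate from the query makes it possible to test the metric handling without a Log Analytics workspace, and leaves the conversion from utilization to energy to downstream plugins in the pipeline.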

Through automating data collection processes, our solution offers centralized monitoring and analysis capabilities, empowering users with actionable insights into the environmental footprint of software systems. This innovative approach fosters sustainable development practices within the Green Software Foundation ecosystem, aligning with industry standards and best practices.

Recognizing the critical need to quantify and mitigate the environmental impact of software systems, our project addresses challenges exacerbated by the absence of standardized methodologies for measuring energy usage. By bridging these gaps and offering a streamlined solution, we provide organizations with the tools and knowledge to make informed decisions towards reducing environmental impact and promoting sustainable development practices.

Prize category

Best Plugin

Judging criteria

Overall Impact: Our project significantly contributes to sustainability by integrating Synapse metrics into the Impact Framework, enabling precise measurement of software system energy consumption. This promotes informed decisions to reduce environmental footprint, potentially fostering widespread adoption of sustainable development practices.

Opportunity: Our solution extends the Impact Framework's utility by integrating with Azure Log Analytics workspaces. Because Log Analytics can itself ingest data from various other cloud platforms, this broadens the applicability of IF.

Modularity: Our solution strictly adheres to the modular architecture philosophy by focusing solely on integrating Synapse metrics from Log Analytics. Its design ensures compatibility with other model plugins, enhancing interoperability and scalability within the ecosystem. This approach underscores our commitment to the Unix philosophy, promoting a cohesive environment for measuring and mitigating software system environmental impact.

Video

https://www.youtube.com/watch?v=UiWqQbWCQ8E

Artifacts

if-unofficial-plugins/src/lib/synapse_carbon_extractor/synapsecarbonextractor.ts at main · Kiran-G1/if-unofficial-plugins (github.com)

Usage

if-unofficial-plugins/src/lib/synapse_carbon_extractor/readme.md at main · Kiran-G1/if-unofficial-plugins (github.com)

Process

We developed our solution by initially identifying the need for quantifying software system energy consumption within the Green Software Foundation framework. Leveraging tools like Azure Log Analytics for Synapse metric storage and the Impact Framework for energy consumption calculations, we followed an agile approach, iterating on the solution based on feedback. Collaboration among team members with expertise in Azure services, software development, and sustainability principles was crucial. Our timeline consisted of iterative cycles, regularly assessing progress, and adjusting priorities. Data points included Synapse metrics from Azure Log Analytics and energy consumption calculations within the Impact Framework. Overall, our development process was collaborative, iterative, and focused on efficiently meeting project objectives.

Inspiration

Our solution is driven by the imperative to raise developers' awareness regarding the carbon emissions generated by their Spark jobs, thereby encouraging the creation of more efficient and sustainable data pipelines.

Aligned with the Foundation's commitment to advancing sustainable software development practices, we have engineered a solution to seamlessly integrate Synapse metrics with the Impact Framework. Our overarching objective is to equip organizations with the means to accurately measure and mitigate the environmental footprint of their software systems, thereby cultivating a culture of environmental responsibility within the software industry.

Challenges

Identifying which Synapse usage metrics were relevant proved somewhat challenging, and we had to reach out to the Synapse product team for clarification.

Accomplishments

We successfully integrated Synapse metrics into the Impact Framework, streamlining the process of quantifying and mitigating software system energy consumption for the Green Software Foundation. Our solution follows a modular and scalable approach, aligning with the Unix philosophy for enhanced interoperability.

Learnings

The project deepened our expertise in Azure services, Synapse's internal infrastructure usage, sustainability practices, and agile methodologies. We also sharpened our collaboration skills through interdisciplinary teamwork, with an emphasis on adapting to evolving requirements.

What's next?

The next step involves standardizing and deploying the plugin within the development environment of the data platform. This ensures developers are aware of the carbon footprint associated with their data pipelines. By integrating the plugin into the development workflow, developers gain visibility into the environmental impact of their code. This fosters a culture of sustainability and empowers developers to make informed decisions to optimize energy consumption during pipeline development. Additionally, feedback mechanisms can be established to continually improve the plugin's effectiveness and usability, promoting long-term sustainability practices within the organization.