Green-Software-Foundation / scer

Software Carbon Efficiency Rating

2024.06.26 #70

Open seanmcilroy29 opened 1 month ago

seanmcilroy29 commented 1 month ago

2024.06.26 Agenda/Minutes


Time: Bi-weekly @ 1700 (BST) - See the time in your timezone

Antitrust Policy

Joint Development Foundation meetings may involve participation by industry competitors, and the Joint Development Foundation intends to conduct all of its activities in accordance with applicable antitrust and competition laws. It is, therefore, extremely important that attendees adhere to meeting agendas and be aware of and not participate in any activities that are prohibited under applicable US state, federal or foreign antitrust and competition laws.

If you have questions about these matters, please contact your company counsel or counsel to the Joint Development Foundation, DLA Piper.

Recordings

WG agreed to record all Meetings. This meeting recording will be available until the next scheduled meeting. Meeting recording link

Roll Call

Please add 'Attended' to this issue during the meeting to denote attendance.

Any untracked attendees will be added by the GSF team below:

Agenda

Introductions

Standard update submissions

Project Dashboard

Future meeting Agenda submissions

Previous meeting AIs (action items)

Action Items

Next Meeting

Adjourn

Meeting Action Items / Standing Agenda / Future Agenda submissions

chrisxie-fw commented 1 week ago

attended

metacosm commented 1 week ago

attended

adityamanglik commented 1 week ago

attended

seanmcilroy29 commented 1 week ago

MoM

Chris opens the meeting at 1700 (BST)

Rating systems for AI models, including concerns about lack of transparency and standardization. Chris discusses issues with the rating system and advocates for transparency and disclosure. The group discussed the lack of disclosure in rating systems, using examples from Nutriscore and Hugging Face. Chris emphasizes the importance of a standardized framework for rating AI language models.

Standardizing the AI model benchmarking process. Chris outlines four steps for evaluating large language models, including software categorization and benchmarking. Tamar and the group discuss the number of tokens per kilowatt hour in the context of AI model benchmarking. Chris clarifies that the standard does not define algorithms but provides a four-step framework.
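The tokens-per-kilowatt-hour metric discussed above can be illustrated with a short sketch. The function name and the numbers in the example are hypothetical, chosen only to show how the metric is computed; they are not defined by the SCER standard or taken from the meeting:

```python
def tokens_per_kwh(tokens_generated: int, energy_wh: float) -> float:
    """Energy efficiency of an LLM benchmark run, expressed as
    tokens produced per kilowatt-hour of electricity consumed.

    tokens_generated: total output tokens in the run
    energy_wh: measured energy draw for the run, in watt-hours
    """
    kwh = energy_wh / 1000.0
    return tokens_generated / kwh

# Hypothetical benchmark run: 2,000,000 tokens generated while the
# serving hardware drew 500 Wh (0.5 kWh) of energy.
print(tokens_per_kwh(2_000_000, 500))  # 4000000.0 tokens/kWh
```

A higher value means the model produced more output per unit of energy, which is the direction of comparison the group was discussing.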

Defining and benchmarking large language models. Chris discusses the importance of defining a "profile" for language models, including model size, application type, and performance metrics. He also stresses the importance of defining a base class for publicizing ratings of large language models, and discusses using a base software framework for AI models, with music streaming and healthcare examples.

Standardizing benchmarking for AI models to compare their efficiency. Tamar and the group discuss the challenge of defining categories and benchmarks for software evaluation. Chris proposes a framework for standardizing the evaluation process, with four common steps across all software. The group discussed how to use a shared standard process to measure emissions per request. Chris explains how to assess the efficiency of AI models using a rating system based on requests.

Standardizing carbon efficiency ratings for AI models. Chris outlines how Salesforce uses a community-driven standard, SCER, to assess the carbon footprint of their services for customers. The group discussed concerns about the rating system's ability to compare companies offering large language models based on carbon efficiency. Chris notes the company aims to create industry-wide standards for language models. The group discussed the goal of having standardized tests for language models across companies. Chris proposes defining a carbon-efficiency rating for large language models, open to other software categories. A member asks about focusing on LLM standards first and potentially expanding to other software categories.
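A request-based carbon-efficiency rating of the kind discussed above (with Nutriscore cited earlier as an analogy) could be sketched as a mapping from per-request emissions to a letter band. The thresholds and function name below are invented purely for illustration; SCER does not define these values:

```python
def carbon_rating(g_co2e_per_request: float) -> str:
    """Map grams of CO2-equivalent per request to a Nutriscore-style
    letter band (A = most carbon-efficient, E = least).

    NOTE: these band thresholds are hypothetical placeholders, not
    values from the SCER standard.
    """
    bands = [(0.1, "A"), (0.5, "B"), (1.0, "C"), (2.0, "D")]
    for threshold, letter in bands:
        if g_co2e_per_request <= threshold:
            return letter
    return "E"

# Two hypothetical LLM services compared on the same benchmark:
print(carbon_rating(0.3))  # B
print(carbon_rating(3.5))  # E
```

Because the rating is anchored to a shared unit of work (a request against a common benchmark), two companies' services can be compared directly, which is the comparability concern the group raised.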

AI standards and frameworks for software development. Tamar and the group discussed the complexity of AI and its unique challenges compared to other software products. The group explores defining categories for AI, including depth-first and breadth-first approaches, and the need for experts in specific areas to define standards. Chris discusses default implementations and disclosure in AI models. Tamar shares updates on the green AI committee and its intersection with reinventions.

Action Items