lf-energy / tac

LF Energy TAC meeting information and processes
https://tac.lfenergy.org/
Creative Commons Attribution 4.0 International

GridFM #260

Open alexparisot opened 1 month ago

alexparisot commented 1 month ago

Mission Statement

Enable the emergence of foundation models for electrical grids.

Description

This project aims to develop foundation models for electrical grids (GridFMs). Foundation Models (FMs), pre-trained on large data sets and adapted to a broad set of applications, are revolutionizing the field of Artificial Intelligence (AI). Powerful FMs for language and weather have recently emerged, proving that such models can be developed for complex systems. FMs for the electric power grid have similarly been conceptualized to be trained on grid data (cf. Hendrik F. Hamann, “A Perspective on Foundation Models for the Electric Power Grid”, submitted to Joule, Cell Press, July 2024).

GridFMs have the potential to cope with the increasing complexity and uncertainties stemming from the energy transition by providing a significant speed-up in computation. A key benefit of FMs is that stakeholders can fine-tune a pre-trained model for specific applications, using their own proprietary data, in a scalable and economical way. This makes the FM approach ideal for unifying data, technology, and industry expertise towards a common goal.

GridFMs were presented by IBM Research at two workshops, in Yorktown Heights in February 2024 and at University College London in June 2024, to energy actors from Europe, Canada and the US: utilities, regulators, National Laboratories and the US Department of Energy. The goal of these workshops was to create a collaborative research effort in support of emerging GridFMs. Following these two workshops, the scientific and technological vision of participating members of the GridFM community was further refined, which led to the structure of the GridFM project presented in this proposal.

We identified creating an FM as a three-step process, as shown in Figure 1:

  1. Pre-training: Develop a general-purpose FM through self-supervised pre-training with a suitable architecture, often a transformer-based autoencoder. In this step the model reconstructs masked parts of the input data.
  2. Fine-tuning: The pre-trained FM is customized for various downstream tasks and applications with minimal labeled data by replacing the existing decoder with a task-specific decoder head and training for a few iterations.
  3. Inference: The fine-tuned FM is deployed to allow users to request predictions at low computational cost.
GridFM TAC figure 1

Figure 1 FM life-cycle phases: i) pretraining, ii) fine-tuning, iii) inference
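The masked-reconstruction objective of the pre-training step can be illustrated with a minimal, framework-free sketch. The helper names `mask_features` and `reconstruction_error`, the mask value of 0.0, the 50% mask ratio, and the toy numbers are all assumptions made for illustration; the proposal does not specify them.

```python
import random

def mask_features(features, mask_ratio, rng, mask_value=0.0):
    """Hide a random subset of entries; return the masked copy and the hidden positions."""
    positions = [(i, j) for i, row in enumerate(features) for j in range(len(row))]
    n_masked = round(mask_ratio * len(positions))
    hidden = set(rng.sample(positions, n_masked))
    masked = [
        [mask_value if (i, j) in hidden else value for j, value in enumerate(row)]
        for i, row in enumerate(features)
    ]
    return masked, hidden

def reconstruction_error(original, reconstructed, hidden):
    """Mean squared error computed over the hidden entries only."""
    if not hidden:
        return 0.0
    return sum((original[i][j] - reconstructed[i][j]) ** 2 for i, j in hidden) / len(hidden)

# Toy input: 3 samples with 4 features each (made-up values).
data = [[1.0, 0.5, 1.02, 0.0], [-0.6, -0.2, 0.98, -0.05], [-0.4, -0.3, 0.99, -0.03]]
masked, hidden = mask_features(data, mask_ratio=0.5, rng=random.Random(0))
```

During pre-training, a model would receive `masked` and be scored with something like `reconstruction_error` on the hidden positions; fine-tuning then swaps the reconstruction head for a task-specific one.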

We propose to divide the code of GridFMs into several interrelated modules to support life cycle management:

Module 1: The GridFMLab module will contain GridFM meta-architectures and enable the pretraining of models.

Module 2: The Grid Data Zoo will aggregate publicly available synthetic datasets and provide tools to feed this data to model pre-training. This module would also define a set of data interfaces and provide data pre-processing tools and algorithms that allow utilities and other actors of the energy sector to participate in the training with their own real data.

Module 3: The Foundational Grid Simulator module will provide simulation tools to train GridFMs. This module contains several electrical grid network specifications for a variety of topologies and control equipment. In addition, the module provides code for generating profiles of generation, load and interconnections. Technological choices for this simulator should allow tight integration with ML training loops.

Module 4: GridFM benchmark will provide a code base to monitor the performance of existing GridFMs. There are two potential directions: (1) orient the benchmark system toward ranking models, giving visibility to state-of-the-art (SOTA) models, or (2) orient it toward measuring the performance of models during their lifetime in operations. No decision has been made between these two directions.

Module 5: Federated learning for GridFM. A barrier for the training of grid FMs is limited access to a large quantity of diverse synthetic and real electrical data. Energy actors have different policies relative to data sharing: some choose to make some of their data public while others keep their data private for data-sensitivity reasons. Yet, large economic benefits are expected with the application of grid FMs. Thus, federated learning will enable the participation of these many entities for training GridFMs while respecting their data sharing constraints.

Module 6: Lifelong learning for GridFM would update GridFMs and avoid retraining from scratch when data changes or new data becomes available. Solutions to this hard problem would improve the economic benefits of the FM approach, although this module is part of the long-term vision only.

GridFM TAC figure 2

Figure 2 Potential modules of the GridFM project.
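The idea behind Module 5 can be sketched with federated averaging (FedAvg), in which each participant trains locally and shares only model parameters, never raw data. The proposal does not name a federated learning algorithm, so FedAvg is used here purely as a well-known example, and the function name and toy values are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Average parameter vectors, weighting each client by its local dataset size."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(weights[k] * size for weights, size in zip(client_weights, client_sizes)) / total
        for k in range(n_params)
    ]

# Two hypothetical utilities: the second holds three times as much local data.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Only the parameter vectors and dataset sizes leave each participant, which is what allows entities with strict data-sharing constraints to contribute to training.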

Is this a new project or an existing one?

This is a new project.

Current lead(s)

IBM, Hydro-Québec

Sponsoring organization(s), along with any other key contributing individuals and/or organizations

Potential sponsor/contributors, in alphabetical order: Argonne National Laboratory, DOE, Hydro-Québec, IBM.

Detail any existing community infrastructure, including:

None.

Are there any specific infrastructure needs or requests outside of what is provided normally by LF Energy ? If so please detail them.

No.

Why would this be a good candidate for inclusion in LF Energy?

The GridFM project would complement LFE’s existing portfolio in a coherent manner. First, existing LFE projects dealing with simulation could provide input data to pre-train a GridFM. Second, GridFM is a novel concept that does not yet exist at LFE. Finally, GridFMs target applications that serve LFE goals, such as efficient power flow solving or contingency analysis, to name a few.

How would this benefit from inclusion in LF Energy?

Training GridFMs requires data from numerous electrical networks in different countries, on different continents, and under different jurisdictions, and participating entities may have both common and conflicting interests. LF Energy would provide neutral ground for the needed collaboration. In that sense, the open-source software model is a good match for GridFM.

Another expected benefit is the networking and visibility that LFE would provide within the energy sector. Community building is a key ingredient to enable the emergence of GridFMs. IBM has already created the GridFM community and hosts monthly meetings with voluntary participants. For example, Hydro-Québec has been an active member of this community. Joining forces with LFE would potentially be transformative and benefit the entire community.

Provide a statement on alignment with the mission in the LF Energy charter.

The goals of the project align with LF Energy’s mission to support “open source and/or open standards projects relating to the generation, transmission, distribution and delivery of energy”.

What specific need does this project address?

The energy transition, with its widespread electrification and massive deployment of renewable and distributed generation, is the main driver of change in electric grids. It affects both the demand and supply sides of the market, and it is happening fast. Climate change, which leads to more frequent and more intense extreme weather events, poses additional challenges to system planners and operators.

In this context, operation, control, and planning of power systems will soon be pushed to their limits. As depicted in Figure 3, new computational methods and approaches are needed, capable of better tackling the challenges presented by increased uncertainty and complexity.

GridFM TAC figure 3

Figure 3 The energy transition, aging infrastructure, cybersecurity challenges, and climate change greatly increase complexity and uncertainty in operating, controlling, and planning the power grid. This is creating a widening gap between existing computational capabilities and the evolving needs of the electric power industry.

Describe how this project impacts the energy industry.

The broader vision of a GridFM is to enable as many energy-utility applications as possible. Even if early GridFMs are limited to a few applications, the FM approach is justified by the superior adaptability of FMs over other AI models: they can be fine-tuned for thousands of different utilities with their specific data and requirements. In this context, fine-tuning can mean different things. A GridFM may be fine-tuned to a set of diverse applications such as load forecasting, fault detection and others. Initially, this may be too ambitious and challenging, considering the diversity of downstream applications shown in Figure 4, each requiring different data, temporal scales, and pre-training strategies. But fine-tuning may also refer to tailoring a GridFM for a given application to a particular user's needs and requirements, using user-specific (and potentially private) data. For example, in the United States alone there are 3,200 utilities, each with its own private and siloed data and grid-specific requirements, that could directly profit from such a scheme.

GridFM TAC figure 4

Figure 4 Applications: Examples of power system problems to be solved with GridFMs.

Describe how this project intersects with other LF Energy projects/working groups/special interest groups.

The project is aligned with the LFE AI SIG. OpenSynth, PowSyBl and Dynawo are also related to the proposed training infrastructure.

Who are the potential benefactors of this project?

What other organizations in the world should be interested in this project?

Plan for growing in maturity if accepted within LF Energy

If accepted, the first steps will be:

  1. Implement an initial version GridFM-v0 in module 1 GridFMLab.
  2. GridFM-v0 will be pretrained on a synthetic power flow dataset created in module 2.
  3. Evaluate the performance of GridFM-v0 with module 4 GridFM Benchmark.

In parallel, the further development of industrial partnerships will help develop and test GridFMs for, and in, real-world applications. The LFE GridFM project will showcase the potential of the technology to a variety of interested parties.

Project license

Apache v2.0, except for Module 1, which will be under the Mozilla Public License v2.0 (MPL v2.0).

Is the project's code available now? If so provide a link to the code location.

Initial repositories exist in private GitHub spaces and can be forked once the LFE project is started.

Does this project have ongoing public (or private) technical meetings?

Monthly technical meetings are held within the GridFM community. Attendance is by invitation.

Does this project's community venues have a code of conduct? If so, please provide a link to it?

Not yet

Describe the project's leadership team and decision-making process.

No formal structure in place.

Does this project have public governance (more than just one organization)?

The GridFM community is now being structured with a governance subgroup comprising members representing the interests of different stakeholders.

Does this project have a development schedule and/or release schedule?

No.

Does this project have dependencies on other open source projects? Which ones?

Initially: PyTorch, pandapower.

Describe the project's documentation.

Not applicable.

Describe any trademarks associated with the project.

No.

Do you have a project roadmap? If so please attach or provide a link.

The four phases foreseen for the development of GridFM-v0, depicted in Figure 5, are discussed below.

GridFM TAC figure 5

Figure 5 GridFM implementation road map with four near-term phases to develop GridFM-v0, a GridFM for power flow related applications. The long-term goal is creating a family of GridFMs for a broader range of grid challenges.

Phase 1: Identify, collect and generate training data.

Training GridFM-v0 requires a large collection of solved power flow problems for different grid topologies, parameters, and load conditions. In recent years, large power grid datasets have been made available to the community, created by (1) collecting grid topology and measurement data, (2) generating varying synthetic load conditions per grid model and (3) using established solvers to compute a wide range of power flow solutions under different operational conditions. We aim to use the dataset provided by https://github.com/PowerGraph-Datasets, which contains power flow solutions under real load conditions on 4 different grid topologies. To complement this dataset, we will additionally solve diverse power flow problems using traditional power flow solvers, with power grid topologies from PGLIB-OPF and standard IEEE benchmarks, and with topology and load perturbations. Our model will thus be pre-trained on data from realistic topologies with a broad range of operational conditions, which should help it generalize well across topologies. For model fine-tuning, datasets from https://github.com/PowerGraph-Datasets for system security analyses and https://github.com/AI4OPT for OPF will be used. The latter contains 300,000 solved optimal power flow problems for each of ten different grid topologies, including solutions for N-1 contingency analysis, with grid sizes ranging from 14 to 14,000 buses.
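As a rough illustration of the "topology and load perturbations" mentioned above, the sketch below scales each load by a random factor and removes one line, provided the grid stays connected. The function names, the ±20% load spread, and the three-bus toy grid are assumptions for illustration; in practice each perturbed scenario would then be passed to a power flow solver, which is omitted here.

```python
import random

def is_connected(n_buses, lines):
    """Return True if every bus can be reached from bus 0 over the given lines."""
    neighbours = {k: set() for k in range(n_buses)}
    for i, j in lines:
        neighbours[i].add(j)
        neighbours[j].add(i)
    seen, stack = {0}, [0]
    while stack:
        for nxt in neighbours[stack.pop()] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return len(seen) == n_buses

def perturb_scenario(n_buses, lines, loads, rng, load_spread=0.2):
    """Scale each load randomly and drop one line whose removal keeps the grid connected."""
    scaled = [p * rng.uniform(1 - load_spread, 1 + load_spread) for p in loads]
    candidates = [
        k for k in range(len(lines))
        if is_connected(n_buses, lines[:k] + lines[k + 1:])
    ]
    if candidates:
        drop = rng.choice(candidates)
        new_lines = lines[:drop] + lines[drop + 1:]
    else:
        new_lines = list(lines)  # every line is critical: keep the topology unchanged
    return new_lines, scaled

# Toy three-bus ring with two loads (made-up values, in per unit).
example_lines = [(0, 1), (0, 2), (1, 2)]
new_lines, new_loads = perturb_scenario(3, example_lines, [0.5, 1.0], random.Random(1))
```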

Phase 2: Model architecture development.

GridFM TAC figure 6

Figure 6 GridFM-v0 is pre-trained to reconstruct masked power flow data from input graphs.

In the second phase, suitable architectures will be evaluated by analyzing pre-training performance and by adjusting models, training strategies, and loss functions. Figure 6 illustrates the Masked Auto-Encoder (MAE) pre-training task of GridFM-v0. The grid is modeled as a graph in which nodes represent electrical buses, with active power, reactive power, voltage magnitude, and voltage angle (p_i, q_i, v_i, δ_i) as node variables, and edges represent transmission lines and transformers. We opted for this graph representation of power systems, but others can also be considered. We train an autoencoder in a self-supervised manner to reconstruct masked node features from power flow solutions, each corresponding to a specific topology and load condition. This effectively amounts to approximating the power flow solution. During training, the loss function ensures meaningful reconstruction of node variables within the physical constraints. The total loss L = L_SCE + L_powerflow combines the Scaled Cosine Error (SCE), which minimizes the distance between the original and reconstructed graphs, and L_powerflow, which penalizes violations of the power flow equations.
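The two loss terms can be sketched in plain Python. The proposal does not give their exact form, so the following are assumptions: the SCE is written as the mean of (1 − cosine similarity)^γ over node feature vectors, and L_powerflow as the mean squared mismatch of the AC power balance S_i = V_i · conj(Σ_j Y_ij V_j), with injections counted as positive. The function names, γ = 2, equal weighting of the two terms, and the two-bus test grid are illustrative only.

```python
import cmath
import math

def scaled_cosine_error(originals, reconstructions, gamma=2.0):
    """Mean of (1 - cosine similarity)^gamma over node feature vectors."""
    total = 0.0
    for x, z in zip(originals, reconstructions):
        dot = sum(a * b for a, b in zip(x, z))
        norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in z))
        total += max(0.0, 1.0 - dot / norm) ** gamma  # clamp guards against rounding
    return total / len(originals)

def injections_from_voltages(voltages, ybus):
    """Build node tuples (p, q, v, delta) that satisfy the power flow equations."""
    volts = [cmath.rect(v, delta) for v, delta in voltages]
    nodes = []
    for i, (v, delta) in enumerate(voltages):
        current = sum(ybus[i][j] * volts[j] for j in range(len(volts)))
        s = volts[i] * current.conjugate()
        nodes.append((s.real, s.imag, v, delta))
    return nodes

def power_flow_residual(nodes, ybus):
    """Mean squared mismatch between stated and computed complex power injections."""
    volts = [cmath.rect(v, delta) for (_, _, v, delta) in nodes]
    total = 0.0
    for i, (p, q, _, _) in enumerate(nodes):
        current = sum(ybus[i][j] * volts[j] for j in range(len(nodes)))
        s_calc = volts[i] * current.conjugate()
        total += abs(complex(p, q) - s_calc) ** 2
    return total / len(nodes)

def total_loss(originals, reconstructions, ybus):
    """L = L_SCE + L_powerflow, with both terms weighted equally (an assumption)."""
    return scaled_cosine_error(originals, reconstructions) + power_flow_residual(reconstructions, ybus)

# Hypothetical two-bus grid joined by one line with impedance 0.01 + 0.1j p.u.
y = 1 / complex(0.01, 0.1)
ybus = [[y, -y], [-y, y]]
truth = injections_from_voltages([(1.0, 0.0), (0.98, -0.05)], ybus)
```

A reconstruction equal to `truth` yields a loss of essentially zero, while perturbing any node variable raises both terms, which is the behavior the combined loss is meant to encourage.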

Phase 3: Model scaling and validation on industry data

Due to regulatory constraints, gaining full access to sensitive data from operational power grids is unlikely, or at least challenging. Yet, to ensure the model’s accuracy and relevance in practical applications, it is imperative that a GridFM be validated on operational data and real grid topologies. To address these challenges, an iterative approach is adopted. GridFM-v0 will initially be pre-trained and tested on openly available data. Subsequently, it will be retrained and validated in collaboration with electricity utility partners. The results of these validations will be used to update the architecture, scale, and parameters of the pre-trained model. Once model pre-training yields satisfactory performance after retraining on operational data, the pre-trained model will be made available to the community for developing specific power grid applications. This approach ensures that the GridFM remains performant and aligned with industry needs without exposing proprietary data or pre-training the model on it.

Phase 4: Implement power flow-based use cases.

GridFMs are anticipated to achieve a computational speed-up of 3 to 4 orders of magnitude over traditional power flow solvers, as stated previously. This will benefit various applications:

(1) Complementing Numerical Power Flow Solvers. For applications that require solving power flow frequently, and where speed is valued over accuracy, linear approximations (such as DC power flow) are typically used. For these applications, faster AI-based power flow solvers offer a trade-off between “traditional” DC power flow and AC power flow solutions. We expect these models to need minimal fine-tuning, as the task is equivalent to the proposed pre-training reconstruction task.

(2) Contingency Analysis. Contingency analysis is essential for assessing grid resilience. On a grid with 1,000 lines, simulating the loss of each individual line requires solving 1,000 distinct power flow problems. When considering scenarios that involve the simultaneous loss of any two lines, the number of power flow analyses rises dramatically to C(1000, 2) = 499,500. With a power flow solver that is 1,000 times faster, operators will be able to perform contingency analyses across a much larger set of scenarios. For example, solving 499,500 scenarios with AI-based methods would take roughly the same time as solving only about 500 scenarios with conventional physics-based solvers. GridFM-v0 will therefore greatly enhance grid analyses and contribute to operating and planning under uncertainty. Because the power grid is continuously evolving, GridFM-v0 must be able to account for different network topologies and different operating conditions of the same grid. This can be accomplished by training the model on grids with topological perturbations, or by additionally training the same model on entirely different grid topologies.

(3) Optimal Power Flow (OPF). The authors of CANOS ([L. Piloto, S. Liguori, S. Madjiheurem, M. Zgubic, S. Lovett, H. Tomlinson, S. Elster, C. Apps, S. Witherspoon, CANOS: A fast and scalable neural AC-OPF solver robust to N-1 perturbations](https://arxiv.org/abs/2403.17660) (2024), arXiv:2403.17660) trained a model on synthetic data and demonstrated an OPF speedup of 2 to 4 orders of magnitude while maintaining errors of less than 1% in the objective value. Solving the OPF translates into fine-tuning the GridFM for reconstructing the generator power to minimize generation costs. Reconstructed node variables must adhere to bounds (e.g., generator capacity constraints), making this problem challenging. However, GridFM-v0 can be hierarchically integrated into optimization schemes where generator power is iteratively adjusted, solution feasibility is checked, and other variables are determined using the power flow solver.
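The contingency counts quoted in use case (2) can be checked with a few lines of arithmetic. The function name is hypothetical, and the 1,000x speed-up is the figure assumed in the text rather than a measured result.

```python
import math

def contingency_cases(n_components, k):
    """Number of distinct scenarios in which exactly k components are lost."""
    return math.comb(n_components, k)

n1_cases = contingency_cases(1000, 1)  # single-line outages
n2_cases = contingency_cases(1000, 2)  # simultaneous loss of any two lines

# With an assumed 1,000x faster solver, the N-2 sweep costs about as much
# as this many conventional power flow solves.
assumed_speedup = 1000
equivalent_conventional_solves = n2_cases / assumed_speedup
```

Here `n2_cases` evaluates to 499,500 and `equivalent_conventional_solves` to 499.5, which matches the "about 500 scenarios" stated above.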

Are this project's roadmap and meeting minutes publicly posted?

The project roadmap has been published in a scientific article. The meeting minutes are stored on IBM cloud storage and are available to the GridFM community.

Does this project have a legal entity and/or registered trademarks?

No

Has this project been announced or promoted in any press?

No

Does this project compete with other open source projects or commercial products?

To our knowledge, there is no equivalent open source project or commercial product.

yarille commented 3 weeks ago

Approved via TAC vote on 10/29.