lf-energy / tac

LF Energy TAC meeting information and processes
https://wiki.lfenergy.org/display/HOME/Technical+Advisory+Council
Creative Commons Attribution 4.0 International

OpenSynth (Synthetic Energy Data) #68

Open anguschadney opened 8 months ago

anguschadney commented 8 months ago

Mission Statement

Accelerating global energy systems research with open access to synthetic data

Description

Access to raw smart meter data is essential for energy research, to enable a rapid and successful transition to a fully renewable energy system. Very little of the data that exists is available for research purposes, primarily to protect consumer privacy. Synthetic data can solve this problem.

Centre for Net Zero has created a tool called Faraday that can generate synthetic half-hourly consumption data (i.e. smart meter data) for various household archetypes (e.g. by LCT ownership, season, etc.). Faraday is currently in its Alpha phase and is being used by commercial companies (ARUP) and research organisations (University of Birmingham, University of Oxford) for their research.

Smart meter data is incredibly valuable for research organisations and utility companies that want to understand and predict consumption at a household level, and to build models and data products on top of it. Our alpha product currently has 57 engaged users across 14 organisations, giving a clear signal about the utility of the data.

Previously, only aggregated data was available at scale due to privacy issues. Synthetic data gets around this issue by being completely artificially generated, while critically sharing the same statistical attributes as the underlying real-world data.
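To make "sharing the same statistical attributes" concrete, here is a minimal sketch in Python. It is not Faraday: the toy "real" data, the per-slot Gaussian generator, and all function names are assumptions made up for illustration. It fits the mean and standard deviation of each half-hourly slot on toy household profiles, samples entirely new profiles from those statistics, and checks that the mean daily profile is preserved.

```python
import random
import statistics

SLOTS_PER_DAY = 48  # half-hourly readings in one day


def make_toy_real_profiles(n_households, rng):
    """Toy stand-in for real smart meter data (kWh per slot), with an evening peak."""
    profiles = []
    for _ in range(n_households):
        scale = rng.uniform(0.5, 1.5)  # household size factor (hypothetical)
        day = []
        for slot in range(SLOTS_PER_DAY):
            base = 0.2 + (0.6 if 34 <= slot <= 42 else 0.0)  # peak from 17:00 to 21:30
            day.append(max(0.0, scale * base + rng.gauss(0.0, 0.05)))
        profiles.append(day)
    return profiles


def fit_slot_stats(profiles):
    """Return the per-slot mean and standard deviation across households."""
    means, stdevs = [], []
    for slot in range(SLOTS_PER_DAY):
        values = [day[slot] for day in profiles]
        means.append(statistics.fmean(values))
        stdevs.append(statistics.stdev(values))
    return means, stdevs


def sample_synthetic(means, stdevs, n_households, rng):
    """Draw artificial profiles from the fitted statistics; no real record is copied."""
    return [
        [max(0.0, rng.gauss(m, s)) for m, s in zip(means, stdevs)]
        for _ in range(n_households)
    ]


def max_mean_gap(profiles_a, profiles_b):
    """Largest absolute difference between the two mean daily profiles."""
    means_a, _ = fit_slot_stats(profiles_a)
    means_b, _ = fit_slot_stats(profiles_b)
    return max(abs(a - b) for a, b in zip(means_a, means_b))


if __name__ == "__main__":
    rng = random.Random(0)
    real = make_toy_real_profiles(500, rng)
    means, stdevs = fit_slot_stats(real)
    synthetic = sample_synthetic(means, stdevs, 500, rng)
    print(f"max gap between mean profiles: {max_mean_gap(real, synthetic):.3f} kWh")
```

Sampling each slot independently keeps the per-slot statistics but discards correlations between slots, which is one reason a realistic generator needs a richer model than this sketch.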

Our vision is to accelerate our effort to liberate smart meter data for research and innovation globally by creating an open source community of synthetic smart meter data contributors and users. We want this community to empower all holders of raw smart meter data around the world to be able to generate and share synthetic data.

Our proposed goals for the community are:

  1. To define what good synthetic data looks like, and to develop the tools and algorithms to measure this
  2. To contribute to a code repository of algorithms that is vetted against the procedures developed in 1)
  3. To contribute to a data repository of synthetic smart meter data that is generated using the algorithms in 2)

The third goal is our ultimate aim, but 1 and 2 are necessary for its success.
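As one illustration of what a measurement tool for goal 1 could look like, the sketch below scores a synthetic dataset by the 1-D Wasserstein distance between the distributions of daily total consumption in the real and synthetic data. The choice of metric, the function names, and the equal-sample-size restriction are assumptions for illustration only, not the procedures this community would necessarily adopt.

```python
def daily_totals(profiles):
    """Sum each household-day profile (a list of kWh readings) into one daily total."""
    return [sum(day) for day in profiles]


def wasserstein_1d(xs, ys):
    """1-D Wasserstein (earth mover's) distance between two equal-sized samples.

    For equal-sized empirical samples this is the mean absolute difference
    between the sorted values.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def fidelity_score(real_profiles, synthetic_profiles):
    """Lower is better; 0.0 means the daily-total distributions are identical."""
    return wasserstein_1d(daily_totals(real_profiles), daily_totals(synthetic_profiles))


if __name__ == "__main__":
    real = [[1.0] * 48, [2.0] * 48]
    shifted = [[1.5] * 48, [2.5] * 48]  # every reading is 0.5 kWh too high
    print(fidelity_score(real, real))     # identical data
    print(fidelity_score(real, shifted))  # daily totals differ by 24 kWh
```

A single number like this only covers one attribute of the data. A fuller evaluation would likely also need checks on temporal structure and on privacy, for example confirming that no synthetic profile sits too close to a real one.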

Is this a new project or an existing one?

Existing

Current lead(s)

Sheng Chai, Gus Chadney, Gareth Jones

Sponsoring organization(s), along with any other key contributing individuals and/or organizations

Centre for Net Zero, Octopus Energy

Detail any existing community infrastructure, including:

No infrastructure for the open-source project yet.

Infrastructure for internal Faraday tooling:

Are there any specific infrastructure needs or requests outside of what is provided normally by LF Energy ? If so please detail them.

No

Why would this be a good candidate for inclusion in LF Energy?

Privacy concerns are incredibly hard to overcome without losing many of the granular features that are necessary to build accurate models. We think that synthetic data is the best way to generate household-level data at scale in order to build the models that will allow us to make the future energy system a reality.

Synthetic data could be used to develop and test much of the technology that will be built to support LF Energy’s mission to facilitate rapid decarbonisation of the global grid.

By open sourcing the algorithms to generate and validate the synthetic data, and building a community to drive adoption and ideate usage applications, we will be accelerating the technological advancements that will be critical to the transition.

How would this benefit from inclusion in LF Energy?

We believe that this project will benefit from LF Energy's expertise in creating and managing communities to bring together the generators and consumers of synthetic data.

Additionally, LF Energy will help us build the management framework to ensure our quality controls and governance are robust enough to instil confidence in the data and applications.

Finally, LF Energy will help us navigate and implement the correct licenses for usage of the software and data.

Provide a statement on alignment with the mission in the LF Energy charter.

This project will develop the technology and standards for synthetic data, which will be used to improve knowledge and build applications in all aspects of the energy system. This aligns with LF Energy’s mission to support “open source and/or open standards projects relating to the generation, transmission, distribution and delivery of energy”.

What specific need does this project address?

This project seeks to address the complete lack of publicly available synthetic energy data for use in research and to drive innovation in the energy technology sector.

Describe how this project impacts the energy industry.

The energy industry will benefit from this community’s mission to develop reliable, accurate and representative synthetic data, which companies can use in-house for their own technology development and research. They will also have the tools to generate their own synthetic data, adding it to the open source pool where appropriate.

Describe how this project intersects with other LF Energy projects/working groups/special interest groups.

This project is closely aligned with the SIG "AI and Energy Systems". It is an application of AI and energy data that will be able to support other AI projects related to energy systems. It will be a critical part of energy systems models, especially the bottom-up category.

There is also close alignment with the initiative by Georgia Tech in Atlanta to set up a real transmission grid model with synthetic hourly consumption data. That initiative is generating data at the bus level to look at optimisation of topology on the network. Whilst it is focused on a single use case and ours covers synthetic data more generally, there is scope for it to become part of our wider group and help advance the field as a whole.

Who are the potential beneficiaries of this project?

Directly:

Indirectly:

What other organizations in the world should be interested in this project?

Plan for growing in maturity if accepted within LF Energy

If accepted, we would seek to implement the following:

  1. Develop an open source data repository of synthetic data that others can contribute to
  2. Make our synthetic data algorithm repository open source by implementing LF Energy best practices
  3. Establish and grow the core community by reaching out to potential partners, leveraging LF Energy’s connections and introductions

Project license

N/A

Is the project's code available now? If so provide a link to the code location.

There is no code for this open-source project yet. However, we have done some initial research and project work on our Faraday synthetic data model in a private GitHub repository.

Does this project have ongoing public (or private) technical meetings?

Yes - private

Do this project's community venues have a code of conduct? If so, please provide a link to it.

No

Describe the project's leadership team and decision-making process.

Does this project have public governance (more than just one organization)?

No

Does this project have a development schedule and/or release schedule?

Yes

Does this project have dependencies on other open source projects? Which ones?

No, but we use the following open-source tools and data: 1) PyTorch, 2) scikit-learn and other standard data science libraries, 3) an existing public smart meter dataset (Low Carbon London).

Describe the project's documentation.

Readme in GitHub repository. No other documentation tools built yet.

Describe any trademarks associated with the project.

We use the unregistered word mark “Faraday” in relation to our synthetic data models, algorithms, and tools. Centre for Net Zero will continue to use this mark. We have not yet decided whether this mark will be used more widely as part of this project.

Do you have a project roadmap? If so please attach or provide a link.

We do not have a roadmap for this open-source project yet, but we do have an internal roadmap for our Faraday tool.

Are this project's roadmap and meeting minutes publicly posted?

No

Does this project have a legal entity and/or registered trademarks?

This project is owned by Centre for Net Zero Limited, a private limited company registered in England and Wales (13389273) and part of the Octopus Energy Group of companies.

Has this project been announced or promoted in any press?

Faraday, the smart meter generative model we’ve built, has been announced on CNZ’s website and tech blog.

However, this open-source community project for contributors and users of synthetic smart meter data has not been announced anywhere.

Does this project compete with other open source projects or commercial products?

No

yarille commented 5 months ago

Project approved by TAC vote on 1/9