Green-Software-Foundation / hack

Carbon Hack 24 - The annual hackathon from the Green Software Foundation
https://grnsft.org/hack/github
Parameter Expansion for Testing Varying Assumptions #129

Open josh-swerdlow opened 6 months ago

josh-swerdlow commented 6 months ago

Prize category

Best Plugin

Overview

Our project will expand the parameter space of a given pipeline across different assumptions or plugins, allowing users to experiment with or optimize for certain values or plugins without creating and managing multiple manifest files.

Problem Statement:

Currently, testing different assumptions (or plugins) in a manifest file involves executing separate versions, saving results, and manually comparing them. This process is manual, error-prone, and inefficient.

Proposed Solution:

Our idea is to enable users to include multiple contrasting assumptions in one manifest file, generating a consolidated output for comparison. This would eliminate the need for repetitive testing and simplify the analysis of varying assumptions. We intend to do this by creating a “branch” plugin that copies all of the values of the current manifest's observations once per alternative value of a specified assumption, alters that assumption in each copy, and then continues processing the new manifest, which holds N copies of the original observations for N alternative values.

Our solution would play natively with the IF and only add some intelligence to do a specific set of copy and edit operations on the observations. Nothing else would change from a user's pipeline.
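As a rough illustration of the copy-and-edit step described above, a minimal TypeScript sketch might look like the following. The `Observation` shape, the `branch` function, the `branch` tag field, and the parameter name are all assumptions made for this example; they are not the plugin's actual interface.

```typescript
// Hypothetical shape of one observation; real IF inputs carry more fields.
type Observation = Record<string, string | number>;

// Copy every observation once per alternative value of `parameter`,
// tagging each copy so the scenarios can be told apart later.
function branch(
  observations: Observation[],
  parameter: string,
  alternatives: Array<string | number>,
): Observation[] {
  return alternatives.flatMap((value, index) =>
    observations.map((obs) => ({
      ...obs,
      [parameter]: value,
      branch: index, // hypothetical tag, not an IF field
    })),
  );
}

// Illustrative inputs; the values are made up.
const inputs: Observation[] = [
  { timestamp: "2024-03-01T00:00:00Z", duration: 3600, energy: 0.5 },
  { timestamp: "2024-03-01T01:00:00Z", duration: 3600, energy: 0.7 },
];

// Two observations and three assumptions give six observations.
const branched = branch(inputs, "grid-carbon-intensity", [300, 500, 800]);
```

The original observations are left untouched, so the rest of the pipeline only ever sees a longer list of ordinary observations.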

Example Scenario:

Imagine a user wanting to estimate grid carbon intensity for their data center's region. With this plugin, they could specify three estimation methods at once: a fixed value representing an average on the high-end estimate, a fixed value representing an average on the low-end estimate, and estimations from the WattTime plugin. Each method of estimation would produce its own result set, revealing the impact of that assumption on the final outcome.

For instance, if the high-end and low-end estimates differ by an order of magnitude or more yet yield very similar results for total emissions, one might be able to infer that this particular assumption isn’t the most critical parameter in the overall calculation. If – on the other hand – the high-end and low-end estimates only differ by a few grams of CO2 per kWh but lead to vastly different results for total emissions, one might be able to infer that the accuracy of the final result depends highly on the accuracy of this particular assumption, and therefore extra care should be taken to use the best estimate available for carbon intensity.
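The first case above can be made concrete with a small worked example. The sketch below assumes a simple model in which total emissions are energy times carbon intensity plus a fixed embodied term; the function names and all numbers are hypothetical and chosen only to show the effect.

```typescript
// Assumed model: total = energy (kWh) * intensity (gCO2e/kWh) + embodied (gCO2e).
function totalEmissions(energyKwh: number, intensity: number, embodied: number): number {
  return energyKwh * intensity + embodied;
}

// Ratio of the largest to the smallest total across scenarios.
function spread(totals: number[]): number {
  return Math.max(...totals) / Math.min(...totals);
}

// Low-end and high-end intensity estimates, an order of magnitude apart.
const intensities = [50, 500];

// Workload dominated by embodied emissions: totals are 100100 and 101000.
const embodiedDominated = intensities.map((i) => totalEmissions(2, i, 100000));

// Workload dominated by energy use: totals are 101000 and 1001000.
const energyDominated = intensities.map((i) => totalEmissions(2000, i, 1000));
```

In the embodied-dominated case the two totals differ by less than 1%, so the carbon intensity assumption barely matters; in the energy-dominated case they differ by almost a factor of ten, so the assumption deserves the best estimate available.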

Additional Benefits:

By allowing users to "branch" (i.e. compute multiple different scenarios at once) within a single manifest, our plugin would enhance the IF's versatility and efficiency. Users would be able to explore diverse assumptions and assess their impact on software environmental metrics without extensive manual effort.

This is not limited to observation parameters; it could also be used to alter which plugins are used at a specific step, to test different calculation methods at the same time.

Questions to be answered

  1. Aside from manual operations, is this already possible?

  2. Are there other scenarios where one could see dynamic ‘branching’ be helpful?

  3. How extensively has the IF been tested for scale? Each time a user branches, the number of observations is multiplied by the number of branches created, so repeated branching could quickly grow too large.

  4. After a plugin processes the specified observations from a manifest, it writes new observations that are available to the next plugin.

    • What is the extent of the manifest’s fields that are able to be read and written between plugins?
    • Specifically, could you change a given child's pipeline of plugins on-the-fly? (assume that every plugin required is already initialized).
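To put rough numbers on the scaling concern in question 3, here is a small sketch. It assumes each branch step multiplies the number of observations by the number of alternatives it introduces, which is our reading of how repeated branching would compose rather than a measured result; `branchedCount` is a hypothetical helper.

```typescript
// Number of observations after applying several branch steps in sequence,
// assuming each step multiplies the count by its number of alternatives.
function branchedCount(baseObservations: number, branchSizes: number[]): number {
  return branchSizes.reduce((count, size) => count * size, baseObservations);
}

// 1000 observations branched three times, with 3, 4 and 5 alternatives.
const afterThreeBranches = branchedCount(1000, [3, 4, 5]); // 60000
```

Three modest branch steps already turn 1,000 observations into 60,000, which is why the question of scale matters.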

Have you got a project team yet?

Yes and we aren't recruiting

Project team

atwoosnam josh-swerdlow

andrew-woosnam commented 6 months ago

Welcoming feedback if anyone has any!

russelltrow commented 6 months ago

Hi @josh-swerdlow please don't forget to register your project: https://hack.greensoftware.foundation/register/

This provides you direct access to the Impact Framework team for your questions and also benefits from our community partners (Microsoft & Electricity Maps).

You must register your project before you can submit your solution for judging.

jmcook1186 commented 6 months ago

Aside from manual operations, is this already possible?

No, this is not something we currently support.

Are there other scenarios where one could see dynamic ‘branching’ be helpful?

This could be useful for scenario testing and forecasting, and therefore for decision making.

How extensively has the IF been tested for scale? If we allow a user to branch multiple times, it increases the dimension of the observations by the size of the branches created and could quickly scale too big.

We have orgs using very large manifests (hundreds of thousands of lines) without any issues, but we have not explored branching models like the one you are suggesting. The bottlenecks are generally in manifests that use plugins that hit third party APIs.

After a plugin processes the specified observations from a manifest, it writes new observations that are available to the next plugin. What is the extent of the manifest’s fields that are able to be read and written between plugins? Specifically, could you change a given child's pipeline of plugins on-the-fly? (assume that every plugin required is already initialized).

Every plugin has access to the full outputs array, and the plugins execute in series, meaning that if one plugin modifies or adds a value, all the plugins that follow it in the pipeline have access to the new/modified value.
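A minimal sketch of that serial behaviour is shown below. It uses hypothetical plugin functions and a made-up `Row` shape rather than the real IF plugin interface, and the coefficients are illustrative only.

```typescript
// Hypothetical row shape; real IF observations carry more fields.
interface Row {
  "cpu-hours": number;
  energy?: number; // kWh, added by the first plugin
  carbon?: number; // gCO2e, added by the second plugin
}

type Plugin = (rows: Row[]) => Row[];

// Illustrative coefficients only, not values from any real plugin.
const addEnergy: Plugin = (rows) =>
  rows.map((row) => ({ ...row, energy: row["cpu-hours"] * 0.25 }));

const addCarbon: Plugin = (rows) =>
  rows.map((row) => ({ ...row, carbon: (row.energy ?? 0) * 400 }));

// Run the plugins one after another; each receives the previous output.
function runPipeline(rows: Row[], plugins: Plugin[]): Row[] {
  return plugins.reduce((current, plugin) => plugin(current), rows);
}

const result = runPipeline([{ "cpu-hours": 4 }], [addEnergy, addCarbon]);
// result[0] is { "cpu-hours": 4, energy: 1, carbon: 400 }
```

Because `addCarbon` reads the `energy` value written by `addEnergy`, swapping the order of the two plugins would give a carbon value of 0 in this sketch.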

josh-swerdlow commented 6 months ago

Done. Thank you for flagging!

josh-swerdlow commented 6 months ago

Summary (100 words)

The Branch Plugin automates the comparison of different assumptions within a single manifest file, eliminating repetitive manual work. By creating multiple “branches” that alter specific assumptions, it integrates seamlessly with the Impact Framework without affecting its transparency, verifiability, flexibility, modularity, or neutrality, while simultaneously enhancing efficiency and versatility. Users can explore various scenarios and their environmental impact effortlessly, with a single run of ie.

Problems (200 words)

Describe the problems the solution addresses

Exploring scenarios with different parameters would require the user to either (a) manually copy and paste observations until all scenarios are listed as inputs or (b) write a script to do the same.

Option (a) is error-prone and impractical for large-scale manifests, or for a future where manifests are streamed from an exporter for semi-live analysis or actions.

Option (b) does not let the user alter parameters partway through the pipeline. One would need to execute the pipeline up to the parameter they want to change, then perform option (a), and finally continue running the rest of the pipeline with a new manifest. For the same reasons, this isn’t a responsible approach either.

Neither option is particularly auditable by another user or versatile for the developer.

Application (200 words)

Describe what the solution actually does

Our solution lets the user explore a larger space of parameters without fundamentally changing how the pipeline would behave on a smaller space of parameters. Once the branch plugin is executed in the pipeline, it duplicates all the inputs, creates a branch for every parameter value specified in the component config, and finally regroups the observations to organize the branches created. Any subsequent plugin can then run on the new set of observations, which includes multiple user-created scenarios. Since everything is done in the same pipeline, exporters and future graph/monitoring plugins will be able to export or display every scenario in a single file or a single pane of glass.
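The duplicate, branch, and regroup steps can be sketched for several parameters at once as follows. This is only an approximation under stated assumptions: the `BranchConfig` shape, the scenario labels, and the function names are hypothetical and do not reflect the plugin's real config format or output layout.

```typescript
type Value = string | number;
type Observation = Record<string, Value>;
// Hypothetical config shape: parameter name mapped to its alternative values.
type BranchConfig = Record<string, Value[]>;

// Every combination of one value per parameter (a Cartesian product).
function combinations(config: BranchConfig): Observation[] {
  return Object.entries(config).reduce<Observation[]>(
    (partial, [name, values]) =>
      partial.flatMap((combo) => values.map((v) => ({ ...combo, [name]: v }))),
    [{}],
  );
}

// Duplicate the inputs once per combination and group each copy
// under a readable scenario label.
function branchAndGroup(
  inputs: Observation[],
  config: BranchConfig,
): Map<string, Observation[]> {
  const groups = new Map<string, Observation[]>();
  for (const combo of combinations(config)) {
    const label = Object.entries(combo)
      .map(([key, value]) => `${key}=${value}`)
      .join(",");
    groups.set(label, inputs.map((obs) => ({ ...obs, ...combo })));
  }
  return groups;
}

// Two parameters with two values each give four scenarios.
const scenarios = branchAndGroup(
  [{ energy: 1 }, { energy: 2 }],
  { intensity: [300, 800], region: ["eu-west", "us-east"] },
);
```

Keeping each scenario in its own group is what lets later aggregation or export steps treat the scenarios separately instead of summing across them.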

Prize category (- words)

Best Plugin

Judging criteria (200 words)

Explain how what you built meets the judging criteria for your prize category

Those who might use this are trying either to 'reduce carbon to X' or to find out 'where can we reduce emissions most'. Our plugin allows both groups to explore what tweaks they can make within their IF pipelines, whether to get to the bottom line or to find the biggest laggards in their environments. For this to occur, the rest of the IF needs to be fleshed out to support as many cloud services as possible and to export as easily as possible. Our plugin expands how users can explore assumptions without touching any specific import or export logic. Given the plethora of submissions that add importers for more cloud services and UI/readability/graphical outputs, we think this is the plugin that hits the sweet spot and will 10x developer productivity while using the IF.

Depending on the pipelines that support measurement of the software ecosystem and environments, this plugin could throw the door wide open and allow users to see, easily and without error, how their observations would differ in another cloud, another region, or another platform.

Our plugin at its core does a fancy duplication of the input space. It is very simple, but allows for so much while not affecting the underlying pipeline.

Note: We suggest the user use the group-by plugin in the pipeline for the branched parameter to ensure this doesn’t cause issues for aggregation.

Video (- words)

if-branch-plugin.mp4

Artefacts (- words)

https://github.com/atwoosnam/if-branch-plugin

Usage (- words)

Link to usage instructions if applicable

https://github.com/atwoosnam/if-branch-plugin

Process (150 words)

Describe how you developed the solution

After ideation and deciding on this idea, we began to architect how it would behave. A number of features could not be included in this version, and we settled on the MVP presented here. The features left out include the ability to branch one's pipeline and the ability to use function hooks, instead of static values (numeric, string, etc.), as parameters to branch on.

Inspiration (150 words)

Tell us what inspired you to develop the solution

Andrew and I brainstormed a number of ideas after playing with the IF for a while. Ultimately this idea came up, and even after discussing all the other options, this plugin felt like it added fundamentally important capabilities to the IF. I (Josh) have a background in physics research, where being able to optimize parameters or search parameter spaces without writing your own scripts to re-run the code and aggregate the results is massively helpful in the scientific exploration process. Many of the other ideas we were interested in had already been submitted, and with the intent of making the IF as great as possible we did our best to find a unique idea. We were not concerned with prize categories, just with the problems we found while playing with the IF and what we would like to see.

Challenges (150 words)

Share the challenges you ran into

The IF was not the easiest to get running initially, but once our environments were working, the biggest challenge was probably integration testing. We ran into some issues with our internal group-by logic and had to bypass it, since it didn’t play well with every plugin. The workaround is to add the group-by plugin externally when your pipeline needs it. Additionally, it was difficult to come up with specific use cases. The branch plugin is really simple, but thinking of all the ways one could try to use it and ensuring it behaved nicely in each was difficult. We plan to continue developing it until it is stable and thoroughly vetted enough to be placed in the if-unofficial repo.

Accomplishments (150 words)

Share what you are most proud of

I’ve lurked on the GSF git and message boards for quite a while, and I’m really proud that I finally posted on some of the issues and began engaging with the community. This hackathon pushed me over the edge, and my collaborator helped keep me committed. Both of us are really happy that we finally played with the IF and learned what it does and the extent to which it can be used, now and in the future.

Learnings (150 words)

Share what you learned while hacking

I (Josh) am most happy that I got exposed to js/ts and to the IF as a tool. There are a number of other applications I intend to flesh out as personal projects, and this gives me a great jumping-off point. Both of us agree that learning to use the IF was a great takeaway, but even more so was seeing its potential impact and capacity for growth.

Finally, we learned how much effort it has taken, and will continue to take, from individuals and organizations to make this the tool it can be for every environment, operation, and type of developer.

What's next? (200 words)

How will your solution contribute long term to the Impact Framework eco-system

We believe that our plugin will let scientists and engineers explore their pipelines much more easily.

We want to solidify the work we’ve already done by getting it to work well in many different scenarios without any external workarounds and fully flesh out the extent of the possible use cases with more complicated examples and demos that visualize the power of branching.

We also hope to come back with an upgraded version that lets users not just vary a parameter over a fixed list of discrete values, but also use a generator function that creates a set of discrete values based on one or more other parameters.
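One way such a generator-driven branch could look is sketched below. It is purely a thought experiment for a feature that does not exist yet: `ValueSource`, `branchWith`, `aroundBaseline`, and the field names are all hypothetical.

```typescript
type Observation = Record<string, number>;

// A branch value source is either a fixed list or a function that
// derives the list from the observation being branched.
type ValueSource = number[] | ((obs: Observation) => number[]);

function branchWith(
  inputs: Observation[],
  parameter: string,
  source: ValueSource,
): Observation[] {
  return inputs.flatMap((obs) => {
    const values = typeof source === "function" ? source(obs) : source;
    return values.map((value) => ({ ...obs, [parameter]: value }));
  });
}

// Hypothetical generator: half, equal to, and double a baseline
// value already present on the observation.
const aroundBaseline = (obs: Observation): number[] => {
  const base = obs["baseline-intensity"] ?? 0;
  return [base / 2, base, base * 2];
};

const generated = branchWith(
  [{ "baseline-intensity": 400 }],
  "intensity",
  aroundBaseline,
);
// The three copies carry intensity 200, 400 and 800.
```

Accepting either a list or a function keeps the static case unchanged while letting the branch values depend on other parameters.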

josh-swerdlow commented 5 months ago

@jawache @russelltrow @jmcook1186

I see that the team is working on a plugin hub to house everyone's plugins. Is this going to replace if-unofficial and if-official?

Also, Andrew and I are curious whether our plugin has the potential to be a built-in or an official plugin. We believe it should be shipped in everyone's IF and would amplify every pipeline's capabilities. As such, we wanted to know where, and whether, we can have that discussion. It goes without saying that we will need to clean up the code and thoroughly flesh out testing to abide by either repository's standards.

jmcook1186 commented 5 months ago

Hi @josh-swerdlow our current plan is that the public registry will just be a webpage with links to people's plugins - the code itself stays in your own repository/npm package but we make them discoverable to other users via the registry.

We will be looking again at all the hackathon submissions and reaching out to projects whose plugins we might want to bring in to the main project code base. There are a lot of great projects but not all of them make sense to merge into the main IF source.

Agree your branching feature is very useful though - we just want to think over how it fits in with our feature roadmap and look at it in the context of the other submissions - bear with us, it might take @jawache and me a couple of weeks to get our heads around what we want to bring in, but we will be proactive in reaching out.

josh-swerdlow commented 5 months ago

Thanks, Joseph! Andrew and I would be excited to see this in builtin. While the core team thinks everything through, we will continue to improve on the functionality, documentation, etc.