cncf / tag-env-sustainability

🌳🌍♻️ TAG Environmental Sustainability
https://tag-env-sustainability.cncf.io/
Apache License 2.0

[Action] Automate scraping and generating a list of sustainability conferences and sessions #214

Closed guidemetothemoon closed 7 months ago

guidemetothemoon commented 1 year ago

Description

There is a need to track conferences that focus on environmental sustainability, have green tracks, or may be relevant in terms of speaking opportunities/outreach for the Environmental Sustainability TAG. We have had some initial discussions in #59, and this issue is an action that came out of those discussions.

The ideal scenario here would be to have an automated process where a tool/script scrapes sustainability-related sessions and conferences and adds them to the list that is published on the TAG ENV website: https://tag-env-sustainability.cncf.io/events

Regarding the structure of the Events page linked above: we can have a page that provides a list of all the events that are being added by a scraper (with potential human polishing), as well as keep having dedicated event pages that are kind of like blog posts, where we describe in more detail what kind of sessions and tracks related to sustainability are part of the event (like we already did for KubeCon + CloudNativeCon EU and Open Source Summit).

Regarding the technical implementation: there are possibilities for automating the scraping process here that we need to look into.

One possibility here was kindly shared by @Al-HusseinHameedJasim: a script that was used to compile a list of talks from KubeCon+CloudNativeCon, Open Source Summit, and their co-located events (79 events in total), where at least one of the following keywords was mentioned in the title: energy, sustainability, sustainable, green, carbon, or kepler. Initial discussion on Slack: https://cloud-native.slack.com/archives/C03F270PDU6/p1692026206597439 The script is available here: https://github.com/Al-HusseinHameedJasim/green-talks-scraper We will need to look into the script and see if we can integrate it and adjust it to our needs.
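To make the keyword filter described above concrete, here is a minimal sketch in Python. It is not taken from the linked script: the function names are hypothetical, and whole-word, case-insensitive matching is an assumption made here so that titles such as "Greenfield Migration" are not picked up by "green". The actual script may match differently.

```python
import re

# Keywords listed in the issue description.
KEYWORDS = ("energy", "sustainability", "sustainable", "green", "carbon", "kepler")

# Assumption: whole-word, case-insensitive matching.
KEYWORD_RE = re.compile(r"\b(" + "|".join(KEYWORDS) + r")\b", re.IGNORECASE)


def matching_keywords(title: str) -> list[str]:
    """Return the distinct keywords found in a title, lowercased,
    in order of first appearance."""
    found: list[str] = []
    for match in KEYWORD_RE.finditer(title):
        keyword = match.group(1).lower()
        if keyword not in found:
            found.append(keyword)
    return found


def filter_talks(titles: list[str]) -> list[str]:
    """Keep only the titles that mention at least one keyword."""
    return [title for title in titles if matching_keywords(title)]
```

Returning the matched keywords rather than a plain boolean would make it easier for a human reviewer to see why a talk was included during the "human polishing" step.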

We should also have an alternative, human-friendly way to suggest events that are not listed. Not everyone would be comfortable with or willing to learn the website structure and the PR process just to submit an event, so there should be a simpler way to submit this information. Such a submission form could, for example, be directly available on the website (in the same way as contact forms).

Input

Need input on how we can implement this functionality, as well as who would like to volunteer to do the implementation itself. I'm happy to assist on the technical and functional side of things as well ☺️

Outcome

An automated workflow that continuously aggregates a list of sustainability-related conferences, meetups, events, and sessions and adds them to the list on one of the pages in the Events section of the TAG ENV website: https://tag-env-sustainability.cncf.io/events
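As a rough illustration of the aggregation step in this outcome, the sketch below merges newly scraped events into an existing list and renders the result as a Markdown table. Everything in it is an assumption for illustration: the `Event` fields, de-duplication by URL, and the table layout are hypothetical and do not describe how the website or any existing scraper works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical event record; field names are illustrative only."""
    name: str
    date: str  # ISO 8601 date string, so plain string sorting is chronological
    url: str


def merge_events(existing: list[Event], scraped: list[Event]) -> list[Event]:
    """Combine two event lists, de-duplicating by URL and sorting by date.

    Entries already in the list win over scraped ones, so manual edits
    made by a human are not overwritten on the next run.
    """
    by_url = {event.url: event for event in existing}
    for event in scraped:
        by_url.setdefault(event.url, event)
    return sorted(by_url.values(), key=lambda event: (event.date, event.name))


def render_markdown(events: list[Event]) -> str:
    """Render the events as a two-column Markdown table."""
    lines = ["| Date | Event |", "| --- | --- |"]
    lines += [f"| {event.date} | [{event.name}]({event.url}) |" for event in events]
    return "\n".join(lines) + "\n"
```

A scheduled job could call `merge_events` with the current list and the latest scrape, then write the output of `render_markdown` to the page source and open a PR for review.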

Todos

wrkode commented 1 year ago

@guidemetothemoon, I've written this program as a proposal for this issue: https://github.com/wrkode/greenscraper. I'd like to create a PR, but I'm not sure where. I'm now working on different sources for events.

wrkode commented 1 year ago

Could I get this issue assigned? If not, could I be told how to get this issue assigned?

guidemetothemoon commented 1 year ago

This is fantastic, @wrkode, thank you so much for being willing to take the lead on this! I've assigned you to this issue.

I think that we would want to have some kind of a workflow in this repo that would do data scraping and update the list of the events so we could either:

  1. Add the tool to this repo (e.g. create a utils folder to place it in)
  2. Add it to the new repo that's been created for the Green Reviews WG tooling: https://github.com/cncf-tags/green-reviews-tooling. We're working out all the necessary permissions for the respective stakeholders at the moment, but it should be possible to open a PR against it. We would then need to make the tool available in this repo, for example via a GitHub Action.

@leonardpahlke wdyt? I lean towards the second option so that we gather all the great tooling in one place, and I think the greenscraper tool can still fall under the WG scope.

wrkode commented 1 year ago

Thank you, @guidemetothemoon. Given the multiple options, shall I continue my work on GreenScraper while a direction is decided? Shall I join the next TAG/WG call and discuss it there?

guidemetothemoon commented 1 year ago

@wrkode I think it would be great to bring this up in the next TAG meeting so that you can briefly introduce the tool and what it does, and we can align with the group on whether we can integrate it and whether any adjustments or clarifications are required. Can you please add this as an agenda topic for the next meeting, which takes place on Wednesday, 4th of October? Meeting notes: https://bit.ly/cncf-tag-env-meeting-notes

Thanks!

wrkode commented 1 year ago

@guidemetothemoon, added to the agenda for October 4th. I'll do some thinking until the meeting, so that I can come and propose something as well.

leonardpahlke commented 1 year ago

GreenScraper code donated and merged to https://github.com/cncf-tags/tag-env-tooling

guidemetothemoon commented 7 months ago

@wrkode apologies for this becoming a bit stale due to other priorities, but I would like to step in from the TAG lead side here as a project lead and assist, both technically and logistically, in getting this project completed. It's not a big project, so we will just need to get a workflow up and running that can use the tool to automatically scrape and populate events, and make any adjustments or improvements to the GreenScraper tool if needed.

It's getting quite hectic towards KubeCon+CloudNativeCon EU now, but what do you say if we start the work right after the conference, @wrkode? I can set up a sync call to get things going and set up a few tasks to get a full overview of what's needed.

@Al-HusseinHameedJasim would you like to be a part of this? I know that you have also done quite a bit of work on something similar, so I just wanted to include you here in case you would be interested in joining us on this project.

@leonardpahlke we don't need a project proposal for this since it's an ongoing effort, right? I can just add it as a project to our list in a separate PR.

wrkode commented 7 months ago

@guidemetothemoon, sounds good to me. What I believe is needed for GreenScraper is a workflow that launches the scrape and publishes the results on a page. Anyway, let me know how you want to proceed and/or when you want to talk about it.

Al-HusseinHameedJasim commented 7 months ago

@guidemetothemoon Yes, I'm interested in getting involved. Let's have the sync call to see how I can best contribute.

leonardpahlke commented 7 months ago

> @leonardpahlke we don't need a project proposal for this since it's an ongoing effort, right?

From my understanding, once it is set up there is just some maintenance work to do. @guidemetothemoon let's open a tracking issue for this, so we keep track of the effort :D https://github.com/cncf/tag-env-sustainability/issues/new?assignees=&labels=tracking&projects=&template=project-tracking.md&title=%5BPROJECT+TRACKING%5D+some+descriptive+title. Starting with this after KubeCon sounds good to me! 🙌

guidemetothemoon commented 7 months ago

Tracking issue opened: #345. Closing this issue since further progress will be tracked in the new issue.