wri / global-power-plant-database

A comprehensive, global, open source database of power plants
315 stars 96 forks

Is this dead? #29

Open andyfurniss4 opened 3 years ago

andyfurniss4 commented 3 years ago

Is this database still being maintained and updated? It looks like there haven't been any changes to this repo for over a year now and the last data release was over two years ago.

andyfurniss4 commented 3 years ago

Sorry, incorrectly labelled as a bug.

loganbyers commented 3 years ago

Hello -

Short answer is "kind of"? This started as a research project and had goals to be something that was very dynamic and always being updated. As you have found, there aren't really many updates being performed, and this is mostly because of the limited time I have to do the desk research, maintain information worldwide, and incorporate small facts into the knowledge system.

The project has been primarily supported by grants with some cases of people sharing sources or updates. At the end of May we will be totally out of funding for anything related to this work (we spend time and energy on things that aren't just data updates). We will be putting out another data update at that time, but it's going to be fairly inadequate given all the changes to the energy system that are happening every day. There are some irons in the fire, but the earliest we could commit resources through our organization will probably be at the end of the year or early 2022. Any progress or work starting in June will be me volunteering or working in a hobby capacity. I'm not saying I won't be doing that, but it's likely to be even more sporadic.

Unfortunately the architecture that was developed early in the project has proven to be a bad choice and is really limiting the ability to get information into the database. The choice to not use a relational database or some managed database has resulted in some major fragility and loads of technical debt. Any change or update can actually be quite burdensome and is usually just some hacky patch to keep the status quo but add on some new misshapen subprocess. In many cases (for the 'automatic data sources') the way the unique plant IDs were defined permits them to change at the whim of the underlying dataset. This was fine during the initial development of the database when everything was in flux, but it's now a nagging burden to ensure that the plants keep the same ID over time.

There will be another update coming this month (end of May 2021), which will likely be the last for the foreseeable future. At that time we will update the readme with the status of the project. I plan on writing and sharing some sort of postmortem or lessons learned document - that's unlikely to be ready by end of May, but maybe June or July. I still completely believe in the mission and goals of this work, the database is widely used and appreciated, but our transition from essentially a prototype to something that was supposed to be production ready failed. We didn't make the jump successfully, we've just continued with the decisions from the prototype and are suffering the consequences.
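To illustrate the plant ID problem described above, here is a minimal sketch of one common way to keep IDs stable: a small relational registry that assigns an ID once per source record and returns the same ID on every later lookup. This is not the project's actual method. The table name `plant_ids`, the functions `open_registry` and `stable_id`, and the `PLANT` prefix are all hypothetical, and the sketch assumes each upstream source exposes some reasonably persistent key of its own.

```python
import sqlite3


def open_registry(path=":memory:"):
    """Open (or create) a hypothetical ID registry backed by SQLite."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS plant_ids (
            plant_id   INTEGER PRIMARY KEY AUTOINCREMENT,
            source     TEXT NOT NULL,
            source_key TEXT NOT NULL,
            UNIQUE (source, source_key)
        )
        """
    )
    return conn


def stable_id(conn, source, source_key):
    """Return the ID for (source, source_key), assigning one only if new."""
    # The UNIQUE constraint makes this a no-op when the record is already known,
    # so an existing ID is never reassigned when the upstream data is reloaded.
    conn.execute(
        "INSERT OR IGNORE INTO plant_ids (source, source_key) VALUES (?, ?)",
        (source, source_key),
    )
    conn.commit()
    row = conn.execute(
        "SELECT plant_id FROM plant_ids WHERE source = ? AND source_key = ?",
        (source, source_key),
    ).fetchone()
    # "PLANT" is an illustrative prefix, not the database's real ID format.
    return f"PLANT{row[0]:07d}"
```

Because the ID is looked up rather than recomputed from row order or position, reordering or extending the upstream dataset would not change IDs that were already issued. It does not help if the source changes its own keys, which would still need manual reconciliation.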

andyfurniss4 commented 3 years ago

Hi Logan,

Firstly, thank you very much for taking the time to write such a detailed response - I very much appreciate it.

Whilst it is a shame to hear that the project is effectively coming to a close (for now at least), I do understand the position you're in. I can also understand that the amount of work to maintain such a database must be huge with all the different sources, formats and languages involved. I can see how not using a relational database for this kind of project may have become a major problem as more and more data and sources are introduced. Perhaps that will need a rework with any future work that happens on the project. I will keep my fingers crossed for future funding/dedicated resource on this.

I do think that this is an incredibly valuable project as I'm not aware of such an extensive, centralised source of this data. You've done a great job to get to the stage you have with it, even if you may have done things differently if you were given the opportunity. With the world in the state it's in, I don't think it's possible to overstate the importance of being able to take a global view of how well we are/aren't doing in terms of our sustainability. Am I right in assuming that there isn't anything comparable to this project that you might recommend as an alternative for now?

I will keep my eyes open for the update at the end of May and the postmortem details, and if you need volunteers in future then I'd be keen to help out in any way I can. I am primarily a .NET developer but I have a bit of experience with Python and I've used various relational database technologies if you decided to go down that route in a future iteration.

Thanks again.

comready commented 3 years ago

Hello Logan,

I am glad to hear you intend to publish the lessons learned. As well as the decisions which turned out to be wrong, there is obviously a lot which went right. Hopefully the document and your achievements here will be useful to other projects such as Wikidata climate change and Climate Trace. Thanks for all your hard work so far and good luck.

matteodefelice commented 3 years ago

Will you update this repository with the version 1.3?

loganbyers commented 3 years ago

Hi Matteo - yes this will be updated with version 1.3 in the next few days. The challenge has been documenting the processes and flow between this repo and our separate generation estimation repo.

The flow of information is kind of wonky and very stateful. This repo builds the core set of information, then it gets passed over to another set of scripts/models to either estimate plant generation or pull from known values (since estimating some types of plants can be very slow). Then the core "observations" and estimated generation are merged, which constitutes the "final" database. This database then needs to be copied back into this repository. None of this is really automated and it partially breaks some of the internal replicability that we have had to date...
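The merge step described above can be sketched as follows. This is an illustration only, not the project's code: the function `merge_generation` and the field names `plant_id`, `generation_gwh` and `generation_source` are hypothetical. It assumes the core records are a list of dicts and that reported and estimated generation are dicts keyed by plant ID, with reported values preferred over estimates.

```python
def merge_generation(core, reported, estimated):
    """Combine core plant records with generation values into final records.

    core:      list of dicts, each with a "plant_id" key (hypothetical schema)
    reported:  dict mapping plant_id -> known generation value
    estimated: dict mapping plant_id -> modelled generation value
    """
    final = []
    for plant in core:
        record = dict(plant)  # copy so the core records are left untouched
        pid = record["plant_id"]
        if pid in reported:
            # Prefer a known value over running or trusting an estimate.
            record["generation_gwh"] = reported[pid]
            record["generation_source"] = "reported"
        elif pid in estimated:
            record["generation_gwh"] = estimated[pid]
            record["generation_source"] = "estimated"
        else:
            record["generation_gwh"] = None
            record["generation_source"] = None
        final.append(record)
    return final
```

Expressing the merge as one pure function of its three inputs is one way to make this stage repeatable, since the same inputs always produce the same final table regardless of what was run before.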

Jfriedrich commented 3 years ago

Yes, soon!

Johannes Friedrich, Senior Associate and Manager of Climate Watch, World Resources Institute

MichaelTiemannOSC commented 2 years ago

So...version 1.3 was indeed released: https://wri-dataportal-prod.s3.amazonaws.com/manual/global_power_plant_database_v_1_3.zip

Context: https://datasets.wri.org/dataset/globalpowerplantdatabase

However, neither the readme (which lists 1.2.0 as the latest) nor the releases page (which lists 1.1.0 as the latest) mentions 1.3.0. So...the good news is that there is a ton of great new data in the June 2021 release. It's just invisible if you've been following GitHub.