PyPSA / powerplantmatching

Set of tools to combine multiple power plant databases
https://powerplantmatching.readthedocs.io/en/latest/
GNU General Public License v3.0
157 stars 54 forks source link

Adds all latest GEM datasets. #110

Closed Tomkourou closed 1 year ago

Tomkourou commented 1 year ago

Closes #80 and closes #81.

Change proposed in this Pull Request

Adds all the latest datasets from GEM. Including Geothermal, Wind, Solar, Gas & Coal.

Type of change

Checklist

Tomkourou commented 1 year ago

Whilst updating GCPT I noticed that the contributing.rst has wrong information. It suggests running the following function as the last step:

pm.powerplants(update_all=True)

However this now does nothing as update_all has been deprecated. As seen below.

WARNING:powerplantmatching.collection:The following arguments were deprecated and are being ignored: {'update_all'}

@FabianHofmann what do you think about this? Would be good to update the contributing.rst doc.

FabianHofmann commented 1 year ago

Hey @Tomkourou, indeed we should fix this. It should be update = True. Would you be so kind and make a PR?

FabianHofmann commented 1 year ago

And thanks btw. Let me know when this is ready to review :)

Tomkourou commented 1 year ago

And thanks btw. Let me know when this is ready to review :)

Ready to go :)

Tomkourou commented 1 year ago

Really cool! :) Before delving into code optimization some comments in this review.

Additional to the comments, do you think it might make sense to report how the powerplant capacity per carrier changed? Like comparing pre-contribution capacity per carrier vs after contribution.

Here is the analysis with the standard European target_country list: new dataset: ENTSOE, GEO, GPD, JRC, OPSD, BEYONDCOAL, WIKIPEDIA, GGPT, GCPT, GCPT, GGtPT, GSPT, GWPT

original dataset: ENTSOE, GEO, GPD, JRC, OPSD, BEYONDCOAL, WIKIPEDIA, GGPT

Technology new dataset original dataset Percentage difference
Bioenergy 10841.6834 6067.6834 78.7%
Geothermal 806 766 5.2 %
Hard Coal 183210.371796 116945.593026 56.7%
Hydro 214002.872826 214054.872826 0.0%
Lignite 77853.297197 56778.921 37.1%
Natural Gas 219722.632261 230497.597932 -4.7%
Nuclear 138763.1 138391.798212 0.3%
Oil 18454.470346 20862.942634 -11.5%
Other 3728.809314 3695.380957 0.9%
Solar 25663.803 3778.903 579.1%
Waste 2950.606334 2950.606334 0.0%
Wind 208877.81 24401.58 756.0%
image

(updated with Nuclear and Biomass)

pz-max commented 1 year ago

Hi @Tomkourou, The code looks good. I just added nuclear and bioenergy here: https://github.com/Tomkourou/powerplantmatching/pull/1. (Please merge the PR if you think it is ok, so it will appear here too). As minor detail, I changed GGtPT to GGTPT for naming consistency. Most of our data names are by default in capital letter. Any dev will think here its a bug even though its named like GGtPT officially by GEM.

@Tomkourou , could you update your above plot with the new nuclear and bioenergy data?

@FabianHofmann, in case you start reviewing, one minor remaining question. Given a config.yaml entry like:

GSPT:
  net_capacity: false
  reliability_score: 4
  fn: Global-Solar-Power-Tracker-January-2023.csv
  url: https://raw.githubusercontent.com/pz-max/gem-powerplant-data/main/Global-Solar-Power-Tracker-January-2023.csv

Should we add something like aggregated_units: true? (some powerplants like nuclear have the same name for different powerplant units. Does that lead to conflicts? Will aggregated_units: true help?). Probably we could test that @Tomkourou :+1:

Tomkourou commented 1 year ago

Hi @Tomkourou, The code looks good. I just added nuclear and bioenergy here: https://github.com/Tomkourou/powerplantmatching/pull/1. (Please merge the PR if you think it is ok, so it will appear here too). As minor detail, I changed GGtPT to GGTPT for naming consistency. Most of our data names are by default in capital letter. Any dev will think here its a bug even though its named like GGtPT officially by GEM.

@Tomkourou , could you update your above plot with the new nuclear and bioenergy data?

@FabianHofmann, in case you start reviewing, one minor remaining question. Given a config.yaml entry like:

GSPT:
  net_capacity: false
  reliability_score: 4
  fn: Global-Solar-Power-Tracker-January-2023.csv
  url: https://raw.githubusercontent.com/pz-max/gem-powerplant-data/main/Global-Solar-Power-Tracker-January-2023.csv

Should we add something like aggregated_units: true? (some powerplants like nuclear have the same name for different powerplant units. Does that lead to conflicts? Will aggregated_units: true help?). Probably we could test that @Tomkourou :+1:

That's a good question. Until now I didn't have any problems with this but that was because I was using the GEM unit ids or phase ids for renewables instead of project ids. If you use projectids it will probably throw an error since there will be duplicates.

pz-max commented 1 year ago

@Tomkourou Will be reviewed on Monday :+1:

FabianHofmann commented 1 year ago

thanks @Tomkourou! looks all very nice!

FabianHofmann commented 1 year ago

@Tomkourou one question: You added that the GEM datasets are not reporting net capacity. Do you remember where you got this information from? I cannot find it on the GEM page. Since PPM should report net capacity, this might lead to some issues if we do not rescale the capacities. Thanks!

Tomkourou commented 1 year ago

@FabianHofmann it's a good question. My understanding was that since most GEM datasets include multiple units for powerplants then they needed to be summed up in the process and that's why I categorised them as not reporting net capacity.

Perhaps I misunderstood the meaning of net capacity though and it needs to be recategorised?

FabianHofmann commented 1 year ago

@Tomkourou thanks for your answer! Actually, the correct tag for that would be aggregated_units. But this is no problem at all, since the data is effectively treated as net capacity in the code, which then correct and makes things easier. I will just remove the net_capacity tags from the config.