pypsa-meets-earth / pypsa-earth

PyPSA-Earth: A flexible Python-based open optimisation model to study energy system futures around the world.
https://pypsa-earth.readthedocs.io/en/latest/

Test all continents #795

Open davide-f opened 1 year ago

davide-f commented 1 year ago

Describe the feature you'd like to see

PyPSA-Earth should be tested to confirm that it runs for all continents. This means (a) having the cutouts for all continents and (b) locally testing the execution of the default configuration for each continent.

Anyone interested in addressing a continent, feel free to drop a message and book the continent :D

Needed cutouts:

Regions tested

davide-f commented 1 year ago

Some countries span the ±180° meridian (e.g. RU, US, NZ), which leads the cutout generation to download slices of the whole world. That means the prepared cutout regions may be designed so that together they cover the entire world while each one tries to stay "small". For example, the NorthAmerica databundle contains Russia and Europe. Anyone needing Russia has to download a slice spanning ±180° longitude anyway, so it may make sense to use the same cutout. On the other hand, for Asia, including all countries together with Russia would mean downloading nearly the entire Earth dataset. So, for Asia, it may make sense to create one or two cutouts, one of them maybe for Asia without Russia (unfortunately...). Those willing to simulate the entire Asia may use the Earth cutout. Other proposals are welcome, and we can accommodate more cutouts as well.
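
To illustrate why a territory crossing the ±180° meridian inflates a cutout, here is a minimal sketch in plain Python. It is not taken from the PyPSA-Earth code; the function names and the sample longitudes are hypothetical. It compares the plain min/max longitude range with the smallest arc that is allowed to wrap across the antimeridian.

```python
def naive_lon_span(lons):
    """Width in degrees of the plain min/max longitude range."""
    return max(lons) - min(lons)


def minimal_lon_span(lons):
    """Width in degrees of the smallest arc containing all longitudes,
    allowing the arc to wrap across the +-180 degree meridian."""
    pts = sorted(lon % 360.0 for lon in lons)
    # Gaps between consecutive points, plus the wrap-around gap.
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(pts[0] + 360.0 - pts[-1])
    # The smallest covering arc is the full circle minus the largest gap.
    return 360.0 - max(gaps)


# Hypothetical sample longitudes for a territory straddling the antimeridian
sample = [170.0, 175.0, 179.5, -179.0, -172.0]
print(naive_lon_span(sample))    # 358.5
print(minimal_lon_span(sample))  # 18.0
```

A bounding box built from the plain min/max values spans almost all longitudes, even though the territory itself occupies only a narrow arc.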

ekatef commented 1 year ago

Sounds like a great approach. As a suggestion, would it make sense to define the Asia cutout as SouthAsia and centre it around India?

shanks847 commented 1 year ago

I'm tackling validation in the Hydro Project and was recommended to start with creating a cutout for Oceania.

ekatef commented 1 year ago

@shanks847 you mean validation for New Zealand, right?

shanks847 commented 1 year ago

Yes, my apologies. I'm working on the NZ validation and wanted to embark on the creation of the Oceania cutout.

davide-f commented 1 year ago

That's awesome! :D You are welcome :)

The procedure to create the cutout is available here. Please use countries: ["Oceania"] in the config, keeping the rest of the default configuration.

Once the cutout is ready, you can use the script https://github.com/pypsa-meets-earth/pypsa-earth/blob/main/scripts/non_workflow/zip_folder.py, using the commented line zipFilesInDir("./cutouts", "cutouts.zip", lambda x: True, include_parent=False).
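
For readers without the repository at hand, a rough standard-library equivalent of the zipping step might look as follows. This is only a sketch and not the repository's zipFilesInDir; the helper name zip_directory is hypothetical, and storing paths relative to the folder is an assumption about what include_parent=False is meant to do.

```python
import os
import zipfile


def zip_directory(src_dir, zip_path):
    """Zip every file under src_dir, storing paths relative to src_dir."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full = os.path.join(root, name)
                # Assumption: archive names exclude the parent folder itself.
                zf.write(full, arcname=os.path.relpath(full, src_dir))
    return zip_path


# Only runs when a local "cutouts" folder is present.
if os.path.isdir("cutouts"):
    zip_directory("cutouts", "cutouts.zip")
```

The archive should be written outside the folder being zipped, otherwise the walk may pick up the partially written zip file.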

The final zip file can be uploaded to Google Drive and shared with us; we will take care of uploading it and provide the link for the PR :)

davide-f commented 1 year ago

@ekatef As discussed during the monthly meeting, it makes sense to create an "Asia" cutout similar to SouthAsia, excluding Russia, because otherwise the cutout becomes too big. Some tests may be needed to understand how many countries to include. We may have one cutout for Central and Western Asia and another for the rest of Asia.

ekatef commented 1 year ago

@davide-f I think that is a great approach :)

Some disk space estimations for Asian regions (non-zipped nc files)

ekatef commented 1 year ago

I have tried to build a cutout for Western+Central+South Asia, and the result is surprisingly tiny: just about 12 GB. Looking at the map, I'd expect about twice the size of the old "Silk Road" cutout, but it is only about 10% larger.

The list of countries was obtained by merging the definitions of the "CASR", "MEAR", "WAS" and "SASR" regions from config_osm. Looking into the cutout structure with atlite.Cutout() gives a reasonable extent and time range:

 x = 24.90 ⟷ 101.70, dx = 0.30
 y = 2.10 ⟷ 55.80, dy = 0.30
 time = 2013-01-01 ⟷ 2013-12-31, dt = H

A testing run for "JO" has been successful.

@davide-f do you have other ideas on how the correctness of the cutout calculations can be tested?
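
One possible sanity check, sketched below, is to derive the grid dimensions from the reported extent and estimate the raw size of a single variable. The float32 storage and the absence of compression are assumptions, and the number of variables in the cutout is not stated here, so the sketch stops at the per-variable figure.

```python
def n_points(start, stop, step):
    """Number of grid points from start to stop, both inclusive."""
    return round((stop - start) / step) + 1


nx = n_points(24.90, 101.70, 0.30)  # longitude points
ny = n_points(2.10, 55.80, 0.30)    # latitude points
nt = 365 * 24                       # hourly steps in 2013 (not a leap year)

bytes_per_value = 4  # assumption: values stored as float32, uncompressed
per_variable_gb = nx * ny * nt * bytes_per_value / 1e9

print(nx, ny, nt)                  # 257 180 8760
print(round(per_variable_gb, 2))   # 1.62
```

Under these assumptions, the reported 12 GB would correspond to roughly seven such arrays; whether that matches the actual file depends on the variables stored and on any compression applied.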

davide-f commented 1 year ago

That's cool! Could you make a plot of the total region covered? Since you have the continent shape, just its plot should be ok.

Could you try adding FEAR (Far Eastern Asian region)? I'm wondering if that becomes too large.

ekatef commented 1 year ago

That is the plot for countries of the region:

image

The list of country codes is ["KZ", "KG", "UZ", "TM", "TJ", "MM", "BD", "BT", "NP", "IN", "LK", "PK", "AF", "TR", "AM", "AZ", "BH", "CY", "GE", "IR", "IQ", "IL", "JO", "KW", "LB", "OM", "PS", "QA", "SA", "SY", "AE", "YE"]. Agree that it makes sense to try including Eastern Asia in the cutout as well. Probably, we can also think about the continental ASEAN countries.

ekatef commented 1 year ago

When adding the Far East region and ASEAN (except Indonesia), the cutout size is about 20 GB, and the region looks as follows:

image

My feeling is that it might be a configuration close to optimal... @davide-f what is your feeling about that?

Along the way I have experienced how easy it is to erase a cutout, losing the results of quite some hours of computation... 😭 I'm afraid that assuming an invariant name for a cutout will not make things easier in this regard. Would it be a good idea to stop the generation of a cutout if it implies overwriting an existing one?

davide-f commented 1 year ago

Sad to hear that! :( I understand. If you run the rule using snakemake, adding a check in the code is unfortunately most likely useless: snakemake will delete the output file before executing the rule itself... :(

This cutout is really amazing! Please keep it aside, as it can be considered almost final.

Since you are there, would you kill me if I proposed to include Indonesia (South-East Asia) as well? It seems to be the only missing region, and with that we would basically have Asia covered.

ekatef commented 1 year ago

@davide-f I'll be very grateful to you for the suggestion to add Indonesia, as it'll mean re-running a cutout for a purpose! 😉

Regarding modification of the workflow: of course! You are completely right that the fundamental reason is Snakemake's behaviour. Actually, atlite has a safety check which prevents overwriting an existing cutout. But probably it's worth considering some workaround, even at the price of making the Snakemake workflow a bit more complicated?

davide-f commented 1 year ago

You are right... I thought it was included; the idea may be to include South-East Asia. Anyway, we could zip the files and provide them. We are close to mapping the world :D

Mmm, we could add an if condition before build_cutout and enable that only when the cutout is not available. However, since it has a wildcard, that's definitely not so obvious.

We may discuss that, but I'm not sure the advantages are worth complicating things. Would you like to open an issue about that?
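
As a rough illustration of the "only when the cutout is not available" idea, a guard in plain Python could look like the sketch below. The function names are hypothetical and this is not the actual build_cutout code. Since snakemake removes the output before the rule runs, such a check would have to be evaluated before the rule is triggered (or outside the workflow) to be of any use.

```python
import os


def should_build_cutout(path, overwrite=False):
    """Return True when the cutout must be (re)built.

    An existing file is kept unless overwrite is explicitly requested.
    """
    return overwrite or not os.path.exists(path)


def build_cutout_safely(path, builder, overwrite=False):
    """Call builder(path) only if that does not clobber an existing file."""
    if not should_build_cutout(path, overwrite):
        print(f"Skipping: {path} already exists (pass overwrite=True to rebuild)")
        return False
    builder(path)
    return True
```

Here builder stands for any callable that writes the cutout to the given path; requiring an explicit overwrite=True acts as the confirmation step for the destructive action.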

ekatef commented 1 year ago

Ok for including Indonesia and re-running cutout generation 🙂

Yeah, after some playing around with the build_cutout rule, I can absolutely confirm that a workaround is needed to tackle this. You are totally right that the Snakemake approach makes this quite a challenge. I have tried to use a condition to check whether the file exists. The condition itself works perfectly, but I don't know yet how to tell a rule/shell to "do nothing", as the Snakemake workflow first removes the output file, and only after that gets interested in what had to be done with it.

Absolutely agree that it doesn't seem a good idea to over-complicate things, but I'm afraid we have to find a solution. Currently we seem to give a perfect example of the "Lack of confirmation for destructive actions" design anti-pattern 🙃

Have added an issue #818 on that. Thanks a lot for the support! ;)

ekatef commented 1 year ago

Hello @davide-f! A cutout for the whole of Asia (except Russia) is 26.4 GB:

image

davide-f commented 1 year ago

Sounds perfect to me :D Would you like to open a PR?

ekatef commented 1 year ago

Absolutely happy to do so! :D Would you mind sharing some technical details on where and how the prepared cutout can be uploaded?

I see that we store cutouts on gdrive, but the storage available to me there is less than what we require for the cutouts :) Would it work to open write access to one of the folders on the pypsa-earth gdrive?

ekatef commented 8 months ago

We also need a "smaller" cutout for Northern America to comply with the Zenodo size limitations. The suggested coordinate box includes Canada, Greenland, Central America and the United States, except some of the Aleutian Islands (an archipelago near Alaska), to avoid crossing the 180° meridian.

Thanks @davide-f for updating the list!