pypsa-meets-earth / pypsa-earth

PyPSA-Earth: A flexible Python-based open optimisation model to study energy system futures around the world.
https://pypsa-earth.readthedocs.io/en/latest/

Error downloading bundle_cutouts_asia #1078

Closed choiHenry closed 3 months ago

choiHenry commented 3 months ago

Describe the Bug

I got the error below when using the default configuration file with a single change, countries: ["KR"], and running:

snakemake -j 1 solve_all_networks

I know there have been similar issues, such as #911, but I don't know how to solve this problem.

I am running on an ARM (M1) Mac, but I don't think the problem is machine-specific, since I got the same error message on a Linux machine with the same config file.

Error Message

Error in downloading bundle bundle_cutouts_asia - host gdrive
WARNING:__main__:Error in downloading bundle bundle_cutouts_asia - host gdrive
Bundle bundle_cutouts_asia cannot be downloaded
ERROR:__main__:Bundle bundle_cutouts_asia cannot be downloaded
Merging regional hydrobasins files into a global shapefile
INFO:__main__:Merging regional hydrobasins files into a global shapefile
Merging hydrobasins files into: data/hydrobasins/hybas_world.shp
INFO:__main__:Merging hydrobasins files into: data/hydrobasins/hybas_world.shp
9it [00:08,  1.12it/s]
Bundle successfully loaded and unzipped:
    bundle_data_earth
    bundle_landcover_earth
    bundle_natura_earth
    bundle_hydrobasins
INFO:__main__:Bundle successfully loaded and unzipped:
    bundle_data_earth
    bundle_landcover_earth
    bundle_natura_earth
    bundle_hydrobasins
The following bundles could not be downloaded:
    bundle_cutouts_asia
WARNING:__main__:The following bundles could not be downloaded:
    bundle_cutouts_asia
Waiting at most 5 seconds for missing files.
MissingOutputException in rule retrieve_databundle_light in file /Users/choidamian/PyPSA/pypsa-earth/Snakefile, line 154:
Job 9  completed successfully, but some output files are missing. Missing files after 5 seconds. This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait:
cutouts/cutout-2013-era5.nc
Removing output files of failed job retrieve_databundle_light since they might be corrupted:
data/ssp2-2.6/2030/era5_2013/Europe.nc, data/hydrobasins/hybas_world.shp, data/gebco/GEBCO_2021_TID.nc, data/eez/eez_v11.gpkg, data/ssp2-2.6/2030/era5_2013/Africa.nc, data/copernicus/PROBAV_LC100_global_v3.0.1_2019-nrt_Discrete-Classification-map_EPSG-4326.tif, data/natura/natura.tiff, data/ssp2-2.6/2030/era5_2013/SouthAmerica.nc, data/ssp2-2.6/2030/era5_2013/Oceania.nc, data/ssp2-2.6/2030/era5_2013/Asia.nc, data/ssp2-2.6/2030/era5_2013/NorthAmerica.nc, data/landcover
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-08-09T195442.359866.snakemake.log

Thanks in advance.

choiHenry commented 3 months ago

I think the cause is that the bundle_cutouts_asia link, "https://drive.google.com/file/d/11-Ax9tVks7oPjrZwG5v3C0x4OmMT_Pv8/view?usp=sharing", does not point directly to the file but to a page with the message "Google Drive can't scan this file for viruses". As a temporary workaround, I think I can download the file manually, unzip it, save it to /foo/bar/pypsa-earth/cutouts/cutout-2013-era5.nc (am I correct?), and then run snakemake -j 1 solve_all_networks. A more fundamental fix could be to host the file in an independent repository such as Zenodo.
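To tell whether a download actually delivered the bundle or only the Google Drive warning page, a quick check on the saved file can help. The following is a minimal sketch, not part of pypsa-earth: the function names and the 512-byte probe size are assumptions made for illustration, and the HTML test is only a heuristic.

```python
import zipfile
from pathlib import Path


def looks_like_zip_bundle(path):
    """Return True if `path` exists and is a readable zip archive."""
    path = Path(path)
    return path.is_file() and zipfile.is_zipfile(path)


def looks_like_html_page(path, probe_bytes=512):
    """Heuristic: return True if the file starts like an HTML document.

    A warning page saved in place of the bundle would typically begin
    with an HTML doctype or an <html> tag rather than zip data.
    """
    with open(path, "rb") as fh:
        head = fh.read(probe_bytes).lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")
```

If looks_like_html_page returns True for the downloaded file, the link most likely returned an interstitial page rather than the archive itself.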

davide-f commented 3 months ago

Hello :D Thanks for posting! The fork of the Google Drive downloader already handles that, but gdrive limits the number of downloads per day to very few. If you launch the rule a few times, the limit is likely to be reached.

A bypass for that is to:

  1. Run the script without snakemake, using VS Code or plain Python: run conda activate pypsa-earth, then python retrieve_databundle_light.py
  2. Download the cutout manually and unzip it into the cutouts folder
  3. Disable the rule retrieve_databundle in the config file by setting enable->retrieve...: False

This should fix your issue.
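Step 2 can also be scripted. The sketch below is only an illustration under stated assumptions: the helper name unzip_cutout is hypothetical, the archive is assumed to be a zip file with the cutout at its top level, and the expected file name cutout-2013-era5.nc is taken from the missing output reported in the error message above.

```python
import zipfile
from pathlib import Path


def unzip_cutout(archive, cutouts_dir="cutouts", expected="cutout-2013-era5.nc"):
    """Unzip a manually downloaded bundle and check the expected cutout exists.

    `archive` is the path of the downloaded zip file (hypothetical name,
    wherever the browser saved it). Returns the path of the cutout file.
    """
    cutouts_dir = Path(cutouts_dir)
    cutouts_dir.mkdir(parents=True, exist_ok=True)

    # Extract everything into the cutouts folder.
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(cutouts_dir)

    # Assumption: the cutout sits at the top level of the archive.
    target = cutouts_dir / expected
    if not target.is_file():
        raise FileNotFoundError(f"{target} not found after unzipping {archive}")
    return target
```

Called from the pypsa-earth root folder, for example as unzip_cutout("/path/to/downloaded_bundle.zip"), it should leave the file where the workflow expects it. If the archive has a different internal layout, the check raises an error and the file has to be moved by hand.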

choiHenry commented 3 months ago

Thank you very much for your kind response Professor Davide,

I followed your suggestions and am running the model right now :)

ekatef commented 3 months ago

Hello @choiHenry @davide-f, thank you so much for working on that. Unfortunately, we are not yet able to eliminate the problem completely, but you have created perfect guidance on how to deal with it ❤️

choiHenry commented 3 months ago

I ran the script again at around 10-11 am Korean time (GMT+9) and was finally able to retrieve the cutouts bundle. So the problem really is the gdrive download limit.

conda activate pypsa-earth
cd /path/to/pypsa-earth/scripts
python retrieve_databundle_light.py

Thank you @ekatef @davide-f.