Closed ulfmueller closed 2 months ago
I also just ran into this. Frustratingly, you do get an Excel file (so the retrieve rule succeeds), but the file contains the following content:
It looks like globalenergymonitor.org isn't so happy about lots of people downloading this file; I would either contact them about it, or, more likely, just mirror the file somewhere where the PyPSA project has control over it.
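Because the retrieve rule succeeds even when the server hands back a notice instead of the workbook, a cheap check right after the download could surface the problem earlier. The following is only a minimal sketch and not part of the current workflow. It assumes the expected file is an `.xlsx` workbook, which is a ZIP archive with a `[Content_Types].xml` entry at its root. The helper names `is_probably_xlsx` and `check_download` are hypothetical.

```python
import zipfile
from pathlib import Path


def is_probably_xlsx(path):
    """Return True if `path` looks like a real .xlsx workbook."""
    path = Path(path)
    # An .xlsx workbook is a ZIP archive; an access notice is not.
    if not path.is_file() or not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        # Office Open XML packages carry this manifest at the archive root.
        return "[Content_Types].xml" in archive.namelist()


def check_download(path):
    """Hypothetical guard: raise if the retrieved file is not a workbook."""
    if not is_probably_xlsx(path):
        raise ValueError(
            f"{path} is not a valid .xlsx file; the download was probably blocked"
        )
    return path
```

Calling `check_download(...)` at the end of a retrieve step would make that step fail, rather than the later `pd.read_excel` call in `build_industrial_distribution_key`.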
xref from #1265
The dataset links have cookie-based anti-bot protection against automated downloads, which went unnoticed when adding them to the workflow.
I have reached out to Global Energy Monitor to see if they would be willing to offer an official Zenodo repository (or similar).
TUBcloud is used as an interim solution.
If we do not get an official Zenodo repository, the CC-BY 4.0 license permits redistribution via a Zenodo mirror, provided attribution is given. This would be a long-term solution, but we would have to update the mirror to the latest GEM datasets from time to time.
Checklist
master branch
pypsa-eur environment
Describe the Bug
The file Global-Steel-Plant-Tracker-April-2024-Standard-Copy-V1.xlsx is not downloaded properly from globalenergymonitor.org (it is basically an empty file containing a notice that access is restricted). The problem might be related to #1125 and #1233.
Error Message
ERROR:root:Uncaught exception
Traceback (most recent call last):
  File "/home/ulf/github/pypsa-eur/.snakemake/scripts/tmpmv6n0gxa.build_industrial_distribution_key.py", line 406, in <module>
    gem = prepare_gem_database(regions)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ulf/github/pypsa-eur/.snakemake/scripts/tmpmv6n0gxa.build_industrial_distribution_key.py", line 144, in prepare_gem_database
    df = pd.read_excel(
         ^^^^^^^^^^^^^^
  File "/home/ulf/Downloads/yes/envs/pypsa-eur/lib/python3.12/site-packages/pandas/io/excel/_base.py", line 495, in read_excel
    io = ExcelFile(
         ^^^^^^^^^^
  File "/home/ulf/Downloads/yes/envs/pypsa-eur/lib/python3.12/site-packages/pandas/io/excel/_base.py", line 1554, in __init__
    raise ValueError(
ValueError: Excel file format cannot be determined, you must specify an engine manually.
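The `ValueError` suggests that pandas could not recognise the file's content as any Excel format, which fits a download that carries an `.xlsx` name but holds a notice. To confirm what was actually retrieved, one can look at the leading bytes of the file. The sketch below is for diagnosis only. The function name `sniff_download` and its labels are hypothetical, and it assumes the notice is served as HTML or plain text.

```python
# Leading-byte signatures of the two common Excel containers.
ZIP_MAGIC = b"PK\x03\x04"                         # ZIP-based formats such as .xlsx
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # legacy .xls (OLE2 compound file)


def sniff_download(path, peek=512):
    """Classify a file by its first bytes, ignoring its extension."""
    with open(path, "rb") as fh:
        head = fh.read(peek)
    if head.startswith(ZIP_MAGIC):
        return "zip"
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    # Web pages usually start with a doctype or an <html> tag.
    text = head.lstrip().lower()
    if text.startswith((b"<!doctype html", b"<html")):
        return "html"
    return "unknown"
```

A result of `"html"` or `"unknown"` for the steel plant tracker file would indicate that the download was blocked, not that an Excel engine is missing from the environment.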