pypsa-meets-earth / pypsa-earth

PyPSA-Earth: A flexible Python-based open optimisation model to study energy system futures around the world.
https://pypsa-earth.readthedocs.io/en/latest/

Problem with build_powerplants #358

Closed hazemakhalek closed 2 years ago

hazemakhalek commented 2 years ago


Describe the Bug

The issue appears even with a freshly cloned repo and a new environment.

Error Message


<rule build_powerplants:
    input: networks/base.nc, configs/powerplantmatching_config.yaml, data/custom_powerplants.csv, data/clean/africa_all_generators.csv
    output: resources/powerplants.csv, resources/powerplants_osm2pm.csv
    log: logs/build_powerplants.log
    jobid: 16
    reason: Missing output files: resources/powerplants.csv
    resources: tmpdir=/tmp, mem=500
INFO:snakemake.logging:rule build_powerplants:
    input: networks/base.nc, configs/powerplantmatching_config.yaml, data/custom_powerplants.csv, data/clean/africa_all_generators.csv
    output: resources/powerplants.csv, resources/powerplants_osm2pm.csv
    log: logs/build_powerplants.log
    jobid: 16
    reason: Missing output files: resources/powerplants.csv
    resources: tmpdir=/tmp, mem=500

INFO:snakemake.logging:
INFO:pypsa.io:Imported network base.nc has buses, lines
INFO:powerplantmatching.collection:Create combined dataset for GEO, GPD
Traceback (most recent call last):
  File "/home/user/PyPSA_models/pypsa-africa/.snakemake/scripts/tmpnwsb7y6r.build_powerplants.py", line 251, in <module>
    pm.powerplants(from_url=False, update=True, config=config)
  File "/home/user/anaconda3/envs/pypsa-africa/lib/python3.10/site-packages/powerplantmatching/collection.py", line 223, in matched_data
    matched = collect(matching_sources, config=config, **collection_kwargs)
  File "/home/user/anaconda3/envs/pypsa-africa/lib/python3.10/site-packages/powerplantmatching/collection.py", line 96, in collect
    dfs = parmap(df_by_name, datasets)
  File "/home/user/anaconda3/envs/pypsa-africa/lib/python3.10/site-packages/powerplantmatching/utils.py", line 378, in parmap
    return list(map(f, arg_list))
  File "/home/user/anaconda3/envs/pypsa-africa/lib/python3.10/site-packages/powerplantmatching/collection.py", line 71, in df_by_name
    df = get_df(config=config)
  File "/home/user/anaconda3/envs/pypsa-africa/lib/python3.10/site-packages/powerplantmatching/data.py", line 297, in GEO
    res = scale_to_net_capacities(res)
  File "/home/user/anaconda3/envs/pypsa-africa/lib/python3.10/site-packages/powerplantmatching/heuristics.py", line 586, in scale_to_net_capacities
    factors = gross_to_net_factors()
  File "/home/user/anaconda3/envs/pypsa-africa/lib/python3.10/site-packages/powerplantmatching/heuristics.py", line 557, in gross_to_net_factors
    df.energy_source_level_2.fillna(value=df.energy_source, inplace=True)
  File "/home/user/anaconda3/envs/pypsa-africa/lib/python3.10/site-packages/pandas/core/generic.py", line 5575, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'energy_source'
[Tue May 31 15:49:13 2022]
INFO:snakemake.logging:[Tue May 31 15:49:13 2022]
Error in rule build_powerplants:
    jobid: 16
    output: resources/powerplants.csv, resources/powerplants_osm2pm.csv
    log: logs/build_powerplants.log (check log file(s) for error message)

ERROR:snakemake.logging:Error in rule build_powerplants:
    jobid: 16
    output: resources/powerplants.csv, resources/powerplants_osm2pm.csv
    log: logs/build_powerplants.log (check log file(s) for error message)

RuleException:
CalledProcessError in line 329 of /home/user/PyPSA_models/pypsa-africa/Snakefile:
Command 'set -euo pipefail;  /home/user/anaconda3/envs/pypsa-africa/bin/python3.10 /home/user/PyPSA_models/pypsa-africa/.snakemake/scripts/tmpnwsb7y6r.build_powerplants.py' returned non-zero exit status 1.
  File "/home/user/PyPSA_models/pypsa-africa/Snakefile", line 329, in __rule_build_powerplants
  File "/home/user/anaconda3/envs/pypsa-africa/lib/python3.10/concurrent/futures/thread.py", line 58, in run
ERROR:snakemake.logging:RuleException:
CalledProcessError in line 329 of /home/user/PyPSA_models/pypsa-africa/Snakefile:
Command 'set -euo pipefail;  /home/user/anaconda3/envs/pypsa-africa/bin/python3.10 /home/user/PyPSA_models/pypsa-africa/.snakemake/scripts/tmpnwsb7y6r.build_powerplants.py' returned non-zero exit status 1.
  File "/home/user/PyPSA_models/pypsa-africa/Snakefile", line 329, in __rule_build_powerplants
  File "/home/user/anaconda3/envs/pypsa-africa/lib/python3.10/concurrent/futures/thread.py", line 58, in run
Removing output files of failed job build_powerplants since they might be corrupted:
resources/powerplants_osm2pm.csv
WARNING:snakemake.logging:Removing output files of failed job build_powerplants since they might be corrupted:
resources/powerplants_osm2pm.csv
Shutting down, this might take some time.
WARNING:snakemake.logging:Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
ERROR:snakemake.logging:Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2022-05-31T154906.277052.snakemake.log
WARNING:snakemake.logging:Complete log: .snakemake/log/2022-05-31T154906.277052.snakemake.log>
hazemakhalek commented 2 years ago

@energyLS @davide-f

davide-f commented 2 years ago

@Hazem-IEG This is the same issue reported in #349. Which country are you testing? PR #359 solves the issue for DRC (the country @ekatef was testing), but I'm not sure whether the problem (and hence the required fix) is the same.

pz-max commented 2 years ago

@Hazem-IEG thanks, it has now also fixed my new environment :)

Just one note: energy_source is indeed still present in the newest powerplantmatching version: https://github.com/FRESNA/powerplantmatching/blob/b0e5a05773b88d40e99f73fd28606cdc6ea3b240/powerplantmatching/heuristics.py#L543-L545

Just as a reminder, we actually work with a fork from @davide-f (https://github.com/davide-f/powerplantmatching/tree/new_pypsa_africa). I am adding a fix there in the meantime.

davide-f commented 2 years ago

I merged the PR, but have you tested that it works with that fix? Why did it work in the CI but not in your case? Have you cross-checked that?

I'll work a bit on that as well

Update: replacing energy_source with energy_source_level_1 may lead to unexpected behavior. In my debugging, both energy_source and energy_source_level_1 are available, so I reverted the PR.

davide-f commented 2 years ago

I've uninstalled the environment and reinstalled it. Unfortunately, I cannot reproduce the error on the tutorial. Could you explain in more detail how to reproduce it?

pz-max commented 2 years ago

Hey Carlos, Graphviz was added in the last 3 days. See environment.yaml. We also added the fix to Davide's fork so people don't suffer from the energy_source issue. :)

On Sunday, June 5, 2022, carlosfv92 wrote:

Hi everyone! I just experienced this very same error while trying to run a model for Argentina, Bolivia and Peru (AR, BO, PE) from scratch. I applied the fix proposed by @Hazem-IEG and the error has disappeared 💯 !!!

I don't think I did anything differently than in other cases (using other countries in Africa, or running the model only for AR in South America); however, this was the first time I got this specific error. Since I erased the local repo and environment a couple of times (just to rule out a "bad installation"), I think you could use the steps I followed while building the model to reproduce the error (I'm working on Windows):

- I made a local clone of the pypsa-africa repo from GitHub (02/06/22).
- I created the pypsa-africa environment (I had to add the graphviz library with conda, which was missing).
- I copied the config.default.yaml file as a base for the config.yaml file and changed the following parameters:

      countries: ["AR","BO","PE"]
      scenario:
        simpl: ['']
        ll: ['copt']
        clusters: [10]
        opts: [Co2L-3H]
      enable:
        retrieve_databundle: true
        download_osm_data: true
        build_natura_raster: true
        build_cutout: true

- I used the command "snakemake -j 1 solve_all_networks --forceall".


davide-f commented 2 years ago

Max, I reverted the PR because this needs to be investigated further. I debugged on my local repo with a working workflow: both the energy_source and energy_source_level_1 columns are present and, moreover, the contents of the two columns do not match. Hence we cannot use the latter column instead of the former. The issue must be somewhere else. I'd like to track it down, but I need the error to be reproducible. I now have the example I asked Carlos for, so I can try to debug it :)
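As a quick way to run the kind of check described above, the header of a CSV file can be inspected for the expected column names before the workflow touches it. The following is only a minimal sketch using the standard library: the helper name `missing_columns` and the sample file are hypothetical, and which cached file powerplantmatching actually reads at this step is not established in this thread.

```python
import csv
import tempfile
from pathlib import Path


def missing_columns(csv_path, required):
    """Return the names in `required` that are absent from the CSV header row."""
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle), [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


if __name__ == "__main__":
    # Hypothetical stand-in for an input table; not a real powerplantmatching file.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.csv"
        sample.write_text(
            "energy_source_level_1,energy_source_level_2\ncoal,lignite\n",
            encoding="utf-8",
        )
        # Columns referenced on the failing line of the traceback.
        print(missing_columns(sample, ["energy_source", "energy_source_level_2"]))
        # prints ['energy_source']
```

An empty result means every required column is present in the header; it says nothing about whether the column contents match.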

davide-f commented 2 years ago

P.S. The data sources being down may play a role as well...

pz-max commented 2 years ago

Ahh good to know. Would make sense. Btw. where do you see the GEO status? Can you share the link?


davide-f commented 2 years ago

This is the link that is crashing: https://vfs.fias.science/f/3f4cc3876f/?raw=1
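To tell a data-host outage apart from a local problem, the download link can be probed before running the workflow. This is a rough sketch with the standard library only: the helper name `is_reachable` and its `opener` parameter are hypothetical (the latter exists just so the function can be exercised without network access), and a successful response does not guarantee that the file content is intact.

```python
import urllib.error
import urllib.request


def is_reachable(url, timeout=10, opener=urllib.request.urlopen):
    """Return True if `url` answers with a 2xx/3xx status within `timeout` seconds."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers DNS failures, refused connections, timeouts and HTTP error codes.
        return False


if __name__ == "__main__":
    # URL taken from the comment above; the result depends on the host's current state.
    print(is_reachable("https://vfs.fias.science/f/3f4cc3876f/?raw=1"))
```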

davide-f commented 2 years ago

The host of the GEO and GPD data is back online :) We can go back to the normal CI.

davide-f commented 2 years ago

@Hazem-IEG and @carlosfv92 Unfortunately, I still cannot reproduce the issue. I am unsure whether it is due to corrupted input files stored by powerplantmatching.

To rule out that possibility, I'd recommend the following procedure:

  1. [just to be sure] Clean and update the powerplantmatching installation:
    pip uninstall powerplantmatching
    pip install git+https://github.com/davide-f/powerplantmatching.git@new_pypsa_africa#egg=powerplantmatching
  2. Manually reset the data files stored by powerplantmatching.
     To do so, look for powerplantmatching's data folder and delete it manually.
     On Linux, you may try `find /home/{username} -name global_energy_observatory_power_plants.csv`.
     In my case, the path is `/home/davidef/.local/share/powerplantmatching/data/in/global_energy_observatory_power_plants.csv`.
     Once you have found the path, delete the entire powerplantmatching folder; in my case that would be `rm -r /home/davidef/.local/share/powerplantmatching`.

Then, please try to execute the workflow again and report back here.

In the last few days, the server where the GEO and GPD data are stored was offline; I am not sure whether this has somehow led to issues.
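The manual reset in step 2 can also be sketched in Python, which avoids platform-specific shell commands. This is only an illustration under stated assumptions: the helper names are hypothetical, the Linux location is the one given above, and the `%APPDATA%` location on Windows is the one reported later in this thread; other installations may store the data elsewhere. The default is a dry run that only lists what would be deleted.

```python
import os
import shutil
from pathlib import Path


def candidate_data_dirs(home=None, appdata=None):
    """List folders where powerplantmatching data may be cached (assumed locations)."""
    home = Path(home) if home else Path.home()
    dirs = [home / ".local" / "share" / "powerplantmatching"]  # Linux path from this thread
    appdata = appdata or os.environ.get("APPDATA")  # only set on Windows
    if appdata:
        dirs.append(Path(appdata) / "powerplantmatching")
    return dirs


def reset_data_dirs(dirs, dry_run=True):
    """Delete each existing folder in `dirs`; with dry_run=True only report them."""
    found = []
    for folder in map(Path, dirs):
        if folder.is_dir():
            if not dry_run:
                shutil.rmtree(folder)
            found.append(folder)
    return found


if __name__ == "__main__":
    # Dry run: prints the folders that exist, without deleting anything.
    for folder in reset_data_dirs(candidate_data_dirs()):
        print("would delete:", folder)
```

Calling `reset_data_dirs(candidate_data_dirs(), dry_run=False)` performs the deletion, after which the workflow has to download the input files again.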

davide-f commented 2 years ago

@Hazem-IEG @carlosfv92 , is this still an issue or can we close it?

hazemakhalek commented 2 years ago

It's done for me

davide-f commented 2 years ago

@Hazem-IEG Super! Did the fix above work? I'm asking for validation so that, if it happens again, we can reference this issue. I will close this issue after your answer.

hazemakhalek commented 2 years ago

Works fine after I followed option 2 (manually resetting the data files stored by powerplantmatching) from @davide-f's procedure above.

davide-f commented 2 years ago

Thank you, Hazem, for the confirmation. I'll close the issue then.

@carlosfv92, if you still experience the same issue, I recommend following the procedure described above. If that doesn't solve your issue, please post again and we will reopen this issue.

pz-max commented 2 years ago

Welcome back, error. Hazem's and @davide-f's suggestions didn't help. I am running a fresh pypsa-africa installation with a brand-new environment and config.test1.yaml. I installed the environment with mamba; installing with conda itself was taking more than 60 min (I stopped it), hence mamba. It seems we have a general environment issue (too harsh env constraints).

What is WEIRD is that the CI works. I will try running everything with miniconda instead of mamba.

INFO:snakemake.logging:
INFO:pypsa.io:Imported network base.nc has buses, lines, transformers
INFO:powerplantmatching.collection:Create combined dataset for GEO, GPD
INFO:powerplantmatching.core:Retrieving data from https://vfs.fias.science/f/b4607c76b4/?raw=1
Traceback (most recent call last):
  File "/home/max/OneDrive/PHD-Flexibility/07_pypsa-africa/0github/pypsa-africa/uncertainty-esm/pypsa-africa/.snakemake/scripts/tmpxe943z_n.build_powerplants.py", line 260, in <module>
    pm.powerplants(from_url=False, update=True, config=config)
  File "/home/max/anaconda3/envs/pypsa-africa/lib/python3.9/site-packages/powerplantmatching/collection.py", line 223, in matched_data
    matched = collect(matching_sources, config=config, **collection_kwargs)
  File "/home/max/anaconda3/envs/pypsa-africa/lib/python3.9/site-packages/powerplantmatching/collection.py", line 96, in collect
    dfs = parmap(df_by_name, datasets)
  File "/home/max/anaconda3/envs/pypsa-africa/lib/python3.9/site-packages/powerplantmatching/utils.py", line 378, in parmap
    return list(map(f, arg_list))
  File "/home/max/anaconda3/envs/pypsa-africa/lib/python3.9/site-packages/powerplantmatching/collection.py", line 71, in df_by_name
    df = get_df(config=config)
  File "/home/max/anaconda3/envs/pypsa-africa/lib/python3.9/site-packages/powerplantmatching/data.py", line 303, in GEO
    res = scale_to_net_capacities(res)
  File "/home/max/anaconda3/envs/pypsa-africa/lib/python3.9/site-packages/powerplantmatching/heuristics.py", line 586, in scale_to_net_capacities
    factors = gross_to_net_factors()
  File "/home/max/anaconda3/envs/pypsa-africa/lib/python3.9/site-packages/powerplantmatching/heuristics.py", line 557, in gross_to_net_factors
    df.energy_source_level_2.fillna(value=df.energy_source, inplace=True)
  File "/home/max/anaconda3/envs/pypsa-africa/lib/python3.9/site-packages/pandas/core/generic.py", line 5575, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'energy_source'
davide-f commented 2 years ago

@pz-max the environment has always taken a long time to install in my case, but I never measured it. The error we are experiencing may be an environment problem, as the CI works. ppl needs some input files; when using mamba, are you sure you deleted the right input files when applying the suggested change? If you have both miniconda and mamba installed, you may have multiple folders with such inputs [not sure though].

BTW, we need a reliable procedure to reproduce the error. Have you tested the procedure above from a clean state and/or on a different PC?

pz-max commented 2 years ago

It worked now.

I used mamba which just took 10min to install (conda install takes at least a couple of hours):

pz-max commented 1 year ago

Deleted some responses to avoid wrong answers. Thanks @EmreYorat for reporting this confusion

carlosfv92 commented 1 year ago

Hi everyone! After a while this problem showed up again, so I thought I could share the "fix" I found:

1. Change the environment.yaml file to install the most recent version of ppm by adding the line "- git+https://github.com/pypsa/powerplantmatching@master" after the pip command on line 78 and removing the powerplantmatching line after line 15.
2. Once the environment is created, find the local ppm folder created on your computer and delete it. In my case, it was "C:\Users\Lenovo\AppData\Roaming\powerplantmatching".
3. Force snakemake to run the entire workflow from the beginning using "snakemake -j 1 solve_all_networks".

pz-max commented 1 year ago

The problem was that we needed a new release of powerplantmatching, since only the master branch of ppm was working for us. Davide has just added a new release. We hope this issue is gone for a while. Thanks for reporting a solution, @carlosfv92. This will help anyone who experiences a similar issue in the future.
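Since the behaviour depends on which powerplantmatching build is installed (fork, release, or master), it can help to confirm what the active environment actually contains before debugging further. A minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical, and the version string alone may not distinguish a fork from an official release.

```python
from importlib import metadata


def installed_version(package="powerplantmatching"):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("powerplantmatching is not installed in this environment")
    else:
        print("powerplantmatching", version)
```

Run it with the Python interpreter of the environment that snakemake uses; otherwise it reports on a different installation.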