pypsa-meets-earth / documentation

Contains hackathon material, jupyter notebook visualizations and images
GNU General Public License v3.0

Add notebook for make_statistics #50

Closed javier-cp6 closed 1 year ago

javier-cp6 commented 1 year ago

Closes #46, closes pypsa-meets-earth/pypsa-earth#608.

Added a notebook that explains how to create statistics on the workflow by running the make_statistics rule.

ekatef commented 1 year ago

Hi @javier-cp6, your contribution is absolutely welcome! It would help a lot to have a demo on make_stats.

In addition to @pz-max's suggestions about supplementing the values in the table with units, I think it would be great to play around a bit with the output formats, in particular:

1) keep the number of digits reasonable: currently our stats claim to know transmission line lengths with nanometer precision with clean_osm_data.3;

2) find a way to deal with big numbers like those for build_shapes.

What do you think?
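A minimal sketch of the kind of formatting meant here, assuming the stats are plain floats. The helper name `format_stat` and its thresholds are hypothetical and not part of pypsa-earth:

```python
def format_stat(value, digits=2, big=1e6):
    """Format one statistic for display (hypothetical helper).

    Values at or above `big` in magnitude use scientific notation;
    smaller values use fixed decimals with thousands separators.
    """
    if abs(value) >= big:
        return f"{value:.{digits}e}"
    return f"{value:,.{digits}f}"


# A line length with far too many decimals, and a very large value.
print(format_stat(1234.567891234))  # 1,234.57
print(format_stat(4.81e11))         # 4.81e+11
```

The threshold and the number of digits are only illustrative; the notebook could just as well pick them per column.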

javier-cp6 commented 1 year ago

Sure. I think I can also group the rules.

davide-f commented 1 year ago

Hello Javier :) Many thanks for the PR, and sorry for the delay, but Max and Katia have already covered the comments. Personally, I really like the PR and I think a few small updates could improve it. In particular:

javier-cp6 commented 1 year ago

Hi @pz-max @ekatef @davide-f ,

Thanks for your hints. So far, I’ve updated the notebook with the following changes:

I need some help with the units for the stats. The rule does not generate them, so I'm looking in the make_statistics script and the resources folder generated for NG. However, I still need to complete the units for the remaining fields. Once I have completed this task, I will update the naming accordingly and add some cell descriptions.

ekatef commented 1 year ago

Hi @javier-cp6, very nice work! Sorry for the delay in answering: a problem of chained deadlines 🙂

My general feeling is that the output looks much clearer now.

It seems lines_length for some reason keeps nine decimal places. Could you please check it?

Regarding adding the units, I'd be happy to help. I agree that in some cases it can currently be a bit tricky to dig out the dimensions from the code and docs :) So investigating that and adding units to the outputs is definitely worth the effort. Could you please give an update on which parameters you need help finding units for?

javier-cp6 commented 1 year ago

Hi @ekatef , Thanks for your response. I've updated the notebook, adding more units and descriptions. I've also fixed the lines_length format. I'll keep searching for the rest of the units. Could you please help me with the add_electricity stats such as OCGT and CCGT?

ekatef commented 1 year ago

Hi @javier-cp6, super! Formatted in this way, the notebook is very handy. My feeling is that your PR is close to being completed.

Values of CCGT:hydro under add_electricity imply installed capacity in MW. It could probably also make sense to explain that CCGT = combined cycle gas turbine and OCGT = open cycle gas turbine. What do you think?

It looks like the units for lines_capacity, potential and avg_production_pu also need to be checked. Have you managed to find units for them? I do have some ideas, but it's better to find a source to be sure :)

javier-cp6 commented 1 year ago

Hi @ekatef , Thanks! Almost done! Indeed, installed capacity of generators is in MW, I think it is not necessary to explain CCGT and OCGT. I've also added the units for lines_capacity (MVA), potential (MW), and avg_production_pu (MWh), and incorporated links to PyPSA documentation explaining those units.

I still need to complete the following. Please let me know, if you have any hints:

ekatef commented 1 year ago

Hi @javier-cp6, nice to see the progress :) Great idea to incorporate the links into the documentation!

The remaining questions are quite advanced. Here are some ideas:

  • Units for load in solve_network.

I suspect "load" in this context may actually mean load shedding. Could you please check whether that is the case?

  • The value for gdp (USD) in 'build_shapes' seems too big.

According to the source article for GDP, the value corresponds to "in constant 2011 international US dollars", while we are using 2020 to extract GDP. Inflation in Nigeria was about 10% annually during 2011-2020, which seems to explain the roughly 2.5-fold difference we have.

  • Some mean_load values (CPU usage percentage of the total running time) are greater than 100%.

Is it probably a consequence of using a multi-core processor?
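Both guesses can be sanity-checked with a little arithmetic. This is only a rough sketch: it assumes a constant 10% annual inflation rate over 2011-2020, and it assumes mean_load is CPU time relative to wall-clock running time. The function names and the CPU/wall-clock numbers are hypothetical.

```python
def compound_factor(annual_rate, years):
    """Cumulative growth factor for a constant annual rate."""
    return (1.0 + annual_rate) ** years


def mean_load_percent(cpu_seconds, wall_seconds):
    """CPU time as a percentage of wall-clock running time."""
    return 100.0 * cpu_seconds / wall_seconds


# Nine years of ~10% inflation between 2011 and 2020.
factor = compound_factor(0.10, 2020 - 2011)
print(round(factor, 2))  # 2.36

# Hypothetical job: three cores fully busy for 10 s of wall-clock time.
load = mean_load_percent(cpu_seconds=30.0, wall_seconds=10.0)
print(load)  # 300.0
```

A factor of about 2.36 is in the same range as the difference observed for gdp, and a job that keeps several cores busy naturally reports more than 100%.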

javier-cp6 commented 1 year ago

Hi @ekatef ,

Thanks for your explanations! I have finished updating the units and main descriptions for the stats in the notebook.

This time, I ran the run_all_scenarios rule, which automatically updates the configuration file for the Nigeria test case and runs the make_statistics rule. I think the PR is ready for review.

ekatef commented 1 year ago

@javier-cp6 thanks for your amazing work, @davide-f thank you very much for the great review! I agree that we are close to finalising 🙂

Some comments after playing a bit with this PR:

  1. A generated stats table is a perfect way to get the big picture of the modeling. Really looking forward to having this PR merged!
  2. Davide's comments are crucial for usability: I needed some time to understand which folder is meant by root/pypsa-earth, and I failed to get run_all_scenarios to work.
  3. Our discussion seems to be a good basis for documenting both make_statistics and run_all_scenarios. An issue has been opened for that.

davide-f commented 1 year ago

@javier-cp6 As a suggestion, for setting the parent folder, please use this code:

# change current directory to parent folder
import os
import sys

if not os.path.isdir("pypsa-earth"):
    os.chdir("../..")
sys.path.append(os.getcwd()+"/pypsa-earth/scripts")

This is robust to the name of the parent directory, and it is consistent with the parallel PR that I opened to fix that annoying problem.

javier-cp6 commented 1 year ago

Hi @ekatef , @davide-f ,

Thank you for your review and for pointing out a potential source of confusion. I have updated the description so that the make_statistics rule can be used at any time, and explained the output format (results/{run_name}/stats.csv). Additionally, I have added a note briefly explaining the run_all_scenarios rule, so either rule can be run to produce the stats.

The given snippet for setting the parent folder is very useful indeed. Please, let me know if there are any additional tasks that need attention.

davide-f commented 1 year ago

Nice @javier-cp6 ! I slightly revised the text and it is ready to merge :) Many thanks, you are officially a contributor! :)