Closed javier-cp6 closed 1 year ago
Hi @javier-cp6, your contribution is absolutely welcome! It would help a lot to have a demo on make_stats.
Adding to @pz-max's suggestions about supplementing the values in the table with units, I think it would be great to play around a bit with the output formats, in particular:
1) keep the number of digits reasonable: currently our stats claim to know transmission line lengths with nanometer precision with clean_osm_data.3;
2) find a way to deal with big numbers like those for build_shapes.
What do you think?
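The two formatting points above could be handled with a small helper along these lines. This is only a sketch: `format_stat`, its `decimals` and `big` parameters, and the sample values are hypothetical and not taken from the notebook or from `make_statistics`.

```python
def format_stat(value: float, decimals: int = 2, big: float = 1e9) -> str:
    """Return a compact, human-readable string for one statistic.

    Hypothetical helper: the threshold and number of decimals are assumptions.
    """
    if abs(value) >= big:
        # Very large magnitudes: switch to scientific notation.
        return f"{value:.{decimals}e}"
    # Everything else: fixed decimals with thousands separators.
    return f"{value:,.{decimals}f}"


# Made-up values standing in for a line length and a GDP-sized number.
print(format_stat(12345.678901234))  # 12,345.68
print(format_stat(1.234567e12))      # 1.23e+12
```

Applied column by column to the stats table, something like this would drop the spurious decimals while keeping the large values readable.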
Sure. I think I can also group the rules.
Hello Javier :) Many thanks for the PR and sorry for the delay, but Max and Katia have covered the comments. Personally, I really like the PR and I think a few small updates would improve it. In particular:
- read_csv_nafix(str(stats_path), header=[0, 1], index_col=0)
using the read_csv_nafix from the helpers.
Note that this is to prevent problems with Namibia, whose country code is "NA" and may be interpreted as NaN. If you don't want to import that function, that's fine, but you should use pd.read_csv(..., na_values=["NULL"])
- substations-no
instead of substations-size
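The "NA" pitfall mentioned above can be reproduced with plain pandas. The sketch below uses a made-up two-row CSV rather than the real stats file. Note that it also passes `keep_default_na=False`, which is an addition to the call quoted above: in pandas, `na_values` on its own is appended to the default NaN markers, so "NA" would still be parsed as missing.

```python
import io

import pandas as pd

# Made-up data: "NA" is Namibia's country code, "NG" is Nigeria's.
csv_text = "country,lines_length\nNA,1200.5\nNG,3400.0\n"

# Default parsing treats the string "NA" as a missing value.
default = pd.read_csv(io.StringIO(csv_text))
print(default["country"].isna().tolist())  # [True, False]

# Disabling the default NaN markers keeps "NA" as a string;
# only the explicitly listed "NULL" is treated as missing.
safe = pd.read_csv(
    io.StringIO(csv_text),
    keep_default_na=False,
    na_values=["NULL"],
)
print(safe["country"].tolist())  # ['NA', 'NG']
```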
[not sure where those comments have moved to]
Hi @pz-max @ekatef @davide-f,
Thanks for your hints. So far, I’ve updated the notebook with the following changes:
I need some help with the units for the stats. The rule does not generate them, so I'm looking in the make_statistics script and the resources folder generated for NG. However, I still need to complete the units for the remaining fields. Once I have completed this task, I will update the naming accordingly and add some cell descriptions.
Hi @javier-cp6, very nice work! Sorry for the delay with the answer: a problem of chained deadlines 🙂
My general feeling is that the output looks much clearer now.
It seems lines_length for some reason keeps nine decimal places. Could you please check it?
Regarding adding the units, I'd be happy to help. I agree that in some cases it can currently be a bit tricky to dig the dimensions out of the code and docs :) So investigating that and adding units to the outputs is definitely worth the effort. Could you please give an update on which parameters you need help finding units for?
Hi @ekatef ,
Thanks for your response. I've updated the notebook, adding more units and descriptions. I've also fixed the lines_length format.
I'll keep searching for the rest of the units. Could you please help me with add_electricity stats such as OCGT and CCGT?
Hi @javier-cp6, super! Formatted this way, the notebook is very handy. My feeling is that your PR is close to completion.
Values of CCGT:hydro under add_electricity imply installed capacity in MW. Probably, it could also make sense to explain that CCGT = combined cycle gas turbine and OCGT = open cycle gas turbine. What do you think?
It also looks like the units need checking for lines_capacity, along with potential and avg_production_pu. Have you managed to find units for them? I do have some ideas, but it's better to find a source to be sure :)
Hi @ekatef ,
Thanks! Almost done! Indeed, installed capacity of generators is in MW, and I think it is not necessary to explain CCGT and OCGT. I've also added the units for lines_capacity (MVA), potential (MW), and avg_production_pu (MWh), and incorporated links to PyPSA documentation explaining those units.
I still need to complete the following. Please let me know if you have any hints:
- Units for load in solve_network.
- The value for gdp (USD) in 'build_shapes' seems too big.
- Some mean_load values (CPU usage percentage of the total running time) are greater than 100%.
Hi @javier-cp6, nice to see the progress :) Great idea to incorporate the links into the documentation!
The remaining questions are quite advanced. Here are some ideas:
- Units for load in solve_network.
I suspect "load" in this context may actually mean load shedding. Could you please check whether that is the case?
- The value for gdp (USD) in 'build_shapes' seems too big.
According to the source article for GDP, the values are "in constant 2011 international US dollars", while we are using 2020 to extract GDP. Inflation in Nigeria was about 10% annually during 2011-2020, which seems to explain the ~2.5 times difference we have.
- Some mean_load values (CPU usage percentage of the total running time) are greater than 100%.
Could this be a consequence of using a multi-core processor?
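Two of the hypotheses above can be sanity-checked with simple arithmetic. The sketch below assumes a flat 10% annual rate compounded over 9 or 10 years, and it assumes that mean_load behaves like CPU time divided by wall-clock time. The function names and the timing figures are hypothetical and serve only as an illustration.

```python
def compound_factor(annual_rate: float, years: int) -> float:
    """Cumulative price-level factor for a constant annual rate."""
    return (1.0 + annual_rate) ** years


def mean_cpu_load_percent(cpu_seconds: float, wall_seconds: float) -> float:
    """CPU time as a percentage of wall-clock time.

    The result exceeds 100 whenever more than one core is busy on average.
    """
    return 100.0 * cpu_seconds / wall_seconds


# 2011 -> 2020 spans 9 to 10 compounding steps, depending on the convention.
print(round(compound_factor(0.10, 9), 2))   # 2.36
print(round(compound_factor(0.10, 10), 2))  # 2.59

# Made-up timings: three cores busy for a 60-second run.
print(mean_cpu_load_percent(cpu_seconds=180.0, wall_seconds=60.0))  # 300.0
```

Both results are in line with the explanations above: a factor of roughly 2.4 to 2.6 from inflation alone, and loads above 100% as soon as several cores work in parallel.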
Hi @ekatef ,
Thanks for your explanations! I have finished updating the units and main descriptions for the stats in the notebook.
This time, I ran the run_all_scenarios rule, which automatically updates the configuration file for the Nigeria test case and runs the make_statistics rule. I think the PR is ready for review.
@javier-cp6 thanks for your amazing work, @davide-f thank you very much for the great review! Agree that we are close to finalising 🙂
Some comments after playing a bit with this PR:
- [...] root/pypsa-earth and failed to get run_all_scenarios to work
- [...] make_statistics and run_all_scenarios. An issue has been added for that
@javier-cp6 As a suggestion, for setting the parent folder, please use this code:
# change current directory to parent folder
import os
import sys

if not os.path.isdir("pypsa-earth"):
    os.chdir("../..")
sys.path.append(os.getcwd() + "/pypsa-earth/scripts")
This is robust to the name of the parent directory. It is also consistent with the parallel PR that I opened to fix that annoying problem.
Hi @ekatef , @davide-f ,
Thank you for your review and for pointing out a potential source of confusion. I have updated the description so that the make_statistics rule can be used at any time, and explained the output format (results/{run_name}/stats.csv). Additionally, I have added a note briefly explaining the run_all_scenarios rule, so both rules can be used to make the stats.
The given snippet for setting the parent folder is very useful indeed. Please, let me know if there are any additional tasks that need attention.
Nice @javier-cp6! I'll slightly revise the text and then it's ready to merge :) Many thanks, you are officially a contributor! :)
Closes #46, closes pypsa-meets-earth/pypsa-earth#608 Added a notebook that explains how to create statistics on the workflow by running the make_statistics rule.