OSeMOSYS / otoole

OSeMOSYS Tools for Energy
https://otoole.readthedocs.io
MIT License
23 stars 17 forks source link

Fix writing of default values #216

Open trevorb1 opened 7 months ago

trevorb1 commented 7 months ago

Description

In this PR I have fixed the writing out of default values through the existing --write_defaults flag. Accompanying tests have been added.

This functionality existed (and worked) for writing out parameter defaults, but did not for writing out result defaults. This was due to the input data not being passed into the WriteStrategy._expand_defaults(...) function. I have addressed this, however, naming of variables is not super clear now in the WriteStrategy class.

Within WriteStrategy, data to be written out is now called inputs (this can be set/param data or result data), while input data for the model (only param/set data) is now called input_data. idk, we may want to clarify this before merged, or it may just be fine since its only in one place and documented in the docstring of _expand_defaults(...) and at variable definition.

Issue Ticket Number

My suggestion is for this PR to replace PR #215

Documentation

The functionality was already documented, but it didnt work!

willu47 commented 6 months ago

@trevorb1 - we have a performance problem. Creating new pandas dataframes with default values, concatenating the read-in values, and deleting duplicate values is slow and memory intensive. We need to consider alternatives.

  1. Do not store the entire dictionary of expanded dataframes in memory, but only generate them one-at-a-time upon writing out.
  2. Use xarray, which is a much more efficient way of storing data in-memory; and with very fast and easy methods to expand or fill multi-dimensional data. It also provides very easy conversion to and from pandas DataFrames e.g. see #184
willu47 commented 6 months ago

@trevorb1 see 7849353 which refactors your expand_defaults code, ensuring that only one dataframe at a time is expanded and then immediately written out, minimising memory use. Note this breaks a lot of tests...

trevorb1 commented 6 months ago

Thanks for the feedback, @willu47! I looked at this today and think your option 2 of using xarray will be the way forward.

I refactored the current code a little, just to make use of instance variables. My first thought of using pandas update actually made it slower, though. So that was a bummer haha. The implementation you suggested, @willu47, was faster than the current code!

image

This method still wont work with bigger models, though :( I was using OSeMSOYS Global to generate generic input data for these tests. What I found was that anything past a medium size model, and my computer just killed it after like 45sec (I have a mid-range laptop). imo, this makes the --write_defaults flag not that useful, as for more realistic models it probably wont work. (However, to be fair, I probably wouldnt run a large model on my laptop)

I will look at the xarray implementation to see if that works better! Thanks for the idea on that!

trevorb1 commented 5 months ago

With the latest updates I have (following @willu47's suggestion here):

Still to do before reviewing/merging this:

I guess the scope of this PR has changed from "fixing" writing defaults to just refactoring it in otoole to address issue #217. Based on my comment above, seems like the actual fixing will really only come with the xarray implementation, as mentioned by @willu47 here (which relates to the open PR #184)

trevorb1 commented 5 months ago

Okay, I think writing defaults has been moved to ReadStrategy and works now!

A few notes:

HauHe commented 4 months ago

Hi @trevorb1, I just ran a test with Utopia without having the DiscountRateIdv in the config file, but the default in the model file pointing to the DiscountRate I get results for CapitalInvestment. I haven't done any other testing (yet). But I'd say this should fix #220 Thanks a lot for your work!!

trevorb1 commented 4 months ago

@HauHe Great! To confirm, your test was for a multi-year model, right? I cant remember if another issue arose for a single year model that relates to this. But maybe since that is more of an edge case, deserves its own issue ticket

HauHe commented 4 months ago

@trevorb1, yes the model I used for testing (UTOPIA) has multiple years.

robertodawid commented 1 month ago

I ran a test with one of my models, and there is no data in the Capitalnvestment.csv. In this case, I have declared the DiscountRateIdv for one of the technologies. No matter the combination I did, for example, defaults to DiscountRate in the model file or the default value in the data file, there are no results for capital investment. We have similar difficulties (Shravan and I) in calculating the CapitalInvestment in Osemosys_Pul.p. However, we solved the issue with Sharavan much simpler; and maybe it can be simplified/enhanced. Happy to share it with you!. @trevorb1

robertodawid commented 4 weeks ago

I ran a test with one of my models, and there is no data in the Capitalnvestment.csv. In this case, I have declared the DiscountRateIdv for one of the technologies. No matter the combination I did, for example, defaults to DiscountRate in the model file or the default value in the data file, there are no results for capital investment. We have similar difficulties (Shravan and I) in calculating the CapitalInvestment in Osemosys_Pul.p. However, we solved the issue with Sharavan much simpler; maybe it can be simplified/enhanced. Happy to share it with you! @trevorb1

@trevorb1, thanks for yesterday's discussion. I tried to install the version from this branch without success; the CapitalInvestment was empty. The DiscountRateIdv in the data file looks like this (using otoole):

param default 0.089 : DiscountRate :=
RE1 0.05
;
param default 0.089 : DiscountRateIdv :=
RE1 CKGELCurb 0.02
; 

For now, the easiest solution is to un-hash CC1 and the var CapitalInvestment, and the calculated investments are ok compared to my manual calculation.

trevorb1 commented 4 weeks ago

Thanks for the the update @robertodawid! Do you have a minimal datafile + model you would be able to attach that I would be able to recreate this issue with?