OSeMOSYS / otoole

OSeMOSYS Tools for Energy
https://otoole.readthedocs.io
MIT License

[Bug]: otoole.results calculation of CRF fails if no data in discount_rate parameter #217

Closed by willu47 2 months ago

willu47 commented 10 months ago

The Issue

Calculation of the capital recovery factor fails if no data is entered in the discount rate parameter file.

Expected Behavior

The capital recovery factor should be calculated using the default discount rate for any region/technology combination not defined in the discount_rate_idv parameter, or for any region not defined in the discount_rate parameter.

Steps To Reproduce

This test reproduces the issue, with a new fixture representing an empty discount rate dataframe:

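import pandas as pd
from pandas.testing import assert_frame_equal
from pytest import fixture

# Module path taken from the traceback in the log output
from otoole.results.result_package import capital_recovery_factor


@fixture
def discount_rate_empty():
    """Discount rate parameter with no rows, indexed by REGION."""
    df = pd.DataFrame(
        data=[],
        columns=["REGION", "VALUE"],
    ).set_index(["REGION"])
    return df


# Test method for the existing test class; it relies on the region and
# operational_life fixtures already defined in the test suite.
    def test_crf_empty_discount_rate(self, region, discount_rate_empty, operational_life):

        technologies = ["GAS_EXTRACTION", "DUMMY"]
        regions = region["VALUE"].to_list()
        actual = capital_recovery_factor(
            regions, technologies, discount_rate_empty, operational_life
        )

        # With no discount rate data, the CRF should fall back to the default rate
        expected = pd.DataFrame(
            data=[
                ["SIMPLICITY", "GAS_EXTRACTION", 0.5121951219512197],
                ["SIMPLICITY", "DUMMY", 0.34972244250594786],
            ],
            columns=["REGION", "TECHNOLOGY", "VALUE"],
        ).set_index(["REGION", "TECHNOLOGY"])

        assert_frame_equal(actual, expected)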
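
For reference, the expected values in this test are consistent with the OSeMOSYS capital recovery factor, CRF = (1 - (1 + r)^-1) / (1 - (1 + r)^-n), evaluated at a discount rate of 0.05. The sketch below is not otoole code: the function name `crf`, the default rate of 0.05, and the operational lives of 2 and 3 years are assumptions inferred from the numbers in the test, not read from the fixtures.

```python
def crf(discount_rate: float, operational_life: float) -> float:
    """Capital recovery factor for one region/technology pair."""
    numerator = 1 - (1 + discount_rate) ** -1
    denominator = 1 - (1 + discount_rate) ** -operational_life
    return numerator / denominator


# Assumed inputs (inferred from the expected values, not taken from the fixtures)
DEFAULT_DISCOUNT_RATE = 0.05
lifetimes = {"GAS_EXTRACTION": 2, "DUMMY": 3}

values = {
    tech: crf(DEFAULT_DISCOUNT_RATE, life) for tech, life in lifetimes.items()
}
for tech, value in values.items():
    print(tech, round(value, 6))
```

Under these assumptions the two results agree with the expected 0.512195... and 0.349722... to floating-point precision, which is what the test should see once missing discount rates fall back to the default.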

Log output

src/otoole/results/result_package.py:798: in capital_recovery_factor
    crf[:] = values
../../miniconda3/envs/otoole/lib/python3.9/site-packages/pandas/core/frame.py:4074: in __setitem__
    return self._setitem_slice(slc, value)
../../miniconda3/envs/otoole/lib/python3.9/site-packages/pandas/core/frame.py:4098: in _setitem_slice
    self.iloc[key] = value
../../miniconda3/envs/otoole/lib/python3.9/site-packages/pandas/core/indexing.py:885: in __setitem__
    iloc._setitem_with_indexer(indexer, value, self.name)
../../miniconda3/envs/otoole/lib/python3.9/site-packages/pandas/core/indexing.py:1895: in _setitem_with_indexer
    self._setitem_single_block(indexer, value, name)
../../miniconda3/envs/otoole/lib/python3.9/site-packages/pandas/core/indexing.py:2138: in _setitem_single_block
    self.obj._mgr = self.obj._mgr.setitem(indexer=indexer, value=value)
../../miniconda3/envs/otoole/lib/python3.9/site-packages/pandas/core/internals/managers.py:399: in setitem
    return self.apply("setitem", indexer=indexer, value=value)
../../miniconda3/envs/otoole/lib/python3.9/site-packages/pandas/core/internals/managers.py:354: in apply
    applied = getattr(b, f)(**kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = NumpyBlock: slice(0, 1, 1), 1 x 2, dtype: object
indexer = slice(None, None, None)
value = Series([], Name: VALUE, dtype: object), using_cow = False

    def setitem(self, indexer, value, using_cow: bool = False) -> Block:
        """
        Attempt self.values[indexer] = value, possibly creating a new array.

        Parameters
        ----------
        indexer : tuple, list-like, array-like, slice, int
            The subset of self.values to set
        value : object
            The value being set
        using_cow: bool, default False
            Signaling if CoW is used.

        Returns
        -------
        Block

        Notes
        -----
        `indexer` is a direct slice/positional indexer. `value` must
        be a compatible shape.
        """

        value = self._standardize_fill_value(value)

        values = cast(np.ndarray, self.values)
        if self.ndim == 2:
            values = values.T

        # length checking
        check_setitem_lengths(indexer, value, values)

        if self.dtype != _dtype_obj:
            # GH48933: extract_array would convert a pd.Series value to np.ndarray
            value = extract_array(value, extract_numpy=True)
        try:
            casted = np_can_hold_element(values.dtype, value)
        except LossySetitemError:
            # current dtype cannot store value, coerce to common dtype
            nb = self.coerce_to_target_dtype(value, warn_on_upcast=True)
            return nb.setitem(indexer, value)
        else:
            if self.dtype == _dtype_obj:
                # TODO: avoid having to construct values[indexer]
                vi = values[indexer]
                if lib.is_list_like(vi):
                    # checking lib.is_scalar here fails on
                    #  test_iloc_setitem_custom_object
                    casted = setitem_datetimelike_compat(values, len(vi), casted)

            self = self._maybe_copy(using_cow, inplace=True)
            values = cast(np.ndarray, self.values.T)
            if isinstance(casted, np.ndarray) and casted.ndim == 1 and len(casted) == 1:
                # NumPy 1.25 deprecation: https://github.com/numpy/numpy/pull/10615
                casted = casted[0, ...]
>           values[indexer] = casted
E           ValueError: could not broadcast input array from shape (0,) into shape (2,1)
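
The last frame of the traceback comes down to NumPy being asked to write a zero-length array into a (2, 1) object array. A minimal sketch of that single step, outside pandas and otoole (the variable names are illustrative only):

```python
import numpy as np

# Two region/technology rows, one VALUE column, as in the failing block
target = np.empty((2, 1), dtype=object)
# An empty discount_rate parameter yields a zero-length value
empty = np.array([], dtype=object)

try:
    target[slice(None)] = empty
    message = None
except ValueError as err:
    message = str(err)

print(message)
```

A length-0 array cannot be broadcast to any non-empty shape, so the assignment can only succeed if the discount rate values are filled in before this point.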

Operating System

MacOS

What version of otoole are you running?

1.1.2.post1.dev15+g15bdc0b.d20240201

Possible Solution

Yes. We need to ensure that the discount_rate or discount_rate_idv parameter is always padded out with default values before it is passed to the CRF calculation.
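
A minimal sketch of that padding step, assuming a region-indexed dataframe like the fixture above. The helper name `pad_with_default` and the default of 0.05 are hypothetical; this is not the existing `_expand_default()` implementation.

```python
import pandas as pd


def pad_with_default(df: pd.DataFrame, index: pd.Index, default: float) -> pd.DataFrame:
    """Reindex df to `index`, giving rows absent from df the default value."""
    padded = df.reindex(index=index, fill_value=default)
    # An empty frame has an object-typed VALUE column, so cast explicitly
    return padded.astype({"VALUE": float})


regions = pd.Index(["SIMPLICITY"], name="REGION")

# Case 1: no discount rate data at all
discount_rate_empty = pd.DataFrame(
    data=[], columns=["REGION", "VALUE"]
).set_index(["REGION"])
padded_empty = pad_with_default(discount_rate_empty, regions, default=0.05)

# Case 2: explicit values are kept, only missing regions get the default
two_regions = pd.Index(["SIMPLICITY", "OTHER"], name="REGION")
discount_rate_partial = pd.DataFrame(
    data=[["SIMPLICITY", 0.1]], columns=["REGION", "VALUE"]
).set_index(["REGION"])
padded_partial = pad_with_default(discount_rate_partial, two_regions, default=0.05)

print(padded_empty)
print(padded_partial)
```

`reindex` applies `fill_value` only to labels it introduces, so explicitly entered rates are left untouched.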

To do this, I can use @trevorb1's _expand_default() function, which is currently a method on the WriteStrategy class. It could instead be a method on an InputData class that provides either a view of the data that includes default values, or a compact view that excludes values not explicitly provided in the input data. This could be reused or applied as required throughout the code base, minimising memory use.
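
One way the two views could look, as a rough sketch only. The class name `DefaultedParameter` and its methods are hypothetical and do not exist in otoole; plain dictionaries stand in for dataframes to keep the example short.

```python
from collections.abc import Mapping


class DefaultedParameter(Mapping):
    """Read-only view that returns a default for keys without explicit data."""

    def __init__(self, explicit, keys, default):
        self._keys = tuple(keys)          # full set of valid index entries
        self._key_set = set(self._keys)
        # Store only the explicitly provided values; defaults are never copied
        self._explicit = {k: v for k, v in explicit.items() if k in self._key_set}
        self._default = default

    def __getitem__(self, key):
        if key not in self._key_set:
            raise KeyError(key)
        return self._explicit.get(key, self._default)

    def __iter__(self):
        return iter(self._keys)

    def __len__(self):
        return len(self._keys)

    def compact(self):
        """Return only the values explicitly provided in the input data."""
        return dict(self._explicit)


keys = [("SIMPLICITY", "GAS_EXTRACTION"), ("SIMPLICITY", "DUMMY")]
rates = DefaultedParameter({("SIMPLICITY", "DUMMY"): 0.1}, keys, default=0.05)

print(dict(rates))      # expanded view, defaults filled in on access
print(rates.compact())  # compact view, explicit values only
```

Because defaults are resolved on lookup rather than stored, the expanded view costs no more memory than the compact data plus the list of keys.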

Anything else?

No response