calliope-project / calliope

A multi-scale energy systems modelling framework
https://www.callio.pe
Apache License 2.0

nodes dimension missing from input CSV files if parameters set at tech level #667

Open jmorrisnrel opened 3 weeks ago

jmorrisnrel commented 3 weeks ago

What happened?

This might not be a bug, but we see inconsistent columns/dimensions in the input files generated by to_csv(). If we set a particular parameter at the tech level for every tech in the model, the resulting inputs_*variable*.csv does not have a nodes dimension/column.

Below is an example where we configure all of the parameters at the tech level, without any node-level overrides:

tech_groups:
  Transmission:
    base_tech: transmission
    carrier_in: Power
    carrier_out: Power
    name: Transmission
techs:
  Battery:
    base_tech: storage
    carrier_in: Power
    carrier_out: Power
    cost_interest_rate:
      data: 10.0
      dims: costs
      index: monetary
    flow_cap_max: 205000
    lifetime: 1
    name: Battery
  CCGT:
    base_tech: supply
    carrier_out: Power
    cost_flow_cap:
      data: 20
      dims: costs
      index: monetary
    cost_flow_out:
      data: 0.05
      dims: costs
      index: monetary
    flow_cap_max: 150000
    lifetime: 1
    name: CCGT
  Demand:
    base_tech: demand
    carrier_in: Power
    name: Demand
  region_1_region_2_Transmission:
    from: region_1
    inherit: Transmission
    to: region_2
nodes:
  region_1:
    latitude: 40.0
    longitude: -2.0
    techs:
      Battery: null
      CCGT: null
  region_2:
    latitude: 40.0
    longitude: -8.0
    techs:
      Demand: null

The resulting input CSVs (inputs_flow_cap_max.csv below) have no nodes column.

techs,flow_cap_max
Battery,205000.0
CCGT,150000.0

Configuring one node level override (adding a flow_cap_max to the CCGT in region_1) changes inputs_flow_cap_max.csv to this:

nodes,techs,flow_cap_max
region_1,Battery,205000.0
region_1,CCGT,149000.0

This makes it hard to automate the processing of results consistently. Our specific use case is determining the system-wide maximum capacity constraints to compare against the built capacity in visualizations. In v0.6, the locs/nodes dimension was always present in the input files, so we could easily total up the max capacity without the additional work of counting the number of nodes. This might not be a bug and may instead be a result of the new problem formulation design in v0.7, but we wanted to flag the inconsistency.
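To illustrate the extra handling this requires, here is a minimal sketch using only the Python standard library. The function name `total_flow_cap_max` and the `nodes_per_tech` mapping are hypothetical; the mapping would have to be derived separately from the model definition, which is exactly the additional work described above.

```python
import csv
import io


def total_flow_cap_max(csv_text, nodes_per_tech):
    """Sum flow_cap_max system-wide, with or without a `nodes` column.

    `nodes_per_tech` (hypothetical helper input) maps each tech to the
    list of nodes at which it is defined.
    """
    total = 0.0
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = float(row["flow_cap_max"])
        if "nodes" in row:
            # One row per (node, tech): the value already applies to one node.
            total += value
        else:
            # One row per tech: the value applies at every node hosting the tech.
            total += value * len(nodes_per_tech[row["techs"]])
    return total


# The two CSV layouts shown above.
tech_only = "techs,flow_cap_max\nBattery,205000.0\nCCGT,150000.0\n"
with_nodes = (
    "nodes,techs,flow_cap_max\n"
    "region_1,Battery,205000.0\n"
    "region_1,CCGT,149000.0\n"
)
nodes_per_tech = {"Battery": ["region_1"], "CCGT": ["region_1"]}

total_tech_only = total_flow_cap_max(tech_only, nodes_per_tech)
total_with_nodes = total_flow_cap_max(with_nodes, nodes_per_tech)
```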

Which operating systems have you used?

Version

v0.7.0-dev3

Relevant log output

No response

brynpickering commented 2 weeks ago

We keep the dimensions over which parameters are defined to a minimum to reduce the model size in memory. This means we no longer presuppose the dimensions over which a parameter should be defined (they were hardcoded in v0.6, which made it impossible to work with user-defined parameters in a consistent manner). Parameters are instead broadcast over different dimensions when the optimisation problem is built.

This means that if you only define a parameter at the tech level, it will exist in the model inputs only over that dimension.

I can see why the specific case you mention could be problematic. I'd say the most straightforward approach would be to always broadcast an array over the definition_matrix, which accounts for whether a tech is even defined at a specific node. The easiest way to do this is to work with the data in xarray (saving to NetCDF) rather than pandas (saving to CSV).
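As a rough sketch of that broadcasting step, the snippet below uses toy xarray arrays standing in for the model inputs. The array contents mirror the example in this issue, but the two-dimensional shape of `definition_matrix` here is an assumption made for illustration; in a real model it may carry further dimensions (for example carriers) that would need reducing first.

```python
import xarray as xr

# Toy stand-in for a parameter defined only at the tech level.
flow_cap_max = xr.DataArray(
    [205000.0, 150000.0],
    dims=["techs"],
    coords={"techs": ["Battery", "CCGT"]},
    name="flow_cap_max",
)

# Toy stand-in for the definition matrix (assumed shape: nodes x techs).
# True means the tech is defined at that node.
definition_matrix = xr.DataArray(
    [[True, True, False], [False, False, True]],
    dims=["nodes", "techs"],
    coords={
        "nodes": ["region_1", "region_2"],
        "techs": ["Battery", "CCGT", "Demand"],
    },
)

# where() broadcasts the tech-level values across nodes and leaves NaN
# wherever a tech is not defined at a node.
per_node = (
    flow_cap_max.where(definition_matrix)
    .transpose("nodes", "techs")
    .rename("flow_cap_max")
)

# System-wide total; NaN entries are skipped by default.
system_total = float(per_node.sum())

# A (nodes, techs)-indexed pandas Series, comparable to the v0.6 CSV layout.
per_node_series = per_node.to_series().dropna()
```

Because the result always has a nodes dimension, the same downstream code works whether or not any node-level override was set.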