OpenEnergyPlatform / oekg-builder

Alignment of the abbreviated column names or meta information of the scenario datasets with the OEO terms. #8

Open adelmemariani opened 2 years ago

adelmemariani commented 2 years ago

As an example, this scenario dataset has meta information, and under 'resources' -> 'fields' there are some descriptions of the columns. Typically, however, these descriptions are not sufficient to infer an OEO concept for a column.

As of May 2nd, 2022, there are 238 unique column names in total across all uploaded scenario datasets. Some of them can be mapped easily to OEO concepts:

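| # | column name | OEO concept | OEO class ID |
|---|---|---|---|
| 1 | id | unique individual identifier | OEO_00010037 |
| 2 | scenario | scenario | OEO_00000364 |
| 3 | constr | constraint | OEO_00000104 |
| 4 | val | | |
| 5 | region | spatial region | BFO_0000006 |
| 6 | sector | sector | OEO_00000367 |
| 7 | type | | |
| 8 | demand | demand | OEO_00140040 |
| 9 | energy | energy | OEO_00000150 |
| 10 | reg | | |
| 11 | nutsid | | |
| 12 | geom | | |
| 13 | technology | technology | OEO_00000407 |
| 14 | co2_var | | |
| 15 | co2_fix | | |
| 16 | eta_elec | | |
| 17 | eta_th | | |
| 18 | eta_el_chp | | |
| 19 | eta_th_chp | | |
| 20 | eta_chp_flex_el | | |
| 21 | sigma_chp | | |
| 22 | beta_chp | | |
| 23 | opex_var | | |
| 24 | opex_fix | | |
| 25 | capex | | |
| 26 | c_rate_in | | |
| 27 | c_rate_out | | |
| 28 | eta_in | | |
| 29 | eta_out | | |
| 30 | cap_loss | | |
| 31 | lifetime | | |
| 32 | wacc | | |
| 33 | ressource | | |
| 34 | transformer | transformer | OEO_00000420 |
| 35 | power | power | OEO_00000333 |
| 36 | from_region | | |
| 37 | to_region | | |
| 38 | capacity | | |
| 39 | parameter | | |
| 40 | value | | |
| 41 | unit | unit of measurement | UO_0000000 |
| 42 | country_code | | |
| 43 | category | | |
| 44 | year | year | UO_0000036 |
| 45 | gas | | |
| 46 | notation | | |
| 47 | ry | | |
| 48 | base_year | | |
| 49 | default_unit | | |
| 50 | additional_unit | | |
| 51 | source_original | | |
| 52 | comment | | |
| 53 | value_reported | | |
| 54 | value_ry_calibration | | |
| 55 | value_gapfilled | | |
| 56 | submission_year_original | | |
| 57 | subtable | | |
| 58 | data_source_years | | |
| 59 | data_source | | |
| 60 | is_part_of_projections | | |
| 61 | is_ry | | |
| 62 | category_code | | |
| 63 | category_lulucf | | |
| 64 | category_parent | | |
| 65 | level | | |
| 66 | is_user_defined | | |
| 67 | in_cir_2020_1208_annex_xxv_art_38_tab_1a | | |
| 68 | in_cir_749_2014_annex_xii_art_23_tab_1 | | |
| 69 | crf_code | | |
| 70 | oeo_id | | |
| 71 | country_name | | |
| 72 | notation_name | | |
| 73 | detail | | |
| 74 | scenario_name | | |
| 75 | energy_carrier | energy carrier | OEO_00020039 |
| 76 | avg_value | | |
| 77 | hourly_value | | |
| 78 | Szenario | scenario | OEO_00000364 |
| 79 | Szenariojahr | scenario year | OEO_00020097 |
| 80 | Region | spatial region | BFO_0000006 |
| 81 | Kennwert | | |
| 82 | Unterkennwert | | |
| 83 | Sektor | sector | OEO_00000367 |
| 84 | Energieträger | | |
| 85 | Technologie | technology | OEO_00000407 |
| 86 | Einheit | | |
| 87 | Wert | | |
| 88 | Kommentar | | |
| 89 | Zeit | | |
| 90 | Bivalente_Luftwärmepumpe | | |
| 91 | Hybride_Luftwärmepumpe | | |
| 92 | Sondenwärmepumpe | | |
| 93 | Ländername | | |
| 94 | Erzeugung_Braunkohle_Kraft_Waerme_Kopplung | | |
| 95 | Erzeugung_Steinkohle_Kraft_Waerme_Kopplung | | |
| 96 | Erzeugung_Erdgas_Kraft_Waerme_Kopplung | | |
| 97 | Erzeugung_Öl_Kraft_Waerme_Kopplung | | |
| 98 | Erzeugung_Braunkohle_Kondensationskraftwerke | | |
| 99 | Erzeugung_Steinkohle_Kondensationskraftwerke | | |
| 100 | Erzeugung_Erdgas_Kondensationskraftwerke | | |
| 101 | Erzeugung_Öl_Kondensationskraftwerke | | |
| 102 | Erzeugung_Uran_Kondensationskraftwerke | | |
| 103 | Erzeugung_Wind_Onshore | | |
| 104 | Erzeugung_Wind_Offshore | | |
| 105 | Erzeugung_Photovoltaik | | |
| 106 | Erzeugung_Laufwasser | | |
| 107 | Erzeugung_Pumpspeicher | | |
| 108 | Erzeugung_Batterien | | |
| 109 | Verbrauch_Herkoemmlich_inkl_Klimatisierung | | |
| 110 | Verbrauch_Pumpspeicher | | |
| 111 | Verbrauch_Batterien | | |
| 112 | Verbrauch_Wärmepumpen | | |
| 113 | Verbrauch_Heizstäbe | | |
| 114 | Verbrauch_Power_to_Gas | | |
| 115 | Verbrauch_Batterie_elektrische_Fahrzeuge | | |
| 116 | Verbrauch_Plug_in_Hybride | | |
| 117 | Verbrauch_Range_Extender | | |
| 118 | Import | | |
| 119 | Export | | |
| 120 | Abregelung_Wind_Onshore | | |
| 121 | Abregelung_Wind_Offshore | | |
| 122 | Abregelung_Photovoltaik | | |
| 123 | variable | variable | OEO_00000435 |
| 124 | source_category | | |
| 125 | crf | | |
| 126 | application | | |
| 127 | target_fulfilled | | |
| 128 | fuel | fuel | OEO_00000173 |
| 129 | subsector | | |
| 130 | emission_source | | |
| 131 | greenhouse_gas | greenhouse gas | OEO_00000020 |
| 132 | technologie | | |
| 133 | jahr | year | UO_0000036 |
| 134 | at | | |
| 135 | be | | |
| 136 | ch | | |
| 137 | cz | | |
| 138 | dk | | |
| 139 | fr | | |
| 140 | gb | | |
| 141 | lu | | |
| 142 | nl | | |
| 143 | no | | |
| 144 | pl | | |
| 145 | se | | |
| 146 | bundesland | | |
| 147 | szenario | scenario | OEO_00000364 |
| 148 | braunkohle | | |
| 149 | erdgas | | |
| 150 | kuppelgas | | |
| 151 | oel | | |
| 152 | abfall | | |
| 153 | sonstige_konventionelle | | |
| 154 | kwk_kleiner_10mw | | |
| 155 | pumpspeicher | | |
| 156 | lauf_und_wasserspeicher | | |
| 157 | wind_onshore | onshore wind farm | OEO_00000311 |
| 158 | wind_offshore | offshore wind farm | OEO_00000308 |
| 159 | photovoltaik | | |
| 160 | biomasse | biomass | OEO_00010214 |
| 161 | sonstige_ee | | |
| 162 | band_des_stromverbrauchs_von | | |
| 163 | band_des_stromverbrauchs_bis | | |
| 164 | dsm | | |
| 165 | power_to_heat | | |
| 166 | power_to_gas | | |
| 167 | kategorie | | |
| 168 | energietraeger | | |
| 169 | referenz_2019 | | |
| 170 | a_2035 | | |
| 171 | b_2035 | | |
| 172 | c_2035 | | |
| 173 | b_2040 | | |
| 174 | stunde | | |
| 175 | mittelwert | | |
| 176 | minimum | | |
| 177 | maximum | | |
| 178 | zeit | | |
| 179 | ungesteuerter_lastgang | | |
| 180 | optimierter_lastgang | | |
| 181 | konventioneller_stromverbrauch | | |
| 182 | elektromoblitaet | | |
| 183 | power_to_heat_haushalte | | |
| 184 | grossverbraucher | | |
| 185 | power_to_heat_industrie | | |
| 186 | sektor | sector | OEO_00000367 |
| 187 | referenz_2018 | | |
| 188 | name | | |
| 189 | carrier | | |
| 190 | tech | | |
| 191 | from_bus | | |
| 192 | to_bus | | |
| 193 | capacity_cost | | |
| 194 | efficiency | | |
| 195 | carrier_cost | | |
| 196 | marginal_cost | | |
| 197 | expandable | | |
| 198 | output_parameters | | |
| 199 | balanced | | |
| 200 | bus | | |
| 201 | amount | | |
| 202 | profile | | |
| 203 | timeindex | | |
| 204 | be_electricity_demand_profile | | |
| 205 | bb_electricity_demand_profile | | |
| 206 | storage_capacity | | |
| 207 | loss_rate | | |
| 208 | storage_capacity_cost | | |
| 209 | input_parameters | | |
| 210 | loss | | |
| 211 | be_solar_pv_profile | | |
| 212 | bb_solar_pv_profile | | |
| 213 | be_wind_onshore_profile | | |
| 214 | bb_wind_onshore_profile | | |
| 215 | dfid | | |
| 216 | nid | | |
| 217 | pathway | | |
| 218 | framework | | |
| 219 | version | | |
| 220 | region_2 | | |
| 221 | indicator | | |
| 222 | aggregation | | |
| 223 | tags | | |
| 224 | updated | | |
| 225 | schema | | |
| 226 | field | | |
| 227 | source | | |
| 228 | month | month | UO_0000035 |
| 229 | year_month | | |
| 230 | rid | | |
| 231 | set | | |
| 232 | internal_id | | |
| 233 | member_state | | |
| 234 | submission_year | | |
| 235 | crf_sector | | |
| 236 | additional_unit_information | | |
| 237 | notation_key | | |
| 238 | is_baseyear | | |

adelmemariani commented 2 years ago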

In principle, if institutions provide sufficient textual descriptions for the column names of their datasets (and for the distinct values in categorical columns), it becomes possible to recommend candidate OEO concepts to them automatically, via a recommender engine inside the OEP that aligns dataset columns with OEO terms. I have already built a prototype; however, the performance of the final product depends on the quality of the descriptions.
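
To make the idea concrete, here is a minimal sketch of such a recommender. It is not the prototype mentioned above: it assumes plain string similarity from Python's standard `difflib` as a stand-in for whatever matching the real engine uses, and the names `OEO_TERMS`, `normalise`, `word_ngrams` and `recommend` are hypothetical. The term list is only a small excerpt of the label/ID pairs from the table in this issue.

```python
from difflib import SequenceMatcher

# Excerpt of OEO labels and class IDs copied from the table in this issue.
# A real engine would load all labels from the ontology (assumption).
OEO_TERMS = {
    "scenario": "OEO_00000364",
    "constraint": "OEO_00000104",
    "sector": "OEO_00000367",
    "technology": "OEO_00000407",
    "energy carrier": "OEO_00020039",
    "greenhouse gas": "OEO_00000020",
    "scenario year": "OEO_00020097",
    "fuel": "OEO_00000173",
}


def normalise(text):
    """Lower-case, turn underscores into spaces, collapse whitespace."""
    return " ".join(text.lower().replace("_", " ").split())


def word_ngrams(text, n):
    """Contiguous n-word windows of text (the whole text if it is shorter)."""
    words = text.split()
    if len(words) <= n:
        return [" ".join(words)]
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def similarity(query, label):
    """Best difflib ratio between the label and any same-length word window."""
    n = len(label.split())
    return max(SequenceMatcher(None, gram, label).ratio()
               for gram in word_ngrams(query, n))


def recommend(column_name, description="", top_n=3, min_score=0.5):
    """Return up to top_n (label, class_id, score) candidates, best first."""
    queries = [normalise(column_name)]
    if description:
        queries.append(normalise(description))
    scored = []
    for label, class_id in OEO_TERMS.items():
        score = max(similarity(q, label) for q in queries)
        if score >= min_score:
            scored.append((label, class_id, round(score, 2)))
    scored.sort(key=lambda candidate: candidate[2], reverse=True)
    return scored[:top_n]


# The column name alone is enough when it closely resembles an OEO label.
print(recommend("energy_carrier"))
# An opaque column name only becomes mappable once a description is given.
print(recommend("ghg", description="Type of greenhouse gas emitted"))
```

In this sketch the second call only finds "greenhouse gas" because of the description, which illustrates why the quality of the descriptions bounds the quality of the recommendations. Abbreviations with no lexical overlap (for example `wacc` or `ry`) would need richer descriptions or a semantic matching method.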