openego / powerd-data

GNU Affero General Public License v3.0

electricity_demand_timeseries.hh_profiles.houseprofiles-in-census-cells fails in current run for entire Germany #43

Closed ulfmueller closed 1 year ago

ulfmueller commented 1 year ago

the log:

[2023-05-16 13:19:12,627] {taskinstance.py:901} INFO - Executing <Task(Household Demands (versioned)): electricity_demand_timeseries.hh_profiles.houseprofiles-in-census-cells> on 2023-05-15T09:30:48.901072+00:00
[2023-05-16 13:19:12,637] {standard_task_runner.py:54} INFO - Started process 49069 to run task
[2023-05-16 13:19:12,729] {standard_task_runner.py:77} INFO - Running: ['airflow', 'run', 'powerd-status-quo-processing-pipeline', 'electricity_demand_timeseries.hh_profiles.houseprofiles-in-census-cells', '2023-05-15T09:30:48.901072+00:00', '--job_id', '179', '--pool', 'default_pool', '--raw', '-sd', 'DAGS_FOLDER/dags/pipeline_status_quo.py', '--cfg_path', '/tmp/tmpfh96ja1q']
[2023-05-16 13:19:12,731] {standard_task_runner.py:78} INFO - Job 179: Subtask electricity_demand_timeseries.hh_profiles.houseprofiles-in-census-cells
[2023-05-16 13:19:12,781] {logging_mixin.py:120} INFO - Running <TaskInstance: powerd-status-quo-processing-pipeline.electricity_demand_timeseries.hh_profiles.houseprofiles-in-census-cells 2023-05-15T09:30:48.901072+00:00 [running]> on host at36
[2023-05-16 16:45:56,443] {taskinstance.py:1150} ERROR - (2050, 'DE111')
Traceback (most recent call last):
  File "/home/powerd/powerd-data/venv/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3361, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 2131, in pandas._libs.hashtable.Int64HashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 2140, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 2050

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "pandas/_libs/index.pyx", line 717, in pandas._libs.index.BaseMultiIndexCodesEngine.get_loc
  File "/home/powerd/powerd-data/venv/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3363, in get_loc
    raise KeyError(key) from err
KeyError: 2050

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/powerd/powerd-data/venv/lib/python3.8/site-packages/pandas/core/indexes/multi.py", line 3080, in _get_loc_level
    return (self._engine.get_loc(key), None)
  File "pandas/_libs/index.pyx", line 720, in pandas._libs.index.BaseMultiIndexCodesEngine.get_loc
KeyError: (2050, 'DE111')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/powerd/powerd-data/venv/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 984, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/home/powerd/powerd-data/powerd-data/src/egon/data/datasets/__init__.py", line 194, in skip_task
    result = super(type(task), task).execute(*xs, *ks)
  File "/home/powerd/powerd-data/venv/lib/python3.8/site-packages/airflow/operators/python_operator.py", line 113, in execute
    return_value = self.execute_callable()
  File "/home/powerd/powerd-data/venv/lib/python3.8/site-packages/airflow/operators/python_operator.py", line 118, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/home/powerd/powerd-data/powerd-data/src/egon/data/datasets/electricity_demand_timeseries/hh_profiles.py", line 1541, in houseprofiles_in_census_cells
    df_hh_profiles_in_census_cells = adjust_to_demand_regio_nuts3_annual(
  File "/home/powerd/powerd-data/powerd-data/src/egon/data/datasets/electricity_demand_timeseries/hh_profiles.py", line 1351, in adjust_to_demand_regio_nuts3_annual
    df_demand_regio.loc[(2050, nuts3_id), "demand_mwha"]
  File "/home/powerd/powerd-data/venv/lib/python3.8/site-packages/pandas/core/indexing.py", line 925, in __getitem__
    return self._getitem_tuple(key)
  File "/home/powerd/powerd-data/venv/lib/python3.8/site-packages/pandas/core/indexing.py", line 1100, in _getitem_tuple
    return self._getitem_lowerdim(tup)
  File "/home/powerd/powerd-data/venv/lib/python3.8/site-packages/pandas/core/indexing.py", line 822, in _getitem_lowerdim
    return self._getitem_nested_tuple(tup)
  File "/home/powerd/powerd-data/venv/lib/python3.8/site-packages/pandas/core/indexing.py", line 906, in _getitem_nested_tuple
    obj = getattr(obj, self.name)._getitem_axis(key, axis=axis)
  File "/home/powerd/powerd-data/venv/lib/python3.8/site-packages/pandas/core/indexing.py", line 1164, in _getitem_axis
    return self._get_label(key, axis=axis)
  File "/home/powerd/powerd-data/venv/lib/python3.8/site-packages/pandas/core/indexing.py", line 1113, in _get_label
    return self.obj.xs(label, axis=axis)
  File "/home/powerd/powerd-data/venv/lib/python3.8/site-packages/pandas/core/generic.py", line 3770, in xs
    loc, new_index = index._get_loc_level(
  File "/home/powerd/powerd-data/venv/lib/python3.8/site-packages/pandas/core/indexes/multi.py", line 3082, in _get_loc_level
    raise KeyError(key) from e
KeyError: (2050, 'DE111')
[2023-05-16 16:45:56,484] {taskinstance.py:1187} INFO - Marking task as FAILED. dag_id=powerd-status-quo-processing-pipeline, task_id=electricity_demand_timeseries.hh_profiles.houseprofiles-in-census-cells, execution_date=20230515T093048, start_date=20230516T111912, end_date=20230516T144556
[2023-05-16 16:45:56,607] {local_task_job.py:156} WARNING - State of this instance has been externally set to failed. Taking the poison pill.
[2023-05-16 16:45:56,628] {helpers.py:325} INFO - Sending Signals.SIGTERM to GPID 49069
[2023-05-16 16:45:57,642] {helpers.py:291} INFO - Process psutil.Process(pid=49069, status='terminated', exitcode=1, started='13:19:12') (49069) terminated with exit code 1
[2023-05-16 16:45:57,643] {local_task_job.py:102} INFO - Task exited with return code 1

ClaraBuettner commented 1 year ago

I guess this is related to the partial use of the "scenarios" parameter: data from demandregio is not selected for eGon100RE, but this dataset still tries to create data for all scenarios. I could try to fix it manually on the server, or we wait until #39 is done. What do you prefer?
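To illustrate the failure mode described above, here is a minimal sketch that uses a plain dict keyed by `(year, nuts3_id)` as a stand-in for the MultiIndex of `df_demand_regio`. The scenario-to-year mapping, the demand values, and the helper names `scenarios_with_data` and `annual_demand` are all assumptions made for illustration; they are not taken from the repository and this is not the fix applied in #49.

```python
# Stand-in for df_demand_regio: keys mimic the (year, nuts3_id) MultiIndex.
# All demand values are made up for illustration.
demand_mwha = {
    (2035, "DE111"): 1000.0,
    (2035, "DE112"): 800.0,
}

# Hypothetical scenario-to-year mapping (an assumption, not from the repo).
SCENARIO_YEARS = {"eGon2035": 2035, "eGon100RE": 2050}


def scenarios_with_data(scenarios, demand):
    """Return only the scenarios whose year appears in the demand keys."""
    available_years = {year for year, _ in demand}
    return [s for s in scenarios if SCENARIO_YEARS[s] in available_years]


def annual_demand(scenario, nuts3_id, demand):
    """Look up the annual demand, failing with a descriptive message."""
    key = (SCENARIO_YEARS[scenario], nuts3_id)
    try:
        return demand[key]
    except KeyError as err:
        raise KeyError(
            f"no demand data for scenario {scenario!r}, key {key}"
        ) from err


# An unguarded lookup fails the same way as the task in the log:
# the year 2050 was never loaded, so the tuple key is missing.
try:
    demand_mwha[(2050, "DE111")]
except KeyError as err:
    missing_key = err.args[0]

# Restricting the loop to scenarios that actually have data avoids the error.
usable = scenarios_with_data(["eGon2035", "eGon100RE"], demand_mwha)
```

Filtering the scenario list up front keeps the per-region loop simple, while the descriptive `KeyError` in `annual_demand` makes a mismatch between selected and loaded scenarios visible early instead of after hours of runtime.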

ulfmueller commented 1 year ago

hot fix: :-1: cold fix: :+1:

ClaraBuettner commented 1 year ago

Fixed in #49