openego / powerd-data

GNU Affero General Public License v3.0

finish-eGon100RE-fix-cts-buildings #244

Open CarlosEpia opened 1 month ago

CarlosEpia commented 1 month ago

The task electricity_demand_timeseries.cts_buildings.cts-buildings is failing with the error message:

Found local files:
* /home/powerd/egon100re-run/airflow/logs/dag_id=egon-data-processing-pipeline/run_id=manual__2024-05-03T07:42:22.499342+00:00/task_id=electricity_demand_timeseries.cts_buildings.cts-buildings/attempt=2.log
[2024-05-13, 06:40:40 UTC] {taskinstance.py:1159} INFO - Dependencies all met for dep_context=non-requeueable deps ti=<TaskInstance: egon-data-processing-pipeline.electricity_demand_timeseries.cts_buildings.cts-buildings manual__2024-05-03T07:42:22.499342+00:00 [queued]>
[2024-05-13, 06:40:40 UTC] {taskinstance.py:1159} INFO - Dependencies all met for dep_context=requeueable deps ti=<TaskInstance: egon-data-processing-pipeline.electricity_demand_timeseries.cts_buildings.cts-buildings manual__2024-05-03T07:42:22.499342+00:00 [queued]>
[2024-05-13, 06:40:40 UTC] {taskinstance.py:1361} INFO - Starting attempt 2 of 2
[2024-05-13, 06:40:40 UTC] {taskinstance.py:1382} INFO - Executing <Task(CtsDemandBuildings (versioned)): electricity_demand_timeseries.cts_buildings.cts-buildings> on 2024-05-03 07:42:22.499342+00:00
[2024-05-13, 06:40:40 UTC] {standard_task_runner.py:57} INFO - Started process 1751857 to run task
[2024-05-13, 06:40:40 UTC] {standard_task_runner.py:84} INFO - Running: ['airflow', 'tasks', 'run', 'egon-***-processing-pipeline', 'electricity_demand_timeseries.cts_buildings.cts-buildings', 'manual__2024-05-03T07:42:22.499342+00:00', '--job-id', '232', '--raw', '--subdir', 'DAGS_FOLDER/dags/pipeline.py', '--cfg-path', '/tmp/tmpyqltlk1z']
[2024-05-13, 06:40:40 UTC] {standard_task_runner.py:85} INFO - Job 232: Subtask electricity_demand_timeseries.cts_buildings.cts-buildings
[2024-05-13, 06:40:40 UTC] {task_command.py:416} INFO - Running <TaskInstance: egon-data-processing-pipeline.electricity_demand_timeseries.cts_buildings.cts-buildings manual__2024-05-03T07:42:22.499342+00:00 [running]> on host at31
[2024-05-13, 06:40:40 UTC] {taskinstance.py:1662} INFO - Exporting env vars:
AIRFLOW_CTX_DAG_OWNER='airflowstatsd_on = False'
AIRFLOW_CTX_DAG_ID='egon--processing-pipeline'
AIRFLOW_CTX_TASK_ID='electricity_demand_timeseries.cts_buildings.cts-buildings'
AIRFLOW_CTX_EXECUTION_DATE='2024-05-03T07:42:22.499342+00:00'
AIRFLOW_CTX_TRY_NUMBER='2'
AIRFLOW_CTX_DAG_RUN_ID='manual__2024-05-03T07:42:22.499342+00:00'
[2024-05-13, 06:40:40 UTC] {logging_mixin.py:154} WARNING - 2024-05-13 08:40:40.696 | INFO | egon..sets.electricity_demand_timeseries.cts_buildings:cts_buildings:1211 Start logging!
[2024-05-13, 06:41:45 UTC] {logging_mixin.py:154} WARNING - 2024-05-13 08:41:45.182 | INFO | egon..sets.electricity_demand_timeseries.cts_buildings:cts_buildings:1214 Buildings with amenities selected!
[2024-05-13, 06:41:45 UTC] {taskinstance.py:1937} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/home/powerd/egon100re-run/powerd-data/src/egon/data/datasets/__init__.py", line 216, in skip_task
    result = super(type(task), task).execute(*xs, **ks)
  File "/home/powerd/egon100re-run/venv/lib/python3.8/site-packages/airflow/operators/python.py", line 192, in execute
    return_value = self.execute_callable()
  File "/home/powerd/egon100re-run/venv/lib/python3.8/site-packages/airflow/operators/python.py", line 209, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/home/powerd/egon100re-run/powerd-data/src/egon/data/datasets/electricity_demand_timeseries/cts_buildings.py", line 1217, in cts_buildings
    median_n_amenities = int(
ValueError: cannot convert float NaN to integer
[2024-05-13, 06:41:45 UTC] {taskinstance.py:1400} INFO - Marking task as FAILED. dag_id=egon--processing-pipeline, task_id=electricity_demand_timeseries.cts_buildings.cts-buildings, execution_date=20240503T074222, start_date=20240513T064040, end_date=20240513T064145
[2024-05-13, 06:41:45 UTC] {standard_task_runner.py:104} ERROR - Failed to execute job 232 for task electricity_demand_timeseries.cts_buildings.cts-buildings (cannot convert float NaN to integer; 1751857)
[2024-05-13, 06:41:45 UTC] {local_task_job_runner.py:228} INFO - Task exited with return code 1
[2024-05-13, 06:41:45 UTC] {logging_mixin.py:154} WARNING - /home/powerd/egon100re-run/venv/lib/python3.8/site-packages/airflow/models/baseoperator.py:1203 AirflowProviderDeprecationWarning: Call to deprecated class PostgresOperator. (Please use airflow.providers.common.sql.operators.sql.SQLExecuteQueryOperator.Also, you can provide hook_params={'schema': <***base>}.)
[2024-05-13, 06:41:45 UTC] {taskinstance.py:2778} INFO - 0 downstream tasks scheduled from follow-on schedule check

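The problem is related to a function that looks explicitly for status2019 scenario data. In an eGon100RE run that selection is presumably empty, so the median amenity count evaluates to NaN and the int() conversion at cts_buildings.py line 1217 raises the ValueError shown in the traceback.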
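To illustrate the failure mode, here is a minimal standard-library sketch. It does not reproduce the actual code in cts_buildings.py: the row layout, the `ROWS` sample data and the function names `median_n_amenities` and `nan_to_int_fails` are all hypothetical. It assumes the median is taken over rows filtered by scenario name, and shows one possible guard: pass the scenario in as a parameter and fail with a clear message when the selection is empty.

```python
import math
from statistics import median

# Hypothetical sample rows: (scenario name, amenities per building).
# These values are made up for illustration and are not eGon data.
ROWS = [
    ("eGon100RE", 3),
    ("eGon100RE", 1),
    ("eGon100RE", 2),
]


def median_n_amenities(rows, scenario):
    """Return the integer median amenity count for the given scenario.

    The scenario is a parameter instead of a hard-coded name, and an
    empty selection raises a descriptive error instead of yielding NaN.
    """
    counts = [n for scn, n in rows if scn == scenario]
    if not counts:
        raise ValueError(f"no amenity rows found for scenario {scenario!r}")
    return int(median(counts))


def nan_to_int_fails():
    """Return the message Python raises when int() is given NaN."""
    try:
        int(math.nan)
    except ValueError as exc:
        return str(exc)
    return None
```

With this guard, a run that only contains eGon100RE rows would stop with a message naming the missing scenario, rather than with the less informative NaN conversion error seen in the log.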