CartoDB / bigmetadata

Canada 2016 fails #597

Closed (javitonino closed this issue 5 years ago)

javitonino commented 5 years ago
2018-11-05 23:15:48,204 [ERROR]: [pid 575] Worker Worker(salt=249140680, workers=1, host=37c513a3ad3d, username=root, pid=575) failed    tasks.ca.statcan.census2016.data.ImportData(resolution=ct_, topic=t002, segment=all)
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/engine/base.py", line 1184, in _execute_context
    context)
  File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/engine/default.py", line 462, in do_execute
    cursor.execute(statement, parameters)
psycopg2.ProgrammingError: function crosstab(unknown) does not exist
LINE 4:                FROM crosstab('SELECT geocode, profileid, tot...
                            ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/luigi/worker.py", line 203, in run
    new_deps = self._run_get_new_deps()
  File "/usr/local/lib/python3.5/dist-packages/luigi/worker.py", line 140, in _run_get_new_deps
    task_gen = self.task.run()
  File "/bigmetadata/tasks/ca/statcan/census2016/data.py", line 201, in run
    self._populate_from_copy()
  File "/bigmetadata/tasks/ca/statcan/census2016/data.py", line 193, in _populate_from_copy
    self._upsert_data(t_ids, 't')
  File "/bigmetadata/tasks/ca/statcan/census2016/data.py", line 176, in _upsert_data
    session.execute(stmt)
  File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/orm/session.py", line 1044, in execute
    bind, close_with_result=True).execute(clause, params or {})
  File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/engine/base.py", line 947, in execute
    return meth(self, multiparams, params)
  File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/sql/elements.py", line 262, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/engine/base.py", line 1055, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/engine/base.py", line 1191, in _execute_context
    context)
  File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/engine/base.py", line 1386, in _handle_dbapi_exception
    exc_info
  File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/util/compat.py", line 202, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/util/compat.py", line 185, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/engine/base.py", line 1184, in _execute_context
    context)
  File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/engine/default.py", line 462, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) function crosstab(unknown) does not exist
LINE 4:                FROM crosstab('SELECT geocode, profileid, tot...
                            ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.

javitonino commented 5 years ago

Retrying after manually installing the tablefunc extension, which provides crosstab. Let's see.
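
For anyone hitting the same `function crosstab(unknown) does not exist` error: `crosstab` ships with PostgreSQL's `tablefunc` extension, so the manual step presumably amounts to something like the sketch below. It assumes a DB-API cursor (for example one from psycopg2) and a database role that is allowed to create extensions. `ensure_tablefunc` is a hypothetical helper name, not something that exists in bigmetadata.

```python
# Hypothetical helper: make sure crosstab() is available before the import runs.
TABLEFUNC_SQL = "CREATE EXTENSION IF NOT EXISTS tablefunc"


def ensure_tablefunc(cursor):
    """Install the tablefunc extension (which provides crosstab) if it is missing.

    `cursor` is assumed to be any DB-API 2.0 cursor connected to the target
    PostgreSQL database; the caller is responsible for committing.
    """
    cursor.execute(TABLEFUNC_SQL)
```

With psycopg2 this would be called as `ensure_tablefunc(conn.cursor())` followed by `conn.commit()`. The `IF NOT EXISTS` clause makes the statement safe to run repeatedly.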

javitonino commented 5 years ago

Now it fails because some columns contain only NULL values. Failing task: tasks.ca.statcan.census2016.data.CensusData(resolution=da_, topic=t006, segment=all)

2018-11-06 16:18:40,986 [ERROR]: [pid 104] Worker Worker(salt=857882226, workers=1, host=f5f4362acb91, username=root, pid=21) failed    tasks.ca.statcan.census2016.data.CensusData(resolution=da_, topic=t006, segment=all)
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/luigi/worker.py", line 203, in run
    new_deps = self._run_get_new_deps()
  File "/usr/local/lib/python3.5/dist-packages/luigi/worker.py", line 140, in _run_get_new_deps
    task_gen = self.task.run()
  File "/bigmetadata/tasks/base_tasks.py", line 941, in run
    self.check_null_columns()
  File "/bigmetadata/tasks/base_tasks.py", line 1013, in check_null_columns
    table=self.output().table, columns=', '.join([x[0] for x in result])))
ValueError: The following columns of the table "observatory.obs_4b70f1f3f1bc8a6f7390ee58309949df402b47c8" contain only NULL values: c0661_f, c0661_m, c0662_t, c0662_f, c0662_m, c0663_t, c0663_f, c0663_m, c0664_f, c0664_m, c0665_f, c0665_m, c0666_f, c0666_m, c0667_f, c0667_m, c0703_f, c0703_m, c0704_f, c0704_m, c0705_f, c0705_m, c0706_f, c0706_m, c0707_f, c0707_m, c0708_f, c0708_m, c0709_f, c0709_m, c0710_f, c0710_m, c0712_f, c0712_m, c0713_f, c0713_m, c0714_f, c0714_m, c0715_f, c0715_m, c0716_f, c0716_m, c0717_f, c0717_m, c0718_f, c0718_m, c0719_f, c0719_m, c0720_f, c0720_m, c0721_f, c0721_m, c0732_f, c0732_m, c0733_f, c0733_m, c0734_f, c0734_m, c0735_f, c0735_m, c0736_f, c0736_m, c0737_f, c0737_m, c0738_f, c0738_m, c0739_f, c0739_m, c0740_f, c0740_m, c0828_f, c0828_m, c0829_f, c0829_m, c0830_f, c0830_m, c0831_f, c0831_m, c0832_f, c0832_m, c0833_f, c0833_m, c0834_f, c0834_m, c0835_f, c0835_m, c0836_f, c0836_m, c0837_f, c0837_m, c0838_f, c0838_m, c0839_f, c0839_m, c0840_f, c0840_m, c0841_f, c0841_m, c0843_f, c0843_m, c0844_f, c0844_m, c0845_f, c0845_m, c0846_f, c0846_m, c0847_f, c0847_m, c0848_f, c0848_m, c0849_f, c0849_m, c0850_f, c0850_m
2018-11-06 16:18:40,988 [INFO]: rollback tasks.ca.statcan.census2016.data.CensusData_da__all_t006_1138ddeb5e: The following columns of the table "observatory.obs_4b70f1f3f1bc8a6f7390ee58309949df402b47c8" contain only NULL values: c0661_f, c0661_m, c0662_t, c0662_f, c0662_m, c0663_t, c0663_f, c0663_m, c0664_f, c0664_m, c0665_f, c0665_m, c0666_f, c0666_m, c0667_f, c0667_m, c0703_f, c0703_m, c0704_f, c0704_m, c0705_f, c0705_m, c0706_f, c0706_m, c0707_f, c0707_m, c0708_f, c0708_m, c0709_f, c0709_m, c0710_f, c0710_m, c0712_f, c0712_m, c0713_f, c0713_m, c0714_f, c0714_m, c0715_f, c0715_m, c0716_f, c0716_m, c0717_f, c0717_m, c0718_f, c0718_m, c0719_f, c0719_m, c0720_f, c0720_m, c0721_f, c0721_m, c0732_f, c0732_m, c0733_f, c0733_m, c0734_f, c0734_m, c0735_f, c0735_m, c0736_f, c0736_m, c0737_f, c0737_m, c0738_f, c0738_m, c0739_f, c0739_m, c0740_f, c0740_m, c0828_f, c0828_m, c0829_f, c0829_m, c0830_f, c0830_m, c0831_f, c0831_m, c0832_f, c0832_m, c0833_f, c0833_m, c0834_f, c0834_m, c0835_f, c0835_m, c0836_f, c0836_m, c0837_f, c0837_m, c0838_f, c0838_m, c0839_f, c0839_m, c0840_f, c0840_m, c0841_f, c0841_m, c0843_f, c0843_m, c0844_f, c0844_m, c0845_f, c0845_m, c0846_f, c0846_m, c0847_f, c0847_m, c0848_f, c0848_m, c0849_f, c0849_m, c0850_f, c0850_m
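
The `ValueError` above comes from a check that rejects tables with columns holding only NULLs. To reproduce that kind of check by hand, a minimal sketch is shown below. It is not the actual `check_null_columns` code from `base_tasks.py`; it uses an in-memory SQLite database as a stand-in for PostgreSQL, and the table `obs_demo` and the helper `all_null_columns` are made up for illustration. The idea is that `COUNT(col)` ignores NULLs, so a count of 0 means the column has no values at all.

```python
import sqlite3


def all_null_columns(conn, table, columns):
    """Return the names in `columns` whose values in `table` are all NULL.

    Identifiers are interpolated directly, so only pass trusted names.
    """
    # COUNT(col) counts non-NULL values only; 0 means "nothing but NULLs".
    select_list = ", ".join('COUNT("{0}")'.format(c) for c in columns)
    query = 'SELECT {0} FROM "{1}"'.format(select_list, table)
    counts = conn.execute(query).fetchone()
    return [c for c, n in zip(columns, counts) if n == 0]


# Hypothetical demo table: totals are populated, female/male breakdowns are not.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE obs_demo (geom_id TEXT, c0661_t REAL, c0661_f REAL, c0661_m REAL)"
)
conn.executemany(
    "INSERT INTO obs_demo VALUES (?, ?, ?, ?)",
    [("a", 10.0, None, None), ("b", 12.0, None, None)],
)

null_cols = all_null_columns(conn, "obs_demo", ["c0661_t", "c0661_f", "c0661_m"])
print(null_cols)  # ['c0661_f', 'c0661_m']
```

Note that an empty table reports every column as all-NULL, which may or may not be what you want from such a check.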
juanignaciosl commented 5 years ago

Next error:

2018-11-12 15:07:11,992 [INFO]: rollback tasks.ca.statcan.census2016.data.CensusDBFromDA_all_t006_272927020f: (psycopg2.ProgrammingError) column "c0661_m" does not exist
LINE 7: ...                                           ROUND((c0661_m * ...
                                                             ^
HINT:  There is a column named "c0661_m" in table "obs_036b5c908ab14ef65a59981abd70629261a47067", but it cannot be referenced from this part of the query.

juanignaciosl commented 5 years ago

It works after the last fix.