ansible / awx

AWX provides a web-based user interface, REST API, and task engine built on top of Ansible. It is one of the upstream projects for Red Hat Ansible Automation Platform.

"Cleanup Job Schedule" fails with external DB, schema other than public #10838

Open tobias-oe opened 3 years ago

tobias-oe commented 3 years ago

Summary

When AWX uses an external PostgreSQL database with a schema other than the default "public", the scheduled job "Cleanup Job Schedule" fails because its query hardcodes the table "public.main_jobevent". That table does not exist when another schema, e.g. "awx", is used.

Please honor the environment variables passed by the deployment scripts, so that deployments using a non-default schema name do not run into errors.
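
For illustration, here is a minimal sketch of how the partition lookup could avoid the fixed "public." prefix. The function names `quote_ident` and `partition_query` are hypothetical and not part of AWX, and the start of the SELECT is an assumption, since the traceback only shows the tail of the real query. When no schema is given, the table name is left unqualified, so PostgreSQL resolves the `regclass` cast through the connection's `search_path`.

```python
import re

# Conservative check: plain identifiers only, so the name can be embedded safely.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def quote_ident(name):
    """Double-quote a simple SQL identifier, rejecting anything unusual."""
    if not _IDENT.match(name):
        raise ValueError("unsupported identifier: %r" % (name,))
    return '"' + name + '"'


def partition_query(table, schema=None):
    """Build the child-partition lookup for `table` (hypothetical helper).

    With schema=None the name is unqualified and PostgreSQL looks it up
    via search_path; otherwise it is qualified with the given schema.
    """
    qualified = quote_ident(table)
    if schema is not None:
        qualified = quote_ident(schema) + "." + qualified
    # The SELECT list is an assumption; only the FROM/WHERE tail is in the traceback.
    return (
        "SELECT inhrelid::regclass::text AS child "
        "FROM pg_catalog.pg_inherits "
        "WHERE inhparent = '" + qualified + "'::regclass"
    )


if __name__ == "__main__":
    print(partition_query("main_jobevent"))
    print(partition_query("main_jobevent", schema="awx"))
```

Leaving the name unqualified relies on the database user's `search_path` already pointing at the AWX schema, which is presumably the case if the rest of AWX works against that schema.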

AWX version

19.2.2

Installation method

kubernetes

Modifications

no

Ansible version

No response

Operating system

No response

Web browser

No response

Steps to reproduce

You need an installation of AWX configured with an external postgres DB using a schema not named "public".

  1. Run the scheduled job "Cleanup Job Schedule"
  2. Check the output of this job
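
To confirm that an installation matches this precondition, one could check which schema actually holds the `main_jobevent` table. The sketch below is an assumption-laden helper (the name `find_table_schemas` is hypothetical); it accepts any DB-API cursor that uses `%s` placeholders, such as a psycopg2 cursor or Django's `connection.cursor()`.

```python
def find_table_schemas(cursor, table_name="main_jobevent"):
    """Return the names of all schemas that contain `table_name`.

    `cursor` is any DB-API cursor with %s-style parameters
    (for example psycopg2, or Django's connection.cursor()).
    """
    cursor.execute(
        "SELECT table_schema FROM information_schema.tables "
        "WHERE table_name = %s ORDER BY table_schema",
        (table_name,),
    )
    return [row[0] for row in cursor.fetchall()]
```

If the returned list does not contain "public", the cleanup job would be expected to hit the error described in this issue.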

Expected results

The cleanup job finishes successfully.

Actual results

The cleanup job fails with the following exception:

Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib64/python3.8/site-packages/django/db/backends/utils.py", line 82, in _execute
    return self.cursor.execute(sql)
psycopg2.errors.UndefinedTable: relation "public.main_jobevent" does not exist
LINE 1: ...ild FROM pg_catalog.pg_inherits WHERE inhparent = 'public.ma...
                                                             ^
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/usr/bin/awx-manage", line 8, in <module>
    sys.exit(manage())
  File "/var/lib/awx/venv/awx/lib64/python3.8/site-packages/awx/__init__.py", line 171, in manage
    execute_from_command_line(sys.argv)
  File "/var/lib/awx/venv/awx/lib64/python3.8/site-packages/django/core/management/__init__.py", line 381, in execute_from_command_line
    utility.execute()
  File "/var/lib/awx/venv/awx/lib64/python3.8/site-packages/django/core/management/__init__.py", line 375, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/var/lib/awx/venv/awx/lib64/python3.8/site-packages/django/core/management/base.py", line 323, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/var/lib/awx/venv/awx/lib64/python3.8/site-packages/django/core/management/base.py", line 364, in execute
    output = self.handle(*args, **options)
  File "/usr/lib64/python3.8/contextlib.py", line 75, in inner
    return func(*args, **kwds)
  File "/var/lib/awx/venv/awx/lib64/python3.8/site-packages/awx/main/management/commands/cleanup_jobs.py", line 415, in handle
    skipped_partition, deleted_partition = func()
  File "/var/lib/awx/venv/awx/lib64/python3.8/site-packages/awx/main/management/commands/cleanup_jobs.py", line 174, in cleanup_jobs_partition
    return self.cleanup(Job)
  File "/var/lib/awx/venv/awx/lib64/python3.8/site-packages/awx/main/management/commands/cleanup_jobs.py", line 169, in cleanup
    skipped, deleted = delete_meta.delete()
  File "/var/lib/awx/venv/awx/lib64/python3.8/site-packages/awx/main/management/commands/cleanup_jobs.py", line 143, in delete
    self.find_partitions_to_drop()
  File "/var/lib/awx/venv/awx/lib64/python3.8/site-packages/awx/main/management/commands/cleanup_jobs.py", line 112, in find_partitions_to_drop
    cursor.execute(query)
  File "/var/lib/awx/venv/awx/lib64/python3.8/site-packages/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/var/lib/awx/venv/awx/lib64/python3.8/site-packages/django/db/backends/utils.py", line 76, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/var/lib/awx/venv/awx/lib64/python3.8/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/var/lib/awx/venv/awx/lib64/python3.8/site-packages/django/db/utils.py", line 89, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/var/lib/awx/venv/awx/lib64/python3.8/site-packages/django/db/backends/utils.py", line 82, in _execute
    return self.cursor.execute(sql)
django.db.utils.ProgrammingError: relation "public.main_jobevent" does not exist
LINE 1: ...ild FROM pg_catalog.pg_inherits WHERE inhparent = 'public.ma...
                                                             ^

Additional information

No response

shanemcd commented 3 years ago

If we are hardcoding the public schema name anywhere in the AWX code, we should stop that.

shanemcd commented 3 years ago

https://github.com/ansible/awx/blob/28709d7bd9f91a302490687b025f36e840842b51/awx/main/management/commands/cleanup_jobs.py#L108

kukacz commented 2 years ago

This bug has caused us a lot of trouble: our external database frequently runs out of space. Is anybody working on a fix?

Also, I wonder whether there is a better workaround for an awx-operator driven K8s deployment than manually deleting Job and UnifiedJob objects through awx-manage. We currently follow these steps: https://tobschall.de/2019/05/07/ansible-tower-cleanup/. That is very slow for the number of jobs we are cleaning up. I thought about manually patching cleanup_jobs.py in our container, but that is not possible due to restricted permissions.

kukacz commented 2 years ago

So the workaround seems to be to drop the existing public schema and then rename the database schema to "public":

postgres=# \c awxtest;
awxtest=# DROP SCHEMA public CASCADE;
awxtest=# ALTER SCHEMA awxuser RENAME TO public;

kenichimilo commented 2 years ago

Hi @shanemcd, I believe this bug has been present since moving up from 18.0; any deployment using an external DB whose schema is not "public" will have this issue. All the other jobs are handled fine. Could the hardcoded schema be adapted and a new build released? I just upgraded to 19.5.1 and the bug is still there. Thanks.

kimbernator commented 2 years ago

Any progress on this? Would definitely be helpful for us as well.