ansible / awx

AWX provides a web-based user interface, REST API, and task engine built on top of Ansible. It is one of the upstream projects for Red Hat Ansible Automation Platform.

DB migration exception when upgrading from v8 to v11.2 #7307

Open benapetr opened 4 years ago

benapetr commented 4 years ago
ISSUE TYPE
SUMMARY
ENVIRONMENT
STEPS TO REPRODUCE

Install 11.2.0 over 8.0.0 - I don't think this can be easily reproduced

EXPECTED RESULTS

Flawless update

ACTUAL RESULTS

This exception appears in the awx_task container:

2020-06-09 20:55:33,374 DEBUG    rbac_migrations Migrating workflowapproval to new organization field
2020-06-09 20:55:33,374 DEBUG    rbac_migrations Class workflowapproval has no organization migration
2020-06-09 20:55:33,374 INFO     rbac_migrations Unified organization migration completed in 1.4625 seconds
2020-06-09 20:55:35,445 DEBUG    rbac_migrations Removed from parents of roles {148, 149} of JobTemplate object (28)
2020-06-09 20:55:35,677 DEBUG    rbac_migrations Removed from parents of roles {344, 345} of JobTemplate object (53)
2020-06-09 20:55:36,209 DEBUG    rbac_migrations Removed from parents of roles {765, 766} of JobTemplate object (126)
2020-06-09 20:55:36,663 DEBUG    rbac_migrations Removed from parents of roles {2150, 2151} of JobTemplate object (245)
2020-06-09 20:55:37,789 DEBUG    rbac_migrations No changes to role parents for 171 resources
2020-06-09 20:55:37,789 DEBUG    rbac_migrations Added parents to 0 roles
2020-06-09 20:55:37,790 DEBUG    rbac_migrations Removed parents from 8 roles
2020-06-09 20:55:37,790 INFO     rbac_migrations Updated implicit parents of 4 resources
2020-06-09 20:55:37,790 INFO     rbac_migrations Rebuild parentage completed in 2.555782 seconds
2020-06-09 20:55:38,008 DEBUG    rbac_migrations Setting admin_role on jt 28 for users [47, 26, 23, 37, 16, 24, 190] via inventory.organization 2
2020-06-09 20:55:38,023 DEBUG    rbac_migrations Setting admin_role on jt 53 for users [47, 26, 23, 37, 24, 190] via inventory.organization 2
2020-06-09 20:55:38,038 DEBUG    rbac_migrations Setting admin_role on jt 126 for users [47, 26, 23, 37, 16, 24, 190] via inventory.organization 2
2020-06-09 20:55:38,052 DEBUG    rbac_migrations Setting admin_role on jt 245 for users [47, 26, 23, 37, 24, 190] via inventory.organization 2
2020-06-09 20:55:38,059 INFO     rbac_migrations Added explicit JT permission for 26 users in 0.0690 seconds
2020-06-09 20:55:39,135 INFO     awx.main.migrations Automatically created uuid4 identifier for 91 workflow nodes
Operations to perform:
  Apply all migrations: auth, conf, contenttypes, main, oauth2_provider, sessions, sites, social_django, sso, taggit
Running migrations:
  Applying main.0099_v361_license_cleanup... OK
  Applying main.0100_v370_projectupdate_job_tags... OK
  Applying main.0101_v370_generate_new_uuids_for_iso_nodes... OK
  Applying main.0102_v370_unifiedjob_canceled... OK
  Applying main.0103_v370_remove_computed_fields... OK
  Applying main.0104_v370_cleanup_old_scan_jts... OK
  Applying main.0105_v370_remove_jobevent_parent_and_hosts... OK
  Applying main.0106_v370_remove_inventory_groups_with_active_failures... OK
  Applying main.0107_v370_workflow_convergence_api_toggle... OK
  Applying main.0108_v370_unifiedjob_dependencies_processed... OK
  Applying main.0109_v370_job_template_organization_field... OK
  Applying main.0110_v370_instance_ip_address... OK
  Applying main.0111_v370_delete_channelgroup... OK
  Applying main.0112_v370_workflow_node_identifier... OK
  Applying main.0113_v370_event_bigint... OK
  Applying main.0114_v370_remove_deprecated_manual_inventory_sources... OK
  Applying taggit.0003_taggeditem_add_unique_index... OK
Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "auth_user_username_key"
DETAIL:  Key (username)=(admin) already exists.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/usr/bin/awx-manage", line 8, in <module>
    sys.exit(manage())
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/__init__.py", line 152, in manage
    execute_from_command_line(sys.argv)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/core/management/__init__.py", line 381, in execute_from_command_line
    utility.execute()
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/core/management/__init__.py", line 375, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/core/management/base.py", line 323, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/core/management/base.py", line 364, in execute
    output = self.handle(*args, **options)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/core/management/commands/shell.py", line 92, in handle
    exec(sys.stdin.read())
  File "<string>", line 1, in <module>
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/contrib/auth/models.py", line 162, in create_superuser
    return self._create_user(username, email, password, **extra_fields)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/contrib/auth/models.py", line 145, in _create_user
    user.save(using=self._db)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/contrib/auth/base_user.py", line 66, in save
    super().save(*args, **kwargs)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/db/models/base.py", line 741, in save
    force_update=force_update, update_fields=update_fields)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/db/models/base.py", line 779, in save_base
    force_update, using, update_fields,
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/db/models/base.py", line 870, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/db/models/base.py", line 908, in _do_insert
    using=using, raw=raw)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/db/models/manager.py", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/db/models/query.py", line 1186, in _insert
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 1375, in execute_sql
    cursor.execute(sql, params)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/db/backends/utils.py", line 76, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/db/utils.py", line 89, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
django.db.utils.IntegrityError: duplicate key value violates unique constraint "auth_user_username_key"
DETAIL:  Key (username)=(admin) already exists.
An organization is already in the system, exiting.
(changed: False)
Instance already registered awx
Instance Group already registered tower
2020-06-09 20:56:00,723 CRIT Supervisor is running as root.  Privileges were not dropped because no user is specified in the config file.  If you intend to run as root, you can set user=root in the config file to avoid this message.
2020-06-09 20:56:00,728 INFO RPC interface 'supervisor' initialized
ADDITIONAL INFORMATION

AWX did start up, however, and was probably operational to a certain degree. I don't know whether it was safe to continue with a DB that clearly failed to upgrade, so I rolled back to the VM snapshot.

I would love some feedback on this exception: can it be ignored, or is it something to fix? I also noted that after the upgrade there was massive SQL traffic, about 200MB/s of writes to disk caused by some massive inserts and deletes, which lasted until I killed the VM. SQL storage grew by a few GB.

ryanpetrello commented 4 years ago

@benapetr,

Interesting. This doesn't look like a failure to run migrations, it looks like awx-manage createsuperuser and awx-manage create_preload_data are running (when they don't need to).

This could be a bug in the installer? Given the warnings above, though, if your install is working, it's probably harmless.
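To illustrate why the traceback is most likely noise rather than a failed migration, here is a minimal sketch of the failure mode using Python's built-in sqlite3 module. It is not the AWX installer's code: the one-column `auth_user` table is a simplified stand-in for Django's real table, and `create_admin_if_missing` is a hypothetical helper. An unguarded insert of an existing username trips the unique constraint, while a guarded one simply does nothing.

```python
import sqlite3


def create_admin_if_missing(conn, username):
    """Insert the user only when no row with that username exists.

    Returns True if a row was created, False if it was already present.
    (Hypothetical helper for illustration, not part of AWX or Django.)
    """
    row = conn.execute(
        "SELECT 1 FROM auth_user WHERE username = ?", (username,)
    ).fetchone()
    if row is not None:
        return False
    conn.execute("INSERT INTO auth_user (username) VALUES (?)", (username,))
    conn.commit()
    return True


def demo():
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in for Django's auth_user table: just a unique username.
    conn.execute(
        "CREATE TABLE auth_user (id INTEGER PRIMARY KEY, username TEXT UNIQUE)"
    )
    first = create_admin_if_missing(conn, "admin")

    # An unguarded second insert is the analogue of the error in the log.
    try:
        conn.execute("INSERT INTO auth_user (username) VALUES (?)", ("admin",))
        unguarded_failed = False
    except sqlite3.IntegrityError:
        unguarded_failed = True

    second = create_admin_if_missing(conn, "admin")
    count = conn.execute("SELECT COUNT(*) FROM auth_user").fetchone()[0]
    return first, unguarded_failed, second, count
```

In both cases the existing admin row is left untouched, which is consistent with the error being harmless on an install that already has its admin user.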

I also noted that after upgrade there was massive SQL traffic, about 200MB/s writes to disks caused by some massive inserts and deletes, which kept lasting until I killed the VM. SQL storage grew by few GB.

We recently added code that auto-migrates certain tables to use a bigint as their primary key, and this happens automatically on upgrade, so I expect this is what you were seeing:

https://github.com/ansible/awx/issues/6010 https://github.com/ansible/awx/pull/6032

If you've got a larger dataset (lots of historical job output), it may take a while post-upgrade for the migration to finish.
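This thread does not show how the bigint migration is implemented, so the following is only a sketch of one common pattern that would produce sustained inserts and deletes: rows are copied from an old table into a new one in chunks, and each chunk is deleted from the old table once copied. The table names `_old_events` and `events`, the function name, and the chunk size are all hypothetical. SQLite stands in for PostgreSQL here, so no actual column type change is demonstrated.

```python
import sqlite3


def migrate_in_chunks(conn, chunk_size):
    """Move rows from _old_events to events, one transaction per chunk.

    Hypothetical illustration only. Keep chunk_size small here, since each
    id becomes a bound parameter and SQLite caps the number of parameters.
    """
    moved = 0
    while True:
        ids = [
            r[0]
            for r in conn.execute(
                "SELECT id FROM _old_events ORDER BY id DESC LIMIT ?",
                (chunk_size,),
            )
        ]
        if not ids:
            return moved
        marks = ",".join("?" * len(ids))
        with conn:  # commits on success, rolls back if either statement fails
            conn.execute(
                f"INSERT INTO events SELECT id, payload FROM _old_events "
                f"WHERE id IN ({marks})",
                ids,
            )
            conn.execute(f"DELETE FROM _old_events WHERE id IN ({marks})", ids)
        moved += len(ids)


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE _old_events (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO _old_events (id, payload) VALUES (?, ?)",
        [(i, f"event {i}") for i in range(1, 11)],
    )
    conn.commit()
    moved = migrate_in_chunks(conn, chunk_size=4)
    remaining = conn.execute("SELECT COUNT(*) FROM _old_events").fetchone()[0]
    copied = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    return moved, remaining, copied
```

With a pattern like this, every row is written once and deleted once, so temporary growth in disk usage and heavy write traffic would be expected until the old table is empty.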

HiramanSonawane commented 3 years ago

I am facing the same issue.

Operations to perform:
  Apply all migrations: auth, conf, contenttypes, main, oauth2_provider, sessions, sites, social_django, sso, taggit
Running migrations:
  No migrations to apply.
Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib64/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
psycopg2.IntegrityError: duplicate key value violates unique constraint "auth_user_username_key"
DETAIL:  Key (username)=(admin) already exists.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):


Because of this, my AWX instance is not working properly. I can log in to the AWX web portal, but it shows this error message: "Failed to get dashboard: 500 A server error has occurred."

Further down in the logs, I see another error:

2020-09-22 12:33:26,939 ERROR awx.main.tasks Failed to rebuild schedule Cleanup Job Schedule_t1_1_2020-09-27 16:02:38+00:00.
Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib64/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
psycopg2.ProgrammingError: relation "main_channelgroup" does not exist
LINE 1: ...oup"."group", "main_channelgroup"."channels" FROM "main_chan...

Any guidance on what the issue could be and how to resolve it?

roercik85 commented 3 years ago

I hit the same issue after updating the awx_task image.

GBBMER1 commented 2 years ago

Same issue. Has anyone managed to solve this?

rchaud commented 10 months ago

Howdy, you guys might need to do some version jumping instead: try going from AWX 8 to 9 to 11, etc.

I have a huge AWX deployment on version 9 which I managed to upgrade to version 23.2.0 without any issues. It did take me some time to find the right versions to jump to.