PostHog / posthog

🦔 PostHog provides open-source product analytics, session recording, feature flagging and A/B testing that you can self-host.
https://posthog.com

Error upgrading hobby deploy #10694

Open mariusandra opened 2 years ago

mariusandra commented 2 years ago

Bug description

Users have recently reported this error when trying to upgrade their PostHog hobby deploy instance:

infi.clickhouse_orm.database.ServerError: Code: 60. DB::Exception: Table posthog.infi_clickhouse_orm_migrations doesn't exist. (UNKNOWN_TABLE) (version 22.3.7.28 (official build))

I'm not exactly sure where it comes from. Changes in https://github.com/PostHog/infi.clickhouse_orm perhaps? This? 🤷

I also saw this locally: one day, after another upgrade of master, I couldn't run ClickHouse migrations anymore and saw that error. I then wiped my environment, and now I can't reproduce it.
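To confirm whether the table named in the error is really missing, something like the following could help. It is only a minimal sketch: it assumes ClickHouse's HTTP interface is reachable on its default port 8123 at localhost, and the helper names `http_query` and `table_exists` are made up for illustration (they are not part of PostHog or infi.clickhouse_orm).

```python
import urllib.request
from typing import Callable

# Table name copied from the error message above.
MIGRATIONS_TABLE = "posthog.infi_clickhouse_orm_migrations"


def http_query(sql: str, base_url: str = "http://localhost:8123/") -> str:
    """Send one statement to ClickHouse's HTTP interface and return the raw body.

    The base URL is an assumption; adjust host and port for your deployment.
    """
    with urllib.request.urlopen(base_url, data=sql.encode("utf-8"), timeout=10) as resp:
        return resp.read().decode("utf-8")


def table_exists(table: str, run_query: Callable[[str], str] = http_query) -> bool:
    """Return True if ClickHouse reports that `table` exists."""
    # EXISTS TABLE returns a single row containing 1 or 0.
    return run_query(f"EXISTS TABLE {table}").strip() == "1"
```

Calling `table_exists(MIGRATIONS_TABLE)` against a running instance should tell you whether the migrations bookkeeping table is present before `migrate_clickhouse` is run.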

How to reproduce

I tried the following.

pip install -r requirements.txt
docker-compose -f docker-compose.dev.yml stop clickhouse kafka zookeeper
yes | docker-compose -f docker-compose.dev.yml rm -v clickhouse kafka zookeeper

git checkout b62a527f1ba936c001d8078286c636915e4a6405
docker-compose -f docker-compose.arm64.yml pull clickhouse kafka zookeeper
docker-compose -f docker-compose.arm64.yml create clickhouse kafka zookeeper
docker-compose -f docker-compose.arm64.yml start clickhouse kafka zookeeper
python manage.py migrate_clickhouse

git checkout master
pip install -r requirements.txt
python manage.py migrate_clickhouse

I used the git commit just before each of the last changes to the ORM library that went into posthog/posthog:

5c9dafb4e56854552499fd866b650e7f2795e522 refactor: clickhouse-orm, bump sha (#9975)
c9f0fee94ade69059d03fd48978e6b142e58a8e0 the one before

da9b9396ad05bf36d94ea8557bf969819ea1d5db chore(deps): Update `clickhouse_orm` for PostHog/infi.clickhouse_orm#9 (#9843)
b62a527f1ba936c001d8078286c636915e4a6405 the one before... used in the example above

But got different errors:

infi.clickhouse_orm.database.ServerError: Code: 57. DB::Exception: There was an error on [localhost:9000]: Code: 57. DB::Exception: Table default.kafka_person_distinct_id already exists. (TABLE_ALREADY_EXISTS) (version 22.3.7.28 (official build)). (TABLE_ALREADY_EXISTS) (version 22.3.7.28 (official build))
 (0)
infi.clickhouse_orm.database.ServerError: Code: 57. DB::Exception: There was an error on [localhost:9000]: Code: 57. DB::Exception: Table default.person_distinct_id_mv already exists. (TABLE_ALREADY_EXISTS) (version 22.3.8.39 (official build)). (TABLE_ALREADY_EXISTS) (version 22.3.8.39 (official build))
 (0)
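Since several different ClickHouse errors show up in this thread (codes 60, 57 and 210), here is a small sketch for pulling the numeric code and the symbolic name out of such a message when comparing logs. The function and type names are hypothetical, and the regular expressions only assume the message shape seen in the errors quoted here.

```python
import re
from typing import NamedTuple, Optional


class ClickHouseError(NamedTuple):
    code: int
    name: Optional[str]  # e.g. "TABLE_ALREADY_EXISTS"; None if the message has no name


_CODE_RE = re.compile(r"Code: (\d+)")
# Symbolic names appear in parentheses and are upper-case, unlike "(version ...)".
_NAME_RE = re.compile(r"\(([A-Z][A-Z0-9_]+)\)")


def parse_clickhouse_error(message: str) -> Optional[ClickHouseError]:
    """Extract the first error code and symbolic name from a ClickHouse error string."""
    code_match = _CODE_RE.search(message)
    if code_match is None:
        return None
    name_match = _NAME_RE.search(message)
    name = name_match.group(1) if name_match else None
    return ClickHouseError(int(code_match.group(1)), name)
```

Grouping log lines by the parsed code makes it easier to see that UNKNOWN_TABLE and TABLE_ALREADY_EXISTS are separate failures rather than variants of one.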

I'm not sure what's up, but leaving these breadcrumbs up for others to follow.

Environment

Additional context

Thank you for your bug report – we love squashing them!

kjcsb1 commented 2 years ago

I have experienced this across multiple self-hosted PostHog (ClickHouse-based) versions.

The behaviour for me is as follows:

  1. Install PostHog
  2. It runs normally
  3. Stop Docker
  4. Start Docker again, and the issue occurs

infi.clickhouse_orm.database.ServerError: Code: 60. DB::Exception: Table posthog.infi_clickhouse_orm_migrations doesn't exist. (UNKNOWN_TABLE) (version 22.3.6.5 (official build))

I have now attempted an upgrade. Note that the docker containers were not running at the time I ran the script:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/posthog/posthog/HEAD/bin/upgrade-hobby)"
...
Starting ubuntu_redis_1          ... done
Starting ubuntu_zookeeper_1      ... done
Starting ubuntu_db_1             ... done
Starting ubuntu_object_storage_1 ... done
Starting ubuntu_kafka_1          ... done
Starting ubuntu_clickhouse_1     ... done
Creating ubuntu_asyncmigrationscheck_run ... done

🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻
You indicated your instance is behind a proxy (IS_BEHIND_PROXY env var),
 but you haven't configured any trusted proxies. See
 https://posthog.com/docs/configuring-posthog/running-behind-proxy for details.
🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺

Failed to connect to clickhouse:9000 
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 318, in connect
    return self._init_connection(host, port)
  File "/usr/local/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 282, in _init_connection
    self.socket = self._create_socket(host, port)
  File "/usr/local/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 254, in _create_socket
    raise err
  File "/usr/local/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 245, in _create_socket
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
Traceback (most recent call last):
  File "manage.py", line 21, in <module>
    main()
  File "manage.py", line 17, in main
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python3.8/site-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python3.8/site-packages/django/core/management/__init__.py", line 395, in execute
    django.setup()
  File "/usr/local/lib/python3.8/site-packages/django/__init__.py", line 24, in setup
    apps.populate(settings.INSTALLED_APPS)
  File "/usr/local/lib/python3.8/site-packages/django/apps/registry.py", line 122, in populate
    app_config.ready()
  File "/home/posthog/code/posthog/apps.py", line 49, in ready
    in_range, version = service_version_requirement.is_service_in_accepted_version()
  File "/home/posthog/code/posthog/version_requirement.py", line 28, in is_service_in_accepted_version
    service_version = self.get_service_version()
  File "/home/posthog/code/posthog/version_requirement.py", line 36, in get_service_version
    return get_clickhouse_version()
  File "/home/posthog/code/posthog/version_requirement.py", line 58, in get_clickhouse_version
    rows = client.execute("SELECT version()")
  File "/usr/local/lib/python3.8/site-packages/clickhouse_driver/client.py", line 252, in execute
    self.connection.force_connect()
  File "/usr/local/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 212, in force_connect
    self.connect()
  File "/usr/local/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 339, in connect
    raise err
clickhouse_driver.errors.NetworkError: Code: 210. Connection refused (clickhouse:9000)
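The "Connection refused" above is what you would expect when nothing is listening on clickhouse:9000 yet. One way to avoid running the upgrade too early is to wait for the port first. This is only a sketch: `wait_for_port` is a hypothetical helper, not something the upgrade script provides, and a successful TCP connection only shows that the port is open, not that ClickHouse has finished starting.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if the connection cannot be made.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Covers ConnectionRefusedError, timeouts, and DNS lookup failures.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("clickhouse", 9000)` could be called from inside the Docker network before starting the migration commands.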

I then started the Docker containers:

...
worker_1                | AXES: BEGIN LOG
worker_1                | AXES: BEGIN LOG
worker_1                | AXES: Using django-axes version 5.9.0
worker_1                | AXES: Using django-axes version 5.9.0
worker_1                | AXES: blocking by IP only.
worker_1                | AXES: blocking by IP only.
worker_1                | List of clickhouse migrations to be applied:
worker_1                | Traceback (most recent call last):
worker_1                |   File "manage.py", line 21, in <module>
worker_1                |     main()
worker_1                |   File "manage.py", line 17, in main
worker_1                |     execute_from_command_line(sys.argv)
worker_1                |   File "/usr/local/lib/python3.8/site-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
worker_1                |     utility.execute()
worker_1                |   File "/usr/local/lib/python3.8/site-packages/django/core/management/__init__.py", line 413, in execute
worker_1                |     self.fetch_command(subcommand).run_from_argv(self.argv)
worker_1                |   File "/usr/local/lib/python3.8/site-packages/django/core/management/base.py", line 354, in run_from_argv
worker_1                |     self.execute(*args, **cmd_options)
worker_1                |   File "/usr/local/lib/python3.8/site-packages/django/core/management/base.py", line 398, in execute
worker_1                |     output = self.handle(*args, **options)
worker_1                |   File "/home/posthog/code/ee/management/commands/migrate_clickhouse.py", line 42, in handle
worker_1                |     self.migrate(CLICKHOUSE_HTTP_URL, options)
worker_1                |   File "/home/posthog/code/ee/management/commands/migrate_clickhouse.py", line 55, in migrate
worker_1                |     migrations = list(self.get_migrations(database, options["upto"]))
worker_1                |   File "/home/posthog/code/ee/management/commands/migrate_clickhouse.py", line 85, in get_migrations
worker_1                |     applied_migrations = self.get_applied_migrations(database)
worker_1                |   File "/home/posthog/code/ee/management/commands/migrate_clickhouse.py", line 95, in get_applied_migrations
worker_1                |     return database._get_applied_migrations(MIGRATIONS_PACKAGE_NAME, replicated=CLICKHOUSE_REPLICATION)
worker_1                |   File "/usr/local/lib/python3.8/site-packages/infi/clickhouse_orm/database.py", line 379, in _get_applied_migrations
worker_1                |     return set(obj.module_name for obj in self.select(query))
worker_1                |   File "/usr/local/lib/python3.8/site-packages/infi/clickhouse_orm/database.py", line 379, in <genexpr>
worker_1                |     return set(obj.module_name for obj in self.select(query))
worker_1                |   File "/usr/local/lib/python3.8/site-packages/infi/clickhouse_orm/database.py", line 276, in select
worker_1                |     r = self._send(query, settings, True)
worker_1                |   File "/usr/local/lib/python3.8/site-packages/infi/clickhouse_orm/database.py", line 389, in _send
worker_1                |     raise ServerError(r.text)
worker_1                | infi.clickhouse_orm.database.ServerError: Code: 60. DB::Exception: Table posthog.infi_clickhouse_orm_migrations doesn't exist. (UNKNOWN_TABLE) (version 22.3.8.39 (official build))
worker_1                |  (0)
ubuntu_worker_1 exited with code 1

So then I attempted an upgrade again but this time with docker running:

Creating ubuntu_asyncmigrationscheck_run ... done

🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻
You indicated your instance is behind a proxy (IS_BEHIND_PROXY env var),
 but you haven't configured any trusted proxies. See
 https://posthog.com/docs/configuring-posthog/running-behind-proxy for details.
🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺

🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻🔻
Skipping async migrations setup. This is unsafe in production!
🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺🔺

AXES: BEGIN LOG
AXES: BEGIN LOG
AXES: Using django-axes version 5.9.0
AXES: Using django-axes version 5.9.0
AXES: blocking by IP only.
AXES: blocking by IP only.
Traceback (most recent call last):
  File "manage.py", line 21, in <module>
    main()
  File "manage.py", line 17, in main
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python3.8/site-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python3.8/site-packages/django/core/management/__init__.py", line 413, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/python3.8/site-packages/django/core/management/base.py", line 354, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/local/lib/python3.8/site-packages/django/core/management/base.py", line 398, in execute
    output = self.handle(*args, **options)
  File "/home/posthog/code/ee/management/commands/run_async_migrations.py", line 59, in handle
    necessary_migrations = get_necessary_migrations()
  File "/home/posthog/code/ee/management/commands/run_async_migrations.py", line 32, in get_necessary_migrations
    is_migration_required = ALL_ASYNC_MIGRATIONS[migration_name].is_required()
  File "/home/posthog/code/posthog/async_migrations/migrations/0002_events_sample_by.py", line 169, in is_required
    table_engine = sync_execute(
IndexError: list index out of range
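The traceback only shows the first line of the `sync_execute(` call, so the following is a guess: the IndexError would be consistent with the query returning no rows (for example because the table it inspects does not exist) and the result then being indexed. A defensive helper along these lines would turn that into an explicit "no result" case. `first_cell` is a hypothetical name, not a PostHog function.

```python
from typing import Any, Optional, Sequence


def first_cell(rows: Sequence[Sequence[Any]], default: Optional[Any] = None) -> Any:
    """Return rows[0][0], or `default` when the result set is empty."""
    if not rows or not rows[0]:
        return default
    return rows[0][0]
```

With such a helper, a check like `is_required()` could treat a `None` engine as "table missing" and report that clearly instead of crashing.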

danielthedifficult commented 2 years ago

@fuziontech this was likely addressed by the same PR that fixed #10792 :)

allan-simon commented 2 years ago

I'm also having this issue, though my Postgres and ClickHouse are on another server (installed directly, without Docker) and are not managed by docker-compose.

kapil371 commented 1 year ago

I am also facing the same issue.