PostHog / posthog

🦔 PostHog provides open-source product analytics, session recording, feature flagging and A/B testing that you can self-host.
https://posthog.com
Other
20.24k stars 1.2k forks source link

developing locally | migrations fail #12530

Open mtksugi opened 1 year ago

mtksugi commented 1 year ago

Bug description

I intend to develop Posthog locally.

I'm setting my PC according to this, occrured an error at Prepare databases's migrations.

Target is posthog/clickhouse/migrations/0019_group_analytics_materialized_columns.py

2022-10-30T11:35:27.338080Z [info     ] Applying migration 0019_group_analytics_materialized_columns... [migrations] pid=13072 tid=4576576960
2022-10-30T11:35:27.338217Z [info     ]     Executing python operation create_materialized_columns [migrations] pid=13072 tid=4576576960
Traceback (most recent call last):
  File "manage.py", line 21, in <module>
    main()
  File "manage.py", line 17, in main
    execute_from_command_line(sys.argv)
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
    utility.execute()
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/django/core/management/__init__.py", line 413, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/django/core/management/base.py", line 354, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/django/core/management/base.py", line 398, in execute
    output = self.handle(*args, **options)
  File "/Users/mtk/Product/PostHog/posthog/posthog/management/commands/migrate_clickhouse.py", line 42, in handle
    self.migrate(CLICKHOUSE_HTTP_URL, options)
  File "/Users/mtk/Product/PostHog/posthog/posthog/management/commands/migrate_clickhouse.py", line 80, in migrate
    database.migrate(MIGRATIONS_PACKAGE_NAME, options["upto"], replicated=CLICKHOUSE_REPLICATION)
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/infi/clickhouse_orm/database.py", line 352, in migrate
    operation.apply(self)
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/infi/clickhouse_orm/migrations.py", line 250, in apply
    self._func(database)
  File "/Users/mtk/Product/PostHog/posthog/posthog/clickhouse/migrations/0019_group_analytics_materialized_columns.py", line 8, in create_materialized_columns
    materialize("events", "$group_0", "$group_0")
  File "/Users/mtk/Product/PostHog/posthog/ee/clickhouse/materialized_columns/columns.py", line 62, in materialize
    if (property, table_column) in get_materialized_columns(table, use_cache=False):
  File "/Users/mtk/Product/PostHog/posthog/posthog/cache_utils.py", line 16, in memoized_fn
    return fn(*args)
  File "/Users/mtk/Product/PostHog/posthog/ee/clickhouse/materialized_columns/columns.py", line 40, in get_materialized_columns
    rows = sync_execute(
  File "/Users/mtk/Product/PostHog/posthog/posthog/client.py", line 163, in sync_execute
    prepared_sql, prepared_args, tags = _prepare_query(client=client, query=query, args=args)
  File "/Users/mtk/Product/PostHog/posthog/posthog/client.py", line 461, in _prepare_query
    rendered_sql = client.substitute_params(query, args)
TypeError: substitute_params() missing 1 required positional argument: 'context'

As a test, delete [0019_group_analytics_materialized_columns.py] and retry, then fail at next [0021_session_recording_events_materialize_full_snapshot.py].

2022-10-30T11:34:51.420491Z [info     ] Applying migration 0021_session_recording_events_materialize_full_snapshot... [migrations] pid=13055 tid=4584170944
2022-10-30T11:34:51.420720Z [info     ]     Executing python operation create_has_full_snapshot_materialized_column [migrations] pid=13055 tid=4584170944
2022-10-30T11:34:51.425960Z [warning  ] Failed to connect to localhost:9000 [clickhouse_driver.connection] pid=13055 tid=4584170944
Traceback (most recent call last):
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 325, in connect
    return self._init_connection(host, port)
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 300, in _init_connection
    self.send_hello()
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 401, in send_hello
    self.fout.flush()
  File "clickhouse_driver/bufferedwriter.pyx", line 46, in clickhouse_driver.bufferedwriter.BufferedWriter.flush
  File "clickhouse_driver/bufferedwriter.pyx", line 127, in clickhouse_driver.bufferedwriter.BufferedSocketWriter.write_into_stream
OSError: [Errno 22] Invalid argument
Traceback (most recent call last):
  File "manage.py", line 21, in <module>
    main()
  File "manage.py", line 17, in main
    execute_from_command_line(sys.argv)
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
    utility.execute()
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/django/core/management/__init__.py", line 413, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/django/core/management/base.py", line 354, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/django/core/management/base.py", line 398, in execute
    output = self.handle(*args, **options)
  File "/Users/mtk/Product/PostHog/posthog/posthog/management/commands/migrate_clickhouse.py", line 42, in handle
    self.migrate(CLICKHOUSE_HTTP_URL, options)
  File "/Users/mtk/Product/PostHog/posthog/posthog/management/commands/migrate_clickhouse.py", line 80, in migrate
    database.migrate(MIGRATIONS_PACKAGE_NAME, options["upto"], replicated=CLICKHOUSE_REPLICATION)
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/infi/clickhouse_orm/database.py", line 352, in migrate
    operation.apply(self)
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/infi/clickhouse_orm/migrations.py", line 250, in apply
    self._func(database)
  File "/Users/mtk/Product/PostHog/posthog/posthog/clickhouse/migrations/0021_session_recording_events_materialize_full_snapshot.py", line 10, in create_has_full_snapshot_materialized_column
    sync_execute(
  File "/Users/mtk/Product/PostHog/posthog/posthog/client.py", line 183, in sync_execute
    raise err
  File "/Users/mtk/Product/PostHog/posthog/posthog/client.py", line 170, in sync_execute
    result = client.execute(
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/clickhouse_driver/client.py", line 292, in execute
    with self.disconnect_on_error(query, settings):
  File "/Users/mtk/opt/anaconda3/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/clickhouse_driver/client.py", line 235, in disconnect_on_error
    self.connection.force_connect()
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 219, in force_connect
    self.connect()
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 346, in connect
    raise err
clickhouse_driver.errors.NetworkError: Code: 210. Invalid argument (localhost:9000)

Please teach what should i do?

How to reproduce

same no matter how many times I try

Environment

macOS 10.14.6 Mojave

timgl commented 1 year ago

Looks like perhaps you have an old version of one of the requirements? Perhaps try doing pip install -r requirements.txt again?

mtksugi commented 1 year ago

Thanks reply timgl.

The latest requirements.txt is applied.

I'm curious, when I'm applying 0.2.1 of clickhouse-driver as per requirements.txt, occur below error

2022-11-01T01:04:18.701765Z [info     ] Applying migration 0019_group_analytics_materialized_columns... [migrations] pid=15720 tid=4625089984
2022-11-01T01:04:18.702002Z [info     ]     Executing python operation create_materialized_columns [migrations] pid=15720 tid=4625089984
2022-11-01T01:04:18.714628Z [warning  ] Failed to connect to localhost:9000 [clickhouse_driver.connection] pid=15720 tid=4625089984
Traceback (most recent call last):
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 318, in connect
    return self._init_connection(host, port)
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 293, in _init_connection
    self.send_hello()
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 391, in send_hello
    self.fout.flush()
  File "clickhouse_driver/bufferedwriter.pyx", line 46, in clickhouse_driver.bufferedwriter.BufferedWriter.flush
  File "clickhouse_driver/bufferedwriter.pyx", line 127, in clickhouse_driver.bufferedwriter.BufferedSocketWriter.write_into_stream
OSError: [Errno 22] Invalid argument
Traceback (most recent call last):
  File "manage.py", line 21, in <module>
    main()
  File "manage.py", line 17, in main
    execute_from_command_line(sys.argv)
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
    utility.execute()
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/django/core/management/__init__.py", line 413, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/django/core/management/base.py", line 354, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/django/core/management/base.py", line 398, in execute
    output = self.handle(*args, **options)
  File "/Users/mtk/Product/PostHog/posthog/posthog/management/commands/migrate_clickhouse.py", line 42, in handle
    self.migrate(CLICKHOUSE_HTTP_URL, options)
  File "/Users/mtk/Product/PostHog/posthog/posthog/management/commands/migrate_clickhouse.py", line 80, in migrate
    database.migrate(MIGRATIONS_PACKAGE_NAME, options["upto"], replicated=CLICKHOUSE_REPLICATION)
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/infi/clickhouse_orm/database.py", line 352, in migrate
    operation.apply(self)
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/infi/clickhouse_orm/migrations.py", line 250, in apply
    self._func(database)
  File "/Users/mtk/Product/PostHog/posthog/posthog/clickhouse/migrations/0019_group_analytics_materialized_columns.py", line 8, in create_materialized_columns
    materialize("events", "$group_0", "$group_0")
  File "/Users/mtk/Product/PostHog/posthog/ee/clickhouse/materialized_columns/columns.py", line 62, in materialize
    if (property, table_column) in get_materialized_columns(table, use_cache=False):
  File "/Users/mtk/Product/PostHog/posthog/posthog/cache_utils.py", line 16, in memoized_fn
    return fn(*args)
  File "/Users/mtk/Product/PostHog/posthog/ee/clickhouse/materialized_columns/columns.py", line 40, in get_materialized_columns
    rows = sync_execute(
  File "/Users/mtk/Product/PostHog/posthog/posthog/client.py", line 183, in sync_execute
    raise err
  File "/Users/mtk/Product/PostHog/posthog/posthog/client.py", line 170, in sync_execute
    result = client.execute(
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/clickhouse_driver/client.py", line 252, in execute
    self.connection.force_connect()
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 212, in force_connect
    self.connect()
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 339, in connect
    raise err
clickhouse_driver.errors.NetworkError: Code: 210. Invalid argument (localhost:9000)

As a test, when I'm applying latest version(0.2.4) of clickhouse-driver, occur below error

2022-10-30T11:35:27.338080Z [info     ] Applying migration 0019_group_analytics_materialized_columns... [migrations] pid=13072 tid=4576576960
2022-10-30T11:35:27.338217Z [info     ]     Executing python operation create_materialized_columns [migrations] pid=13072 tid=4576576960
Traceback (most recent call last):
  File "manage.py", line 21, in <module>
    main()
  File "manage.py", line 17, in main
    execute_from_command_line(sys.argv)
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
    utility.execute()
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/django/core/management/__init__.py", line 413, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/django/core/management/base.py", line 354, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/django/core/management/base.py", line 398, in execute
    output = self.handle(*args, **options)
  File "/Users/mtk/Product/PostHog/posthog/posthog/management/commands/migrate_clickhouse.py", line 42, in handle
    self.migrate(CLICKHOUSE_HTTP_URL, options)
  File "/Users/mtk/Product/PostHog/posthog/posthog/management/commands/migrate_clickhouse.py", line 80, in migrate
    database.migrate(MIGRATIONS_PACKAGE_NAME, options["upto"], replicated=CLICKHOUSE_REPLICATION)
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/infi/clickhouse_orm/database.py", line 352, in migrate
    operation.apply(self)
  File "/Users/mtk/Product/PostHog/posthog/env/lib/python3.8/site-packages/infi/clickhouse_orm/migrations.py", line 250, in apply
    self._func(database)
  File "/Users/mtk/Product/PostHog/posthog/posthog/clickhouse/migrations/0019_group_analytics_materialized_columns.py", line 8, in create_materialized_columns
    materialize("events", "$group_0", "$group_0")
  File "/Users/mtk/Product/PostHog/posthog/ee/clickhouse/materialized_columns/columns.py", line 62, in materialize
    if (property, table_column) in get_materialized_columns(table, use_cache=False):
  File "/Users/mtk/Product/PostHog/posthog/posthog/cache_utils.py", line 16, in memoized_fn
    return fn(*args)
  File "/Users/mtk/Product/PostHog/posthog/ee/clickhouse/materialized_columns/columns.py", line 40, in get_materialized_columns
    rows = sync_execute(
  File "/Users/mtk/Product/PostHog/posthog/posthog/client.py", line 163, in sync_execute
    prepared_sql, prepared_args, tags = _prepare_query(client=client, query=query, args=args)
  File "/Users/mtk/Product/PostHog/posthog/posthog/client.py", line 461, in _prepare_query
    rendered_sql = client.substitute_params(query, args)
TypeError: substitute_params() missing 1 required positional argument: 'context'

best regards.

fuziontech commented 1 year ago

Hi @mtksugi! I'd stick with 0.2.1 and try to resolve the Failed to connect to localhost:9000 [clickhouse_driver.connection] error. Are there any logs coming from ClickHouse container that would suggest that that container might be unhealthy?

mtksugi commented 1 year ago

Thanks reply fuziontech.

On /var/log/clickhouse-server/clickhouse-server.err.log, the following log is repeatedly.

2022.11.01 05:10:29.995569 [ 474 ] {} <Warning> StorageKafka (kafka_groups): Can't get assignment. Will keep trying.
2022.11.01 05:10:30.084034 [ 462 ] {} <Warning> StorageKafka (kafka_events_dead_letter_queue): Can't get assignment. Will keep trying.
2022.11.01 05:10:30.113241 [ 472 ] {} <Warning> StorageKafka (kafka_person_distinct_id): Can't get assignment. Will keep trying.
2022.11.01 05:10:30.143473 [ 471 ] {} <Warning> StorageKafka (kafka_session_recording_events): Can't get assignment. Will keep trying.
2022.11.01 05:10:30.159827 [ 463 ] {} <Warning> StorageKafka (kafka_person): Can't get assignment. Will keep trying.
2022.11.01 05:10:30.261841 [ 476 ] {} <Warning> StorageKafka (kafka_plugin_log_entries): Can't get assignment. Will keep trying.

Is this relevant?

fuziontech commented 1 year ago

ahh these are classic and totally normal. Just a bunch of log spam. I'd keep looking 👀