PostHog / posthog

🦔 PostHog provides open-source product analytics, session recording, feature flagging and A/B testing that you can self-host.
https://posthog.com

Bug Report: Cohort calculation OOMs #23392

Closed skoob13 closed 5 days ago

skoob13 commented 6 days ago

Bug Description

Duplicate of #20719

Bug description

Cohort calculation fails with an out-of-memory (OOM) error in ClickHouse. Filters:

How to reproduce

No steps yet.

Additional context

Grafana log:

{"task_id": "7ec1092f-e870-40dc-a02a-4968800404f9", "request_id": "3e774739-6f31-4511-adc8-800de113818a", "id": 69287, "current_version": null, "new_version": 11, "event": "cohort_calculation_failed", "task_name": "posthog.tasks.calculate_cohort.calculate_cohort_ch", "timestamp": "2024-07-01T10:59:50.772922Z", "logger": "posthog.models.cohort.cohort", "level": "warning", "pid": 274, "tid": 281472934710592, "exception": "Traceback (most recent call last):\n  File \"/code/posthog/clickhouse/client/execute.py\", line 116, in sync_execute\n    result = client.execute(\n             ^^^^^^^^^^^^^^^\n  File \"/python-runtime/clickhouse_driver/client.py\", line 382, in execute\n    rv = self.process_ordinary_query(\n         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/python-runtime/clickhouse_driver/client.py\", line 580, in process_ordinary_query\n    return self.receive_result(with_column_types=with_column_types,\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/python-runtime/sentry_sdk/integrations/clickhouse_driver.py\", line 110, in _inner_end\n    res = f(*args, **kwargs)\n          ^^^^^^^^^^^^^^^^^^\n  File \"/python-runtime/clickhouse_driver/client.py\", line 213, in receive_result\n    return result.get_result()\n           ^^^^^^^^^^^^^^^^^^^\n  File \"/python-runtime/clickhouse_driver/result.py\", line 50, in get_result\n    for packet in self.packet_generator:\n  File \"/python-runtime/clickhouse_driver/client.py\", line 229, in packet_generator\n    packet = self.receive_packet()\n             ^^^^^^^^^^^^^^^^^^^^^\n  File \"/python-runtime/clickhouse_driver/client.py\", line 246, in receive_packet\n    raise packet.exception\nclickhouse_driver.errors.ServerException: Code: 241.\nDB::Exception: Memory limit (for query) exceeded: would use 42.07 GiB (attempt to allocate chunk of 0 bytes), maximum: 42.00 GiB.: While executing AggregatingTransform. Stack trace:\n\n0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000bae0be8 in /usr/bin/clickhouse\n1. DB::Exception::Exception<char const*, char const*, String, long&, String, char const*, std::basic_string_view<char, std::char_traits<char>>>(int, FormatStringHelperImpl<std::type_identity<char const*>::type, std::type_identity<char const*>::type, std::type_identity<String>::type, std::type_identity<long&>::type, std::type_identity<String>::type, std::type_identity<char const*>::type, std::type_identity<std::basic_string_view<char, std::char_traits<char>>>::type>, char const*&&, char const*&&, String&&, long&, String&&, char const*&&, std::basic_string_view<char, std::char_traits<char>>&&) @ 0x000000000baf3188 in /usr/bin/clickhouse\n2. MemoryTracker::allocImpl(long, bool, MemoryTracker*, double) @ 0x000000000baf2db8 in /usr/bin/clickhouse\n3. MemoryTracker::allocImpl(long, bool, MemoryTracker*, double) @ 0x000000000baf28b8 in /usr/bin/clickhouse\n4. DB::Aggregator::checkLimits(unsigned long, bool&) const @ 0x000000000f7b2b50 in /usr/bin/clickhouse\n5. DB::Aggregator::executeOnBlock(std::vector<COW<DB::IColumn>::immutable_ptr<DB::IColumn>, std::allocator<COW<DB::IColumn>::immutable_ptr<DB::IColumn>>>, unsigned long, unsigned long, DB::AggregatedDataVariants&, std::vector<DB::IColumn const*, std::allocator<DB::IColumn const*>>&, std::vector<std::vector<DB::IColumn const*, std::allocator<DB::IColumn const*>>, std::allocator<std::vector<DB::IColumn const*, std::allocator<DB::IColumn const*>>>>&, bool&) const @ 0x000000000f825890 in /usr/bin/clickhouse\n6. DB::AggregatingTransform::work() @ 0x000000001117e5b8 in /usr/bin/clickhouse\n7. DB::ExecutionThreadContext::executeTask() @ 0x0000000010f5ea8c in /usr/bin/clickhouse\n8. DB::PipelineExecutor::executeStepImpl(unsigned long, std::atomic<bool>*) @ 0x0000000010f56c54 in /usr/bin/clickhouse\n9. void std::__function::__policy_invoker<void ()>::__call_impl<std::__function::__default_alloc_func<DB::PipelineExecutor::spawnThreads()::$_0, void ()>>(std::__function::__policy_storage const*) @ 0x0000000010f57d9c in /usr/bin/clickhouse\n10. ThreadPoolImpl<ThreadFromGlobalPoolImpl<false>>::worker(std::__list_iterator<ThreadFromGlobalPoolImpl<false>, void*>) @ 0x000000000bbabf30 in /usr/bin/clickhouse\n11. void std::__function::__policy_invoker<void ()>::__call_impl<std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<false>::ThreadFromGlobalPoolImpl<void ThreadPoolImpl<ThreadFromGlobalPoolImpl<false>>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool)::'lambda0'()>(void&&)::'lambda'(), void ()>>(std::__function::__policy_storage const*) @ 0x000000000bbaf020 in /usr/bin/clickhouse\n12. void* std::__thread_proxy[abi:v15000]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void ThreadPoolImpl<std::thread>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool)::'lambda0'()>>(void*) @ 0x000000000bbae028 in /usr/bin/clickhouse\n13. ? @ 0x000000000008597c in /usr/lib/aarch64-linux-gnu/libc.so.6\n14. ? @ 0x00000000000eba4c in /usr/lib/aarch64-linux-gnu/libc.so.6\n\n\nThe above exception was the direct cause of the following exception:\n\nTraceback (most recent call last):\n  File \"/code/posthog/models/cohort/cohort.py\", line 210, in calculate_people_ch\n    count = recalculate_cohortpeople(self, pending_version, initiating_user_id=initiating_user_id)\n            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/code/posthog/models/cohort/util.py\", line 318, in recalculate_cohortpeople\n    sync_execute(\n  File \"/code/posthog/utils.py\", line 1338, in inner\n    return inner._impl(*args, **kwargs)  # type: ignore\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/code/posthog/clickhouse/client/execute.py\", line 130, in sync_execute\n    raise err from e\nposthog.errors.CHQueryErrorMemoryLimitExceeded: Query exceeds memory limits. Try reducing its scope by changing the time range."}
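
For triage, the key facts can be pulled out of a log line like the one above with a short script. This is a minimal sketch, not part of PostHog: `summarize_memory_error` and `MEMORY_RE` are hypothetical names, and `LOG_LINE` is an abbreviated stand-in that keeps only the fields and the part of the `exception` text the script reads.

```python
import json
import re
from typing import Optional

# Abbreviated stand-in for the Grafana log line (assumption: only these fields are needed).
LOG_LINE = json.dumps({
    "id": 69287,
    "event": "cohort_calculation_failed",
    "exception": (
        "clickhouse_driver.errors.ServerException: Code: 241.\n"
        "DB::Exception: Memory limit (for query) exceeded: would use 42.07 GiB "
        "(attempt to allocate chunk of 0 bytes), maximum: 42.00 GiB.: "
        "While executing AggregatingTransform."
    ),
})

# Hypothetical pattern matching the wording of the ClickHouse message in this log.
MEMORY_RE = re.compile(
    r"Code: (?P<code>\d+)\..*?would use (?P<used>[\d.]+) GiB"
    r".*?maximum: (?P<limit>[\d.]+) GiB",
    re.DOTALL,
)


def summarize_memory_error(line: str) -> Optional[dict]:
    """Parse one JSON log line; return the memory figures, or None if absent."""
    record = json.loads(line)
    match = MEMORY_RE.search(record.get("exception", ""))
    if match is None:
        return None
    return {
        "id": record.get("id"),
        "event": record.get("event"),
        "clickhouse_code": int(match["code"]),
        "used_gib": float(match["used"]),
        "limit_gib": float(match["limit"]),
    }


summary = summarize_memory_error(LOG_LINE)
print(summary)
```

In this log, the query hit the 42.00 GiB per-query limit while executing `AggregatingTransform`, so the failure occurred in the aggregation step.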

Ticket: https://posthoghelp.zendesk.com/agent/tickets/15242

Debug info

- [ ] PostHog Cloud, Debug information: [please copy/paste from https://us.posthog.com/settings/project-details#variables or https://eu.posthog.com/settings/project-details#variables]
- [ ] PostHog Hobby self-hosted with `docker compose`, version/commit: [please provide]
- [ ] PostHog self-hosted with Kubernetes (deprecated, see [`Sunsetting Kubernetes support`](https://posthog.com/blog/sunsetting-helm-support-posthog)), version/commit: [please provide]
skoob13 commented 5 days ago

Fixed by #23419