getsentry / snuba

Widgets in the Dashboards are throwing 504 #5962

Closed sree-warrier closed 5 days ago

sree-warrier commented 5 months ago

Self-Hosted Version

23.11.2

CPU Architecture

x86_64

Docker Version

nil

Docker Compose Version

nil

Steps to Reproduce

We have created a few telemetry dashboards with different types of widgets. When viewing data over a longer time range (20 or 30 days), most of the widgets throw a 504. The exact error is:

GET /organizations/{orgSlug}/events/ 504

Attaching the screenshot too.

What could be the reason for this? I can't find proper documentation on which database the widget GET call reads from. We tried changing the ClickHouse timeout configs, but it didn't work. We have also increased the resources for Postgres and ClickHouse, but the issue persists. Let us know how we can track this down.

Screenshot 2024-05-15 at 11 17 41 AM

Expected Result

All widgets in dashboards should load without any issue for a time frame of at least 30 days.

Actual Result

nil

Event ID

nil

hubertdeng123 commented 5 months ago

Can you check your docker compose logs of your web/nginx container? It might be helpful to see if there are any errors there.

sree-warrier commented 5 months ago

I have checked the logs and am not seeing anything related to this. We also enabled debug logging but still didn't find any log entries for it. Are there any other container logs we need to check? We need to know how exactly the dashboards are loaded and which service calls the DB, so we can check at that container level. We are also confused about which DB the dashboard data is saved in and which DB the data is pulled from. Do any DB configs affect the functionality of dashboards?

sree-warrier commented 5 months ago

We are seeing this error exactly after 15 seconds. Is there a hard-coded connection timeout setting somewhere? We were not able to find any config for it. In the meantime, we have seen some logs on the web and snuba-api side:


Sentry-Web

08:55:00 [INFO] sentry.access.api: api.access (method='POST' view='sentry.api.endpoints.relay.project_configs.RelayProjectConfigsEndpoint' response=200 user_id='None' is_app='None' token_type='None' is_frontend_request='False' organization_id='None' auth_id='None' path='/api/0/relays/projectconfigs/' caller_ip='10.207.16.188' user_agent='None' rate_limited='False' rate_limit_category='None' request_duration_seconds=0.017824172973632812 rate_limit_type='DNE' concurrent_limit='None' concurrent_requests='None' reset_time='None' group='None' limit='None' remaining='None')
08:55:03 [INFO] sentry.access.api: api.access (method='GET' view='sentry.api.endpoints.organization_tags.OrganizationTagsEndpoint' response=500 user_id='4' is_app='False' token_type='None' is_frontend_request='True' organization_id='1' auth_id='None' path='/api/0/organizations/xxxxxxxxx-sentry/tags/' caller_ip='x.x.x.x' user_agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:123.0) Gecko/20100101 Firefox/123.0' rate_limited='False' rate_limit_category='None' request_duration_seconds=9.839881420135498 rate_limit_type='DNE' concurrent_limit='None' concurrent_requests='None' reset_time='None' group='None' limit='None' remaining='None')
08:55:03 [INFO] sentry.access.api: api.access (method='GET' view='sentry.api.endpoints.organization_events_stats.OrganizationEventsStatsEndpoint' response=500 user_id='4' is_app='False' token_type='None' is_frontend_request='True' organization_id='1' auth_id='None' path='/api/0/organizations/xxxxxxxxx-sentry/events-stats/' caller_ip='x.x.x.x' user_agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:123.0) Gecko/20100101 Firefox/123.0' rate_limited='False' rate_limit_category='None' request_duration_seconds=26.53361201286316 rate_limit_type='DNE' concurrent_limit='None' concurrent_requests='None' reset_time='None' group='None' limit='None' remaining='None')
08:55:03 [ERROR] django.request: Internal Server Error: /api/0/organizations/xxxxxxxxx-sentry/events-stats/ (status_code=500 request=<WSGIRequest: GET '/api/0/organizations/xxxxxxxxx-sentry/events-stats/?environment=release&field=transaction&field=count%28%29&interval=1h&orderby=-count%28%29&partial=1&project=4&query=message%3AFETCH_HIERARCHY_FAILED%20AND%20message%3AxxxxxxxxxException%20AND%20device.online%3ATrue%20release%3Acom.xxxxxxxxx.lockscreenM%4010.2.10-motorola%2B20240420%20&referrer=api.dashboards.widget.line-chart&statsPeriod=30d&topEvents=5&yAxis=count%28%29'>)
08:55:03 [ERROR] django.request: Internal Server Error: /api/0/organizations/xxxxxxxxx-sentry/tags/ (status_code=500 request=<WSGIRequest: GET '/api/0/organizations/xxxxxxxxx-sentry/tags/?project=4&statsPeriod=30d&use_cache=1'>)
08:55:03 [INFO] sentry.access.api: api.access (method='GET' view='sentry.api.endpoints.organization_events_meta.OrganizationEventsMetaEndpoint' response=500 user_id='4' is_app='False' token_type='None' is_frontend_request='True' organization_id='1' auth_id='None' path='/api/0/organizations/xxxxxxxxx-sentry/events-meta/' caller_ip='x.x.x.x' user_agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:123.0) Gecko/20100101 Firefox/123.0' rate_limited='False' rate_limit_category='None' request_duration_seconds=10.258557558059692 rate_limit_type='DNE' concurrent_limit='None' concurrent_requests='None' reset_time='None' group='None' limit='None' remaining='None')
08:55:03 [ERROR] django.request: Internal Server Error: /api/0/organizations/xxxxxxxxx-sentry/events-meta/ (status_code=500 request=<WSGIRequest: GET '/api/0/organizations/xxxxxxxxx-sentry/events-meta/?environment=release&project=4&query=message%3AREGISTER_DEVICE_FINGERPRINT_FAILED%20AND%20message%3AxxxxxxxxxException%20AND%20device.online%3ATrue&statsPeriod=30d'>)
08:55:03 [INFO] sentry.access.api: api.access (method='GET' view='sentry.api.endpoints.organization_measurements_meta.OrganizationMeasurementsMeta' response=500 user_id='4' is_app='False' token_type='None' is_frontend_request='True' organization_id='1' auth_id='None' path='/api/0/organizations/xxxxxxxxx-sentry/measurements-meta/' caller_ip='x.x.x.x' user_agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:123.0) Gecko/20100101 Firefox/123.0' rate_limited='False' rate_limit_category='None' request_duration_seconds=13.971519231796265 rate_limit_type='DNE' concurrent_limit='None' concurrent_requests='None' reset_time='None' group='None' limit='None' remaining='None')
08:55:03 [ERROR] django.request: Internal Server Error: /api/0/organizations/xxxxxxxxx-sentry/measurements-meta/ (status_code=500 request=<WSGIRequest: GET '/api/0/organizations/xxxxxxxxx-sentry/measurements-meta/?project=4&statsPeriod=30d&utc=true'>)
08:55:03 [INFO] sentry.access.api: api.access (method='GET' view='sentry.api.endpoints.organization_events_stats.OrganizationEventsStatsEndpoint' response=500 user_id='4' is_app='False' token_type='None' is_frontend_request='True' organization_id='1' auth_id='None' path='/api/0/organizations/xxxxxxxxx-sentry/events-stats/' caller_ip='x.x.x.x' user_agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:123.0) Gecko/20100101 Firefox/123.0' rate_limited='False' rate_limit_category='None' request_duration_seconds=13.900061130523682 rate_limit_type='DNE' concurrent_limit='None' concurrent_requests='None' reset_time='None' group='None' limit='None' remaining='None')
08:55:03 [INFO] sentry.access.api: api.access (method='GET' view='sentry.api.endpoints.organization_tags.OrganizationTagsEndpoint' response=500 user_id='4' is_app='False' token_type='None' is_frontend_request='True' organization_id='1' auth_id='None' path='/api/0/organizations/xxxxxxxxx-sentry/tags/' caller_ip='x.x.x.x' user_agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:123.0) Gecko/20100101 Firefox/123.0' rate_limited='False' rate_limit_category='None' request_duration_seconds=13.982542037963867 rate_limit_type='DNE' concurrent_limit='None' concurrent_requests='None' reset_time='None' group='None' limit='None' remaining='None')
08:55:03 [ERROR] django.request: Internal Server Error: /api/0/organizations/xxxxxxxxx-sentry/tags/ (status_code=500 request=<WSGIRequest: GET '/api/0/organizations/xxxxxxxxx-sentry/tags/?project=4&statsPeriod=30d&use_cache=1&utc=true'>)
08:55:03 [ERROR] django.request: Internal Server Error: /api/0/organizations/xxxxxxxxx-sentry/events-stats/ (status_code=500 request=<WSGIRequest: GET '/api/0/organizations/xxxxxxxxx-sentry/events-stats/?environment=release&interval=1h&partial=1&project=4&query=message%3AREGISTER_DEVICE_FINGERPRINT_FAILED%20AND%20message%3AxxxxxxxxxException%20AND%20device.online%3ATrue&referrer=api.discover.default-chart&statsPeriod=30d&yAxis=count%28%29'>)
08:55:03 [INFO] sentry.access.api: api.access (method='GET' view='sentry.api.endpoints.organization_events.OrganizationEventsEndpoint' response=500 user_id='4' is_app='False' token_type='None' is_frontend_request='True' organization_id='1' auth_id='None' path='/api/0/organizations/xxxxxxxxx-sentry/events/' caller_ip='x.x.x.x' user_agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:123.0) Gecko/20100101 Firefox/123.0' rate_limited='False' rate_limit_category='None' request_duration_seconds=13.369595050811768 rate_limit_type='DNE' concurrent_limit='None' concurrent_requests='None' reset_time='None' group='None' limit='None' remaining='None')
08:55:03 [ERROR] django.request: Internal Server Error: /api/0/organizations/xxxxxxxxx-sentry/events/ (status_code=500 request=<WSGIRequest: GET '/api/0/organizations/xxxxxxxxx-sentry/events/?environment=release&field=count%28%29&per_page=50&project=4&query=message%3AREGISTER_DEVICE_FINGERPRINT_FAILED%20AND%20message%3AxxxxxxxxxException%20AND%20device.online%3ATrue&referrer=api.discover.query-table&statsPeriod=30d'>)
08:55:03 [INFO] sentry.access.api: api.access (method='GET' view='sentry.api.endpoints.organization_measurements_meta.OrganizationMeasurementsMeta' response=500 user_id='4' is_app='False' token_type='None' is_frontend_request='True' organization_id='1' auth_id='None' path='/api/0/organizations/xxxxxxxxx-sentry/measurements-meta/' caller_ip='x.x.x.x' user_agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:123.0) Gecko/20100101 Firefox/123.0' rate_limited='False' rate_limit_category='None' request_duration_seconds=7.447220087051392 rate_limit_type='DNE' concurrent_limit='None' concurrent_requests='None' reset_time='None' group='None' limit='None' remaining='None')
08:55:03 [ERROR] django.request: Internal Server Error: /api/0/organizations/xxxxxxxxx-sentry/measurements-meta/ (status_code=500 request=<WSGIRequest: GET '/api/0/organizations/xxxxxxxxx-sentry/measurements-meta/?project=4&statsPeriod=30d'>)
08:55:05 [INFO] sentry.access.api: api.access (method='GET' view='sentry.api.endpoints.artifact_lookup.ProjectArtifactLookupEndpoint' response=200 user_id='None' is_app='None' token_type='system' is_frontend_request='False' organization_id='1' auth_id='None' path='/api/0/projects/xxxxxxxxx-sentry/internal/artifact-lookup/' caller_ip='10.207.16.139' user_agent='symbolicator/23.11.2' rate_limited='False' rate_limit_category='None' request_duration_seconds=0.04927492141723633 rate_limit_type='DNE' concurrent_limit='None' concurrent_requests='None' reset_time='None' group='None' limit='None' remaining='None')
08:55:05 [INFO] sentry.access.api: api.access (method='GET' view='sentry.api.endpoints.artifact_lookup.ProjectArtifactLookupEndpoint' response=200 user_id='None' is_app='None' token_type='system' is_frontend_request='False' organization_id='1' auth_id='None' path='/api/0/projects/xxxxxxxxx-sentry/internal/artifact-lookup/' caller_ip='10.207.16.139' user_agent='symbolicator/23.11.2' rate_limited='False' rate_limit_category='None' request_duration_seconds=0.05196881294250488 rate_limit_type='DNE' concurrent_limit='None' concurrent_requests='None' reset_time='None' group='None' limit='None' remaining='None')
08:55:09 [INFO] sentry.access.api: api.access (method='POST' view='sentry.api.endpoints.relay.project_configs.RelayProjectConfigsEndpoint' response=200 user_id='None' is_app='None' token_type='None' is_frontend_request='False' organization_id='None' auth_id='None' path='/api/0/relays/projectconfigs/' caller_ip='10.207.16.188' user_agent='None' rate_limited='False' rate_limit_category='None' request_duration_seconds=0.017090320587158203 rate_limit_type='DNE' concurrent_limit='None' concurrent_requests='None' reset_time='None' group='None' limit='None' remaining='None')
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/sentry/../urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
  File "/usr/local/lib/python3.8/site-packages/sentry/../urllib3/util/connection.py", line 95, in create_connection
    raise err
  File "/usr/local/lib/python3.8/site-packages/sentry/../urllib3/util/connection.py", line 85, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/sentry/../urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/usr/local/lib/python3.8/site-packages/sentry/../urllib3/connectionpool.py", line 398, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.8/site-packages/sentry/../urllib3/connection.py", line 239, in request
    super(HTTPConnection, self).request(method, url, body=body, headers=headers)
  File "/usr/local/lib/python3.8/http/client.py", line 1256, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.8/http/client.py", line 1302, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.8/http/client.py", line 1251, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.8/http/client.py", line 1011, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.8/http/client.py", line 951, in send
    self.connect()
  File "/usr/local/lib/python3.8/site-packages/sentry/../urllib3/connection.py", line 205, in connect
    conn = self._new_conn()
  File "/usr/local/lib/python3.8/site-packages/sentry/../urllib3/connection.py", line 186, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x79b85d837910>: Failed to establish a new connection: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/sentry/utils/snuba.py", line 1008, in _legacy_snql_query
    result = _raw_snql_query(request, Hub(thread_hub), headers)
  File "/usr/local/lib/python3.8/site-packages/sentry/utils/snuba.py", line 1035, in _raw_snql_query
    return _snuba_pool.urlopen(
  File "/usr/local/lib/python3.8/site-packages/sentry/../urllib3/connectionpool.py", line 815, in urlopen
    return self.urlopen(
  File "/usr/local/lib/python3.8/site-packages/sentry/../urllib3/connectionpool.py", line 815, in urlopen
    return self.urlopen(
  File "/usr/local/lib/python3.8/site-packages/sentry/../urllib3/connectionpool.py", line 815, in urlopen
    return self.urlopen(
  [Previous line repeated 2 more times]
  File "/usr/local/lib/python3.8/site-packages/sentry/../urllib3/connectionpool.py", line 787, in urlopen
    retries = retries.increment(
  File "/usr/local/lib/python3.8/site-packages/sentry/utils/snuba.py", line 384, in increment
    return super().increment(
  File "/usr/local/lib/python3.8/site-packages/sentry/../urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='xxxxxxxxx-prod-sentry-sg1-snuba', port=1218): Max retries exceeded with url: /events/snql (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x79b85d837910>: Failed to establish a new connection: [Errno 111] Connection refused'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/sentry/api/base.py", line 231, in handle_exception
    response = super().handle_exception(exc)
  File "/usr/local/lib/python3.8/site-packages/rest_framework/views.py", line 469, in handle_exception
    self.raise_uncaught_exception(exc)
  File "/usr/local/lib/python3.8/site-packages/rest_framework/views.py", line 480, in raise_uncaught_exception
    raise exc
  File "/usr/local/lib/python3.8/site-packages/sentry/api/base.py", line 355, in dispatch
    response = handler(request, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/sentry/api/endpoints/project_events.py", line 71, in get
    return self.paginate(
  File "/usr/local/lib/python3.8/site-packages/sentry/api/base.py", line 448, in paginate
    cursor_result = paginator.get_result(
  File "/usr/local/lib/python3.8/site-packages/sentry/api/paginator.py", line 507, in get_result
    data = self.data_fn(offset=offset, limit=limit + 1)
  File "/usr/local/lib/python3.8/site-packages/sentry/utils/metrics.py", line 215, in inner
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/sentry/eventstore/snuba/backend.py", line 174, in get_events
    return self.__get_events(
  File "/usr/local/lib/python3.8/site-packages/sentry/eventstore/snuba/backend.py", line 277, in __get_events
    result = snuba.aliased_query(
  File "/usr/local/lib/python3.8/site-packages/sentry/utils/snuba.py", line 1234, in aliased_query
    return _aliased_query_impl(**kwargs)
  File "/usr/local/lib/python3.8/site-packages/sentry/utils/snuba.py", line 1238, in _aliased_query_impl
    return raw_query(**aliased_query_params(**kwargs))
  File "/usr/local/lib/python3.8/site-packages/sentry/utils/snuba.py", line 780, in raw_query
    return bulk_raw_query([snuba_params], referrer=referrer, use_cache=use_cache)[0]
  File "/usr/local/lib/python3.8/site-packages/sentry/utils/snuba.py", line 847, in bulk_raw_query
    return _apply_cache_and_build_results(params, referrer=referrer, use_cache=use_cache)
  File "/usr/local/lib/python3.8/site-packages/sentry/utils/snuba.py", line 881, in _apply_cache_and_build_results
    query_results = _bulk_snuba_query([item[1] for item in to_query], headers)
  File "/usr/local/lib/python3.8/site-packages/sentry/utils/snuba.py", line 932, in _bulk_snuba_query
    query_results = [query_fn((snuba_param_list[0], Hub(Hub.current), headers, parent_api))]
  File "/usr/local/lib/python3.8/site-packages/sentry/utils/snuba.py", line 1010, in _legacy_snql_query
    raise SnubaError(err)
sentry.utils.snuba.SnubaError: HTTPConnectionPool(host='xxxxxxxxx-prod-sentry-sg1-snuba', port=1218): Max retries exceeded with url: /events/snql (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x79b85d837910>: Failed to establish a new connection: [Errno 111] Connection refused'))
08:55:12 [INFO] sentry.access.api: api.access (method='GET' view='sentry.api.endpoints.project_events.ProjectEventsEndpoint' response=500 user_id='49' is_app='False' token_type='api_token' is_frontend_request='False' organization_id='1' auth_id='None' path='/api/0/projects/xxxxxxxxx-sentry/python-django/events/' caller_ip='x.x.x.x user_agent='python-requests/2.31.0' rate_limited='False' rate_limit_category='None' request_duration_seconds=0.13088011741638184 rate_limit_type='DNE' concurrent_limit='None' concurrent_requests='None' reset_time='None' group='None' limit='None' remaining='None')
08:55:12 [ERROR] django.request: Internal Server Error: /api/0/projects/xxxxxxxxx-sentry/python-django/events/ (status_code=500 request=<WSGIRequest: GET '/api/0/projects/xxxxxxxxx-sentry/python-django/events/?statsPeriod=5m'>)
08:55:19 [INFO] sentry.access.api: api.access (method='POST' view='sentry.api.endpoints.relay.project_configs.RelayProjectConfigsEndpoint' response=200 user_id='None' is_app='None' token_type='None' is_frontend_request='False' organization_id='None' auth_id='None' path='/api/0/relays/projectconfigs/' caller_ip='10.207.16.188' user_agent='None' rate_limited='False' rate_limit_category='None' request_duration_seconds=0.019531726837158203 rate_limit_type='DNE' concurrent_limit='None' concurrent_requests='None' reset_time='None' group='None' limit='None' remaining='None')
08:55:29 [INFO] sentry.access.api: api.access (method='POST' view='sentry.api.endpoints.relay.project_configs.RelayProjectConfigsEndpoint' response=200 user_id='None' is_app='None' token_type='None' is_frontend_request='False' organization_id='None' auth_id='None' path='/api/0/relays/projectconfigs/' caller_ip='10.207.16.188' user_agent='None' rate_limited='False' rate_limit_category='None' request_duration_seconds=0.018665313720703125 rate_limit_type='DNE' concurrent_limit='None' concurrent_requests='None' reset_time='None' group='None' limit='None' remaining='None')
===============
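The access log lines above already carry the status code and `request_duration_seconds` for each request, so they can be ranked to see which endpoints are slow and how their durations compare with the suspected 15-second cutoff. A minimal sketch, assuming the web container log has been saved as text; `slowest_requests` is a hypothetical helper (not part of Sentry), and the regexes assume the exact `key=value` layout shown in these lines.

```python
import re

# Patterns for the three fields of interest in a sentry.access.api line.
VIEW_RE = re.compile(r"view='([^']*)'")
STATUS_RE = re.compile(r"\bresponse=(\d+)")
DURATION_RE = re.compile(r"\brequest_duration_seconds=(\S+)")


def slowest_requests(lines, limit=10):
    """Return up to `limit` (view, status, seconds) tuples, slowest first."""
    rows = []
    for line in lines:
        # Skip tracebacks, django.request errors and anything else.
        if "sentry.access.api" not in line:
            continue
        view = VIEW_RE.search(line)
        status = STATUS_RE.search(line)
        duration = DURATION_RE.search(line)
        if not (view and status and duration):
            continue
        try:
            seconds = float(duration.group(1))
        except ValueError:
            continue
        # Keep only the class name of the endpoint for readability.
        short_view = view.group(1).rsplit(".", 1)[-1]
        rows.append((short_view, int(status.group(1)), seconds))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:limit]
```

In the lines posted above, the 500 responses for the organization-level endpoints took between roughly 7.4 and 26.5 seconds, which is worth comparing against the 15-second point at which the 504 appears in the browser.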

snuba-api: 

10.207.62.52 - - [17/May/2024:09:50:00 +0000] "POST /generic_metrics/snql HTTP/1.1" 200 609 "dynamic_sampling.counters.get_org_transaction_volumes" "python-urllib3/1.26.11"
2024-05-17 09:50:00,153 Allocation policy failed to get quota allowance, this is a bug, fix it
Traceback (most recent call last):
  File "/usr/src/snuba/./snuba/query/allocation_policies/__init__.py", line 707, in get_quota_allowance
    allowance = self._get_quota_allowance(tenant_ids, query_id)
  File "/usr/src/snuba/./snuba/query/allocation_policies/concurrent_rate_limit.py", line 230, in _get_quota_allowance
    rate_limit_params, overrides = self._get_rate_limit_params(tenant_ids)
  File "/usr/src/snuba/./snuba/query/allocation_policies/concurrent_rate_limit.py", line 197, in _get_rate_limit_params
    tenant_key, tenant_value = self._get_tenant_key_and_value(tenant_ids)
  File "/usr/src/snuba/./snuba/query/allocation_policies/concurrent_rate_limit.py", line 186, in _get_tenant_key_and_value
    raise AllocationPolicyViolation(
snuba.query.allocation_policies.AllocationPolicyViolation: Queries must have a project id or organization id, explanation: {}
2024-05-17 09:50:00,179 Allocation policy failed to update quota balance, this is a bug, fix it
Traceback (most recent call last):
  File "/usr/src/snuba/./snuba/query/allocation_policies/__init__.py", line 751, in update_quota_balance
    return self._update_quota_balance(tenant_ids, query_id, result_or_error)
  File "/usr/src/snuba/./snuba/query/allocation_policies/concurrent_rate_limit.py", line 244, in _update_quota_balance
    rate_limit_params, _ = self._get_rate_limit_params(tenant_ids)
  File "/usr/src/snuba/./snuba/query/allocation_policies/concurrent_rate_limit.py", line 197, in _get_rate_limit_params
    tenant_key, tenant_value = self._get_tenant_key_and_value(tenant_ids)
  File "/usr/src/snuba/./snuba/query/allocation_policies/concurrent_rate_limit.py", line 186, in _get_tenant_key_and_value
    raise AllocationPolicyViolation(
snuba.query.allocation_policies.AllocationPolicyViolation: Queries must have a project id or organization id, explanation: {}
10.207.16.180 - - [17/May/2024:09:50:00 +0000] "POST /generic_metrics/snql HTTP/1.1" 200 837 "dynamic_sampling.distribution.fetch_projects_with_count_per_root_total_volumes" "python-urllib3/1.26.11"
10.207.15.203 - - [17/May/2024:09:50:00 +0000] "POST /generic_metrics/snql HTTP/1.1" 200 3433 "dynamic_sampling.counters.fetch_projects_with_count_per_transaction_volumes" "python-urllib3/1.26.11"
10.207.15.203 - - [17/May/2024:09:50:00 +0000] "POST /generic_metrics/snql HTTP/1.1" 200 755 "dynamic_sampling.counters.fetch_projects_with_transaction_totals" "python-urllib3/1.26.11"
10.207.15.204 - - [17/May/2024:09:50:02 +0000] "POST /search_issues/snql HTTP/1.1" 200 7525 "search_sample" "python-urllib3/1.26.11"
10.207.15.204 - - [17/May/2024:09:50:02 +0000] "POST /discover/snql HTTP/1.1" 200 9788 "search_sample" "python-urllib3/1.26.11"
===============
azaslavsky commented 5 months ago

How's the memory and CPU usage on your instance when this occurs? It's possible that the snuba server is rejecting these connections due to memory issues.
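The traceback posted earlier ends in `[Errno 111] Connection refused` for the Snuba host on port 1218, so one quick check is whether a plain TCP connection from the web container to that address opens at all. A minimal sketch using only the standard library; `can_connect` is a hypothetical helper, and the host name to pass in is whatever your deployment uses for snuba-api.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Try to open a TCP connection; return (True, None) or (False, error text)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:
        # Covers refused connections, timeouts and DNS failures alike.
        return False, f"{type(exc).__name__}: {exc}"


# Example call (host name is a placeholder, port taken from the traceback):
# print(can_connect("your-snuba-api-host", 1218))
```

Running this in a loop while a 30-day dashboard is loading would show whether the refusals are constant or only appear under load.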

sree-warrier commented 5 months ago

No resource issues are seen; memory and CPU usage are fine. We nevertheless allotted more resources to these services and tried again, but we are still seeing failures. It seems to have something to do with timeout values, and we are unable to find the exact config for it.

azaslavsky commented 5 months ago

I'm not familiar with all of the snuba settings myself. I'll forward to that team to see if anyone there has more insight.

xurui-c commented 5 months ago

Could you provide a link to the dashboard? It seems like your queries are missing a project/organization id, which could be causing the connection error.
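Before attributing the failures to one cause, it may help to tally which exceptions actually appear in the web and snuba-api logs and how often. A minimal sketch, assuming the logs were saved as text; `count_exceptions` is a hypothetical helper, and it only recognizes traceback summary lines whose class name ends in `Error`, `Exception` or `Violation`.

```python
import re
from collections import Counter

# The last line of a traceback looks like "package.module.ClassName: message".
EXC_LINE_RE = re.compile(r"^([A-Za-z_][\w.]*(?:Error|Exception|Violation)): ")


def count_exceptions(log_lines):
    """Count traceback summary lines by exception class name."""
    counts = Counter()
    for line in log_lines:
        match = EXC_LINE_RE.match(line.strip())
        if match:
            # Drop the module path, keep only the class name.
            counts[match.group(1).rsplit(".", 1)[-1]] += 1
    return counts
```

Calling `count_exceptions(open(path))` on each saved log gives a per-class count that can be compared between the two services.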

mcannizz commented 1 month ago

@xurui-c should allocation policy violations be occurring in self-hosted? This looks like a bug to me.

sree-warrier commented 1 month ago

@xurui-c it's a self-hosted Sentry. We are also seeing this issue regularly now. It seems some config changes need to be made in snuba. Let us know if there is any such configuration we can refer to.

nikhars commented 1 month ago

Hi. The snuba config that disables the allocation policy on self-hosted can be found here. You can see that the allocation policy is marked as False there, so it shouldn't be active.

  1. Can you validate what value you are seeing in the configuration file of your environment? Is that value set to True or False?
  2. Do you have a mechanism where you override certain values in your environment? If so, can you turn the option to False and see if you still see the issue?
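One way to answer the first question is to scan the settings file that your snuba containers actually load for lines mentioning the allocation policy. A minimal sketch; `find_setting_lines` is a hypothetical helper, and both the file path and the keyword to search for are assumptions you would need to adapt, since the exact setting name is not quoted in this thread.

```python
def find_setting_lines(settings_text, keyword):
    """Return (line number, stripped line) pairs for non-comment lines mentioning keyword."""
    matches = []
    for number, line in enumerate(settings_text.splitlines(), start=1):
        stripped = line.strip()
        # Ignore blank lines and Python-style comments.
        if not stripped or stripped.startswith("#"):
            continue
        if keyword.lower() in stripped.lower():
            matches.append((number, stripped))
    return matches


# Example call (path and keyword are placeholders):
# with open("/path/to/your/snuba/settings.py") as fh:
#     print(find_setting_lines(fh.read(), "allocation"))
```

If the same keyword shows up in more than one file or in an environment override, the last value loaded is the one that takes effect, which is what the second question is probing.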

getsantry[bot] commented 1 week ago

This issue has gone three weeks without activity. In another week, I will close it.

But! If you comment or otherwise update it, I will reset the clock, and if you remove the label Waiting for: Community, I will leave it alone ... forever!