getsentry / self-hosted

Sentry, feature-complete and packaged up for low-volume deployments and proofs-of-concept
https://develop.sentry.dev/self-hosted/

Session Replays not visible at all (self-hosted) - not sent? #2846

Open appcoders opened 7 months ago

appcoders commented 7 months ago

Self-Hosted Version

24.2.0

CPU Architecture

x86_64

Docker Version

25.0.3

Docker Compose Version

2.24.6

Steps to Reproduce

  1. Installed the SDK in a React app (a verification sketch follows this list):

    import React from 'react';
    import {
        useLocation,
        useNavigationType,
        createRoutesFromChildren,
        matchRoutes,
    } from 'react-router-dom';
    import * as Sentry from '@sentry/react';

    Sentry.init({
        dsn: brandParameters.sentry.dsn,
        integrations: [
            Sentry.reactRouterV6BrowserTracingIntegration({
                useEffect: React.useEffect,
                useLocation,
                useNavigationType,
                createRoutesFromChildren,
                matchRoutes,
            }),
            Sentry.replayIntegration(),
        ],
        environment,
        tracePropagationTargets: brandParameters.sentry.tracingOrigins,
        debug: true,
        tracesSampleRate: 1.0,
        replaysSessionSampleRate: 1.0,
        replaysOnErrorSampleRate: 1.0,
    });

  2. Opened the website running the React app
  3. Interacted with the website
  4. Waited for the data to become visible under Replay Sessions in the Sentry web UI
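
A quick way to verify step 1, assuming @sentry/react v7.99+ where the client exposes getIntegrationByName (older v7 releases use getIntegrationById('Replay') instead): confirm from the browser console that the Replay integration is registered, and note the active replay ID so it can be matched against server-side records later.

// Run in the browser console once the app has initialized Sentry
// (assumes the Sentry namespace is reachable, e.g. exposed on window for debugging).
const client = Sentry.getCurrentHub().getClient();
const replay = client && client.getIntegrationByName('Replay');

if (!replay) {
    console.warn('Replay integration is not registered on this client');
} else {
    // This ID should match the replay_id in the outgoing envelopes and in
    // the server-side clickhouse/postgres rows.
    console.log('Active replay ID:', replay.getReplayId());
}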

Expected Result

Viewable Replay Sessions on Web

Actual Result

Browser Console log:

Sentry Logger [info]: [Replay] Loading existing session
Sentry Logger [info]: [Replay] Starting replay in session mode
Sentry Logger [info]: [Replay] Using compression worker
Sentry Logger [info]: [Replay] Pausing replay
Sentry Logger [warn]: [Replay] Received replay event after session expired.
Sentry Logger [error]: [Replay] Attempting to finish replay event after session expired.

Event ID

No response

appcoders commented 7 months ago

Addendum:

New Browser Session: logs at 17:47

17:32:21.269 Sentry Logger [log]: Integration installed: Replay
17:32:21.293 Sentry Logger [info]: [Replay] Loading existing session
17:32:21.293 Sentry Logger [info]: [Replay] Starting replay in session mode
17:32:21.293 Sentry Logger [info]: [Replay] Using compression worker
17:38:29.620 Sentry Logger [info]: [Replay] Pausing replay
17:38:29.621 Sentry Logger [warn]: [Replay] Received replay event after session expired.

clicked in window

17:47:49.794 Sentry Logger [info]: [Replay] Document has become active, but session has expired
17:47:49.894 Sentry Logger [info]: [Replay] Stopping Replay triggered by refresh session
17:47:49.895 Sentry Logger [info]: [Replay] Destroying compression worker
17:47:49.895 Sentry Logger [info]: [Replay] Creating new session
17:47:49.895 Sentry Logger [info]: [Replay] Starting replay in session mode
17:47:49.896 Sentry Logger [info]: [Replay] Using compression worker

What happens to the session if the browser is closed? When is the session flushed? Is there a flush event logged?
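
One way to test the flush behavior by hand: the Replay integration exposes a flush() method that sends whatever is currently buffered, so you can trigger it from the console and watch the network tab for the outgoing envelope (same v7.99+ assumption as the sketch above).

// Force-send the current replay buffer instead of waiting for the periodic flush.
const replay = Sentry.getCurrentHub().getClient().getIntegrationByName('Replay');
if (replay) {
    replay.flush().then(() => console.log('Replay buffer flushed'));
}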

The database is empty:

postgres=# select * from replays_replayrecordingsegment;
 id | project_id | replay_id | file_id | sequence_id | date_added | size
----+------------+-----------+---------+-------------+------------+------
(0 rows)

postgres=# select * from feedback_feedback;
 id | project_id | replay_id | url | message | feedback_id | date_added | data | organization_id | environment_id
----+------------+-----------+-----+---------+-------------+------------+------+-----------------+----------------
(0 rows)

Browser sends data:

{"event_id":"5c544d52eba643bbaf2d25a4c21e7b4e","sent_at":"2024-02-29T16:36:09.449Z","sdk":{"name":"sentry.javascript.react","version":"7.103.0"}}
{"type":"replay_event"}
{"type":"replay_event","replay_start_timestamp":1709224340.872,"timestamp":1709224569.442,"error_ids":[],"trace_ids":[],"urls":[],"replay_id":"5c544d52eba643bbaf2d25a4c21e7b4e","segment_id":8,"replay_type":"session","request":{"url":"https://redacted","headers":{"User-Agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:124.0) Gecko/20100101 Firefox/124.0"}},"event_id":"5c544d52eba643bbaf2d25a4c21e7b4e","environment":"production","release":"1.0.0","sdk":{"integrations":["InboundFilters","FunctionToString","TryCatch","Breadcrumbs","GlobalHandlers","LinkedErrors","Dedupe","HttpContext","BrowserTracing","Replay"],"name":"sentry.javascript.react","version":"7.103.0"},"user":{"id":redacted,"username":"Markus Schicker","email":"markus@redacted"},"platform":"javascript"}
{"type":"replay_recording","length":199}
{"segment_id":8}

and gets a 200 with an ID back:

{"id":"5c544d52eba643bbaf2d25a4c21e7b4e"}

Docker logs:

sentry-self-hosted-snuba-replays-consumer-1:

2024-02-29 15:57:05,732 Initializing Snuba...
2024-02-29 15:57:07,623 Snuba initialization took 1.8911404259997653s
2024-02-29 15:57:07,973 Initializing Snuba...
2024-02-29 15:57:09,872 Snuba initialization took 1.8995200659992406s
2024-02-29 15:57:09,879 Consumer Starting
2024-02-29 15:57:09,880 Checking Clickhouse connections...
2024-02-29 15:57:09,885 Successfully connected to Clickhouse: cluster_name=None
2024-02-29 15:57:09,885 librdkafka log level: 6
2024-02-29 15:57:11,143 New partitions assigned: {Partition(topic=Topic(name='ingest-replay-events'), index=0): 0}

sentry-self-hosted-ingest-replay-recordings-1:

Updating certificates in /etc/ssl/certs...
0 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...
done.
sentry/requirements.txt is deprecated, use sentry/enhance-image.sh - see https://github.com/getsentry/self-hosted#enhance-sentry-image
15:41:12 [INFO] arroyo.processing.processor: New partitions assigned: {Partition(topic=Topic(name='ingest-replay-recordings'), index=0): 0}

sentry-self-hosted-kafka-1:

===> User
uid=0(root) gid=0(root) groups=0(root)
===> Configuring ...
===> Running preflight checks ...
===> Check if /var/lib/kafka/data is writable ...
===> Check if Zookeeper is healthy ...
[main] INFO org.apache.zookeeper.ZooKeeper - Client environment:zookeeper.version=3.5.8-f439ca583e70862c3068a1f2a7d4d068eec33315, built on 05/04/2020 15:53 GMT
[main] INFO org.apache.zookeeper.ZooKeeper - Client environment:host.name=ca4a6c63181a
[main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.version=1.8.0_222
[main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.vendor=Azul Systems, Inc.
[main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.home=/usr/lib/jvm/zulu-8-amd64/jre
[main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.class.path=/etc/confluent/docker/docker-utils.jar
[main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
[main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.io.tmpdir=/tmp
[main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.compiler=<NA>
[main] INFO org.apache.zookeeper.ZooKeeper - Client environment:os.name=Linux
[main] INFO org.apache.zookeeper.ZooKeeper - Client environment:os.arch=amd64
[main] INFO org.apache.zookeeper.ZooKeeper - Client environment:os.version=5.15.0-97-generic
[main] INFO org.apache.zookeeper.ZooKeeper - Client environment:user.name=root
[main] INFO org.apache.zookeeper.ZooKeeper - Client environment:user.home=/root
[main] INFO org.apache.zookeeper.ZooKeeper - Client environment:user.dir=/
[main] INFO org.apache.zookeeper.ZooKeeper - Client environment:os.memory.free=236MB
[main] INFO org.apache.zookeeper.ZooKeeper - Client environment:os.memory.max=3554MB
[main] INFO org.apache.zookeeper.ZooKeeper - Client environment:os.memory.total=240MB
[main] INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=zookeeper:2181 sessionTimeout=40000 watcher=io.confluent.admin.utils.ZookeeperConnectionWatcher@65b3120a
[main] INFO org.apache.zookeeper.common.X509Util - Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation
[main] INFO org.apache.zookeeper.ClientCnxnSocket - jute.maxbuffer value is 4194304 Bytes
[main] INFO org.apache.zookeeper.ClientCnxn - zookeeper.request.timeout value is 0. feature enabled=
[main-SendThread(zookeeper:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server zookeeper/172.19.0.2:2181. Will not attempt to authenticate using SASL (unknown error)
[main-SendThread(zookeeper:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established, initiating session, client: /172.19.0.11:55520, server: zookeeper/172.19.0.2:2181
[main-SendThread(zookeeper:2181)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server zookeeper/172.19.0.2:2181, sessionid = 0x10005d482010000, negotiated timeout = 40000
[main] INFO org.apache.zookeeper.ZooKeeper - Session: 0x10005d482010000 closed
[main-EventThread] INFO org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x10005d482010000
===> Launching ...
===> Launching kafka ...
[2024-02-29 15:40:22,279] INFO Registered kafka:type=kafka.Log4jController MBean (kafka.utils.Log4jControllerRegistration$)
[2024-02-29 15:40:22,626] WARN The package io.confluent.support.metrics.collectors.FullCollector for collecting the full set of support metrics could not be loaded, so we are reverting to anonymous, basic metric collection. If you are a Confluent customer, please refer to the Confluent Platform documentation, section Proactive Support, on how to activate full metrics collection. (io.confluent.support.metrics.KafkaSupportConfig)
[2024-02-29 15:40:22,626] WARN The support metrics collection feature ("Metrics") of Proactive Support is disabled. (io.confluent.support.metrics.SupportedServerStartable)
[2024-02-29 15:40:23,595] INFO Starting the log cleaner (kafka.log.LogCleaner)
[2024-02-29 15:40:23,635] INFO [kafka-log-cleaner-thread-0]: Starting (kafka.log.LogCleaner)
[2024-02-29 15:40:23,791] INFO Awaiting socket connections on 0.0.0.0:9092. (kafka.network.Acceptor)
[2024-02-29 15:40:23,811] INFO [SocketServer brokerId=1001] Created data-plane acceptor and processors for endpoint : EndPoint(0.0.0.0,9092,ListenerName(PLAINTEXT),PLAINTEXT) (kafka.network.SocketServer)
[2024-02-29 15:40:23,811] INFO [SocketServer brokerId=1001] Started 1 acceptor threads for data-plane (kafka.network.SocketServer)
[2024-02-29 15:40:23,888] INFO Creating /brokers/ids/1001 (is it secure? false) (kafka.zk.KafkaZkClient)
[2024-02-29 15:40:23,900] INFO Stat of the created znode at /brokers/ids/1001 is: 3678,3678,1709221223895,1709221223895,1,0,0,72064004310237185,180,0,3678
 (kafka.zk.KafkaZkClient)
[2024-02-29 15:40:23,900] INFO Registered broker 1001 at path /brokers/ids/1001 with addresses: ArrayBuffer(EndPoint(kafka,9092,ListenerName(PLAINTEXT),PLAINTEXT)), czxid (broker epoch): 3678 (kafka.zk.KafkaZkClient)
[2024-02-29 15:40:24,066] INFO [/config/changes-event-process-thread]: Starting (kafka.common.ZkNodeChangeNotificationListener$ChangeEventProcessThread)
[2024-02-29 15:40:24,186] INFO [SocketServer brokerId=1001] Started data-plane processors for 1 acceptors (kafka.network.SocketServer)
appcoders commented 7 months ago

After testing with Firefox instead of Chromium, some sessions come through. There are no (visible) errors on Chromium, so what could be the problem with data transmission?
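
One browser-specific component worth ruling out is Replay's compression web worker ("Using compression worker" in the logs above), since worker creation can be blocked by CSP rules or extensions in one browser but not another. A minimal sketch that turns compression off for testing, assuming the v7 useCompression option:

Sentry.init({
    dsn: brandParameters.sentry.dsn,
    integrations: [
        // Send uncompressed replay payloads; if replays then arrive from
        // Chromium as well, the compression worker was the culprit.
        Sentry.replayIntegration({ useCompression: false }),
    ],
    replaysSessionSampleRate: 1.0,
    replaysOnErrorSampleRate: 1.0,
});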

csvan commented 7 months ago

Possibly related: https://github.com/getsentry/self-hosted/issues/2841

hubertdeng123 commented 7 months ago

Using Firefox instead of Chromium shouldn't pose an issue AFAIK. Do you have certain adblockers/network filtering rules on your Chromium browser?
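
If an ad blocker does turn out to be involved, the usual workaround is the SDK's tunnel option, which routes envelopes through a first-party endpoint instead of contacting the Sentry host directly from the browser. A sketch of the client side, with /sentry-tunnel as a hypothetical route (a matching server-side forwarder is sketched later in this thread):

Sentry.init({
    dsn: brandParameters.sentry.dsn,
    // A first-party path is not matched by the blocklists that ad blockers
    // apply to well-known Sentry hostnames.
    tunnel: '/sentry-tunnel',
    integrations: [Sentry.replayIntegration()],
    replaysSessionSampleRate: 1.0,
    replaysOnErrorSampleRate: 1.0,
});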

mariusnergard commented 6 months ago

Having the same issue on Self-Hosted Version 24.3.0

azaslavsky commented 6 months ago

Do the sentry-web container logs show any errors when attempting to upload a replay in this manner?

getsantry[bot] commented 5 months ago

This issue has gone three weeks without activity. In another week, I will close it.

But! If you comment or otherwise update it, I will reset the clock, and if you remove the label Waiting for: Community, I will leave it alone ... forever!


"A weed is but an unloved flower." ― Ella Wheeler Wilcox πŸ₯€

kraegpoeth commented 5 months ago

I have a similar issue: I have content in the clickhouse database:

root@sentry-clickhouse-0:/# clickhouse-client
ClickHouse client version 21.8.13.6 (official build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 21.8.13 revision 54449.

sentry-clickhouse :) SELECT count(*) FROM replays_local

SELECT count(*)
FROM replays_local

Query id: ec86cbda-3ecb-40f2-abc9-a24ced1752f8

β”Œβ”€count()─┐
β”‚      20 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

1 rows in set. Elapsed: 0.038 sec. 

But nothing in postgres:

sentry=# select count(*) from replays_replayrecordingsegment;
 count 
-------
     0
(1 row)

I see the replays in the Sentry dashboard, but when I try to watch them I get a "Replay not found" screen.

I reinstalled Sentry v24.2.0 as a fresh install from this chart: https://artifacthub.io/packages/helm/sentry/sentry. Everything else in Sentry is running fine!

hubertdeng123 commented 5 months ago

@kraegpoeth you are using an unsupported version of self-hosted here, so we'll be unable to help you. However, it is a huge red flag that nothing is appearing in postgres, as that is where the replay blobs are stored. Clickhouse simply stores the replay event metadata.

kraegpoeth commented 5 months ago

@hubertdeng123 Thanks for replying - yeah, I know; my client needs to host on premises in order to be allowed to use Sentry. I know I can't get support, but do you know where and how I should look to figure out why the postgres table is empty? I would greatly appreciate any pointers off the top of your head πŸ™ Is postgres receiving data directly from the replay ingest consumer, or what is the data flow?

kraegpoeth commented 5 months ago

I think it could maybe be related to this error? https://github.com/getsentry/self-hosted/issues/2171 I also have a bunch of "ERROR: duplicate key value violates unique constraint "sentry_grouprelease_group_id_release_id_envi"" errors in my postgres logs. Could it be because I restored via the sentry backup global export/import? I do still have all the quick-fix onboarding steps visible after doing that on a clean install. I will try a new Sentry install without restoring data via the backup import CLI and see whether that is the culprit behind replays not appearing in postgres.

UPDATE: Tried a completely fresh install - still the same issue. No errors in sentry-ingest-replay-recordings or sentry-snuba-replays-consumer. I have 27 items in the clickhouse replay table but nothing in postgres. :(

hubertdeng123 commented 5 months ago

I'd advise you to take this up with the repo that maintains the helm charts setup of self-hosted. The "duplicate key value violates unique constraint" messages should just be warnings and are not indicative of an actual error, as these are handled properly by Sentry.

wellerchen commented 5 months ago

> [quotes @kraegpoeth's "I have a similar issue" comment above in full]

Hello, is it settled? I have the same problem.

StrangeWill commented 5 months ago

Having the same problem as above: clickhouse has metadata, postgres has nothing, no immediately obvious log output in docker, version 24.5.0.dev0. Will continue digging.

Edit: I'm going to throw something. Replay finally has data after fighting with it (and tunnel configs, moving away from rust consumers, etc.) for a bit. Looks like I'm good, ignore me.
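
For anyone else going the tunnel route mentioned above: the server side is just an endpoint that forwards the raw envelope to the Sentry host. A minimal Express sketch following the pattern in Sentry's troubleshooting docs; the /sentry-tunnel route, SENTRY_HOST, and project ID are hypothetical placeholders, and this is not an official implementation:

const express = require('express');

// Placeholders: replace with your self-hosted hostname and project ID(s).
const SENTRY_HOST = 'sentry.example.com';
const KNOWN_PROJECT_IDS = ['1'];

const app = express();

// Envelopes arrive as a raw byte stream, not JSON.
app.post('/sentry-tunnel', express.raw({ type: () => true, limit: '20mb' }), async (req, res) => {
    try {
        const envelope = req.body;
        // The first newline-delimited line of an envelope is a JSON header
        // containing the DSN the SDK was initialized with.
        const header = JSON.parse(envelope.slice(0, envelope.indexOf('\n')).toString());
        const dsn = new URL(header.dsn);
        const projectId = dsn.pathname.replace('/', '');

        if (dsn.hostname !== SENTRY_HOST || !KNOWN_PROJECT_IDS.includes(projectId)) {
            return res.status(403).json({ error: 'Invalid DSN' });
        }

        // Forward the untouched envelope to the real ingestion endpoint
        // (requires Node 18+ for the global fetch).
        const upstream = await fetch('https://' + SENTRY_HOST + '/api/' + projectId + '/envelope/', {
            method: 'POST',
            body: envelope,
        });
        res.status(upstream.status).end();
    } catch (e) {
        res.status(400).json({ error: 'Malformed envelope' });
    }
});

app.listen(3000);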

dievri commented 4 months ago

Facing the same issue with Sentry v24.2.0 installed with the helm chart https://artifacthub.io/packages/helm/sentry/sentry:

sentry-clickhouse :) SELECT count(*) FROM replays_local

SELECT count(*)
FROM replays_local

Query id: 9bffed26-a264-40ca-8d00-c0d99e173baf

β”Œβ”€count()─┐
β”‚  761698 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

1 rows in set. Elapsed: 0.003 sec. 

But postgres is empty:

sentry=#  select * from replays_replayrecordingsegment;
 id | project_id | replay_id | file_id | sequence_id | date_added | size 
----+------------+-----------+---------+-------------+------------+------
(0 rows)
hubertdeng123 commented 4 weeks ago

We don't officially support helm charts deployments of self-hosted Sentry.

K-Jean commented 2 weeks ago

Hello! I'm facing the same issue with Sentry self-hosted, without the helm chart:

root@7733e2e5c4d3:/# clickhouse-client
ClickHouse client version 23.8.11.29.altinitystable (altinity build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 23.8.11 revision 54465.

Warnings:
 * Linux transparent hugepages are set to "always". Check /sys/kernel/mm/transparent_hugepage/enabled

7733e2e5c4d3 :) SELECT count(*) FROM replays_local

SELECT count(*)
FROM replays_local

Query id: 98347788-042f-4d64-a26e-13aaeb82f63d

β”Œβ”€count()─┐
β”‚   76904 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

1 row in set. Elapsed: 0.003 sec. 

and postgresql:

root@9297663d7f19:/# psql -U postgres
psql (14.11 (Debian 14.11-1.pgdg120+2))
Type "help" for help.

sentry=# select * from replays_replayrecordingsegment;
 id | project_id | replay_id | file_id | sequence_id | date_added | size 
----+------------+-----------+---------+-------------+------------+------
(0 rows)