DataDog / dd-trace-rb

Datadog Tracing Ruby Client
https://docs.datadoghq.com/tracing/

Handling "payload too large" because of huge INSERT SQLs created by activerecord-import #1750

Open mizukami234 opened 2 years ago

mizukami234 commented 2 years ago

Hi, I'm using the activerecord integration. I looked into some missing traces and finally found that they contained a huge INSERT SQL statement (nearly 5 MiB) created by the activerecord-import gem, and had been dropped as "payload too large".

I tried the partial_flush option, but it did not resolve the issue; the same message still appears in debug mode. Isn't that option supposed to handle cases like this?

Or is there another solution? (I'm looking for something like a silencing feature, because I don't need to see this SQL's shape in the Datadog UI.)

PS

My environment: ddtrace 0.53.0, rails 6.1.4.1, ruby 2.7
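One possible workaround (not something the reporter tried) is to trim the oversized SQL resource before the span is flushed. The sketch below is plain Ruby, assuming nothing about ddtrace internals; in ddtrace 0.x, logic like this could plausibly be attached via the documented processing pipeline (`Datadog::Pipeline.before_flush` with a `Datadog::Pipeline::SpanProcessor`), shown only as a comment here.

```ruby
# Hypothetical helper: cap a span's resource string at a fixed byte budget.
# In ddtrace 0.x this could be wired up roughly like:
#   Datadog::Pipeline.before_flush(
#     Datadog::Pipeline::SpanProcessor.new do |span|
#       span.resource = truncate_resource(span.resource)
#     end
#   )
MAX_RESOURCE_BYTES = 5_000 # arbitrary budget chosen for illustration

def truncate_resource(resource, limit = MAX_RESOURCE_BYTES)
  return resource if resource.bytesize <= limit

  # Keep the head of the statement (the table and column list are the useful
  # part) and mark the cut. byteslice is safe here because SQL text is ASCII.
  resource.byteslice(0, limit) + '... [truncated]'
end

huge_sql = 'INSERT INTO "mytable" ("c1") VALUES ' + (['(1)'] * 100_000).join(', ')
puts truncate_resource(huge_sql).bytesize
```

The head of an INSERT is kept because it still identifies the query shape in the UI, which is all the reporter says they need.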

delner commented 2 years ago

Quick theory off the top of my head: the SQL instrumentation tags the query on the span, and that tag exceeds a size limit as it's sent to the agent, hence the error.

Perhaps the answer is to put some kind of size limit on SQL query tags, or more generally tags?
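delner's more general suggestion could be sketched as a pass over all of a span's string tags. This is illustrative plain Ruby only; the tag structure and the limit are assumptions, not ddtrace API.

```ruby
# Sketch of a general tag-size cap, assuming tags are a simple Hash of
# String keys to values. TAG_BYTE_LIMIT is hypothetical, not a real
# ddtrace setting.
TAG_BYTE_LIMIT = 4_096

def cap_tags(tags, limit = TAG_BYTE_LIMIT)
  tags.transform_values do |value|
    value.is_a?(String) && value.bytesize > limit ? value.byteslice(0, limit) : value
  end
end

tags = {
  'sql.query' => 'SELECT 1 ' * 1_000, # ~9 KB string, gets capped
  'env'       => 'development'        # small, left untouched
}
capped = cap_tags(tags)
```

Capping every string tag (rather than only SQL) would also protect against other integrations attaching oversized values.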

mizukami234 commented 2 years ago

@delner

Perhaps the answer is to put some kind of size limit on SQL query tags, or more generally tags?

I don't know how these approaches might affect other traces, but both look good to me for now.

marcotc commented 2 years ago

@mizukami234, to aid our discussion, do you know how long this query actually is (in bytes or characters)?

mizukami234 commented 2 years ago

@marcotc

1MiB ~ 10MiB in my case. It easily grows bigger.

My simplified debug log is shown below. The Resource field is extremely large: the VALUES clause is nearly 160 bytes per record, and the number of records can be 5,000 ~ 50,000, so the resource potentially grows as O(columns × records).

myapp-sidekiq-1  |  Name: postgres.query
myapp-sidekiq-1  | Span ID: 1430865000297426775
myapp-sidekiq-1  | Parent ID: 3737102368397088761
myapp-sidekiq-1  | Trace ID: 4195826364147550648
myapp-sidekiq-1  | Type: sql
myapp-sidekiq-1  | Service: myapp-postgres
myapp-sidekiq-1  | Resource: INSERT INTO "mytable" ("column1", ...,"columnN") VALUES (value11, ..., value1N), ..., (valueM1, ..., valueMN) RETURNING "id"
myapp-sidekiq-1  | Error: 0
myapp-sidekiq-1  | Start: 1636275862504337664
myapp-sidekiq-1  | End: 1636275862606366976
myapp-sidekiq-1  | Duration: 0.10204320799994093
myapp-sidekiq-1  | Allocations: 1382
myapp-sidekiq-1  | Tags: [
myapp-sidekiq-1  |    env => development,
myapp-sidekiq-1  |    peer.service => myapp-postgres,
myapp-sidekiq-1  |    active_record.db.vendor => postgres,
myapp-sidekiq-1  |    active_record.db.name => myapp_base,
myapp-sidekiq-1  |    out.host => mydbhost]
myapp-sidekiq-1  | Metrics: [ out.port => 5432.0],
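A quick back-of-envelope check of the reported numbers: ~160 bytes per VALUES tuple times 5,000 to 50,000 records does land in the reported 1 MiB ~ 10 MiB range.

```ruby
# Estimate the resource size from the figures given in the report.
BYTES_PER_RECORD = 160 # ~160 bytes per VALUES tuple, per the report

def resource_mib(records)
  (records * BYTES_PER_RECORD) / (1024.0 * 1024.0)
end

puts format('%.2f MiB', resource_mib(5_000))  # => 0.76 MiB (lower bound)
puts format('%.2f MiB', resource_mib(50_000)) # => 7.63 MiB (upper bound)
```

Either end of that range comfortably exceeds typical per-payload limits, consistent with the "payload too large" drops.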
sco11morgan commented 2 years ago

@mizukami234 I don't see any tracing in activerecord-import. Perhaps you have the Datadog mysql2 integration configured? It does trace SQL.

nvm. I missed your example above that had postgres.query.

Perhaps the answer is to put some kind of size limit on SQL query tags, or more generally tags?

@delner had a good idea with limiting the SQL. What's a good limit, 4k?

sco11morgan commented 2 years ago

Other developers expect 200k SQL payloads, so we'll have to make truncation optional.

https://github.com/DataDog/dd-trace-rb/issues/1878
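Making truncation optional, as suggested above, could be as simple as treating a nil limit as "no truncation". This is a sketch only; `sql_resource_limit` is a hypothetical name, not a real ddtrace setting.

```ruby
# Optional truncation: a nil limit disables it entirely, so users who rely
# on full 200k SQL payloads keep them unchanged.
def apply_limit(sql, sql_resource_limit)
  return sql if sql_resource_limit.nil? || sql.bytesize <= sql_resource_limit

  sql.byteslice(0, sql_resource_limit)
end

full = apply_limit('SELECT 1', nil)      # nil limit: passes through untouched
cut  = apply_limit('a' * 200_000, 4_096) # explicit limit: capped at 4 KiB
```

Defaulting the limit to nil would preserve today's behavior, with opt-in truncation for users hitting the payload cap.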

tonybruess commented 9 months ago

I'm seeing this as well