open-telemetry / opentelemetry-js-contrib

OpenTelemetry instrumentation for JavaScript modules
https://opentelemetry.io
Apache License 2.0
709 stars 531 forks source link

mysql2 instrumentation always includes values in db.statement #1758

Open dyllandry opened 1 year ago

dyllandry commented 1 year ago

I'm not confident this is a bug, and I'm unsure what behaviour I should expect by default, so I'm asking this as an open question/discussion first to clarify.

I've just setup @opentelemetry/auto-instrumentations-node and can view traces created the mysql2 auto-instrumentation. I've also noticed it's included the entire database statement in every statement. Is this intended default behaviour for the mysql2 instrumentation?

We are using sequelize as an orm, we aren't writing any of our own sql statements. I know static statements are not sanitized, but I assume sequelize is using parameterized statements too.

I saw in the mysql (not mysql2) docs that there's an enhancedDatabaseReporting configuration option that is supposed to add query parameters to db.statement. This makes me think that the parameters shouldn't be there by default, but that is a different mysql module after all. That option doesn't exist for the mysql2 instrumentation, so maybe this module does add them by default.

macno commented 11 months ago

Even if I see the utility to have the full statement available in the span, I think it should not be enable by default, for at least a couple of reasons:

  1. values can be sensitive
  2. values size

beside how the statement is generated (by an ORM or by hand) I'd expect to see the placeholders and not the values.

i.e. executing the following script, db.statement should be insert into test_blob (data) values (?)

const sdkNode = require('@opentelemetry/sdk-node');
const exporterOtel = require('@opentelemetry/exporter-trace-otlp-grpc');
const resources = require('@opentelemetry/resources');
const instrumentationMysql2 = require('@opentelemetry/instrumentation-mysql2');
const api = require('@opentelemetry/api');
const fs = require('fs/promises');
const sdk = new sdkNode.NodeSDK({
  traceExporter: new exporterOtel.OTLPTraceExporter(),
  instrumentations: [new instrumentationMysql2.MySQL2Instrumentation()],
  resourceDetectors: [resources.envDetector]
});

sdk.start();

const mysql = require('mysql2/promise');

const tracer = api.trace.getTracer('mysql2-test');

let connection;
async function run() {
  connection = await mysql.createConnection({
    host: process.env.FW_DB_HOST,
    user: process.env.FW_DB_USER,
    password: process.env.FW_DB_PASSWORD,
    database: process.env.FW_DB_NAME
  });
  await connection.execute(
    'create table if not exists test_blob (id int unsigned not null auto_increment primary key, data blob not null )'
  );
  const sql = 'insert into test_blob (data) values (?)';
  const data = await fs.readFile(__filename, 'utf8');
  const phs = [data];
  await connection.execute(sql, phs);
}

tracer.startActiveSpan('db test', async span => {
  run()
    .then(() => console.log('done'))
    .catch(console.error)
    .finally(() => {
      span.end();
      return connection.close();
    })
    .then(() => {
      return sdk.shutdown();
    });
});
dyladan commented 11 months ago

If you're using the parameterized API the values should not be in the statement. This is unintended behavior that should be fixed.