vercel / next.js

The React Framework
https://nextjs.org
MIT License

The `with-opentelemetry` example doesn't send telemetry data #61975

Open YunosukeY opened 8 months ago

YunosukeY commented 8 months ago

Verify canary release

Provide environment information

yarn run v1.22.19
$ /Users/yamada/work/samples/with-opentelemetry-app/node_modules/.bin/next info

Operating System:
  Platform: darwin
  Arch: arm64
  Version: Darwin Kernel Version 23.0.0: Fri Sep 15 14:42:57 PDT 2023; root:xnu-10002.1.13~1/RELEASE_ARM64_T8112
Binaries:
  Node: 18.17.0
  npm: 9.6.7
  Yarn: 1.22.19
  pnpm: N/A
Relevant Packages:
  next: 14.1.1-canary.51 // Latest available version is detected (14.1.1-canary.51).
  eslint-config-next: N/A
  react: 18.2.0
  react-dom: 18.2.0
  typescript: 4.7.4
Next.js Config:
  output: N/A

✨  Done in 1.72s.

Which example does this report relate to?

with-opentelemetry

What browser are you using? (if relevant)

Microsoft Edge for Business Version 121.0.2277.112 (Official build) (arm64)

How are you deploying your application? (if relevant)

yarn dev

Describe the Bug

The example doesn't send telemetry data.

This looks like a @vercel/otel bug. I confirmed this by adding @vercel/otel to my own Next.js application, without using the example, and it didn't send telemetry data either. https://nextjs.org/docs/app/building-your-application/optimizing/open-telemetry#using-vercelotel

I also confirmed that telemetry data is sent with manual configuration. In this case default spans are sent. https://nextjs.org/docs/app/building-your-application/optimizing/open-telemetry#manual-opentelemetry-configuration

Expected Behavior

The example sends telemetry data with @vercel/otel.

To Reproduce

  1. Launch an OTel collector using opentelemetry-collector-dev-setup as a preparation.

    git clone https://github.com/vercel/opentelemetry-collector-dev-setup.git
    cd opentelemetry-collector-dev-setup
    docker-compose up -d

  2. Create and launch an application using the example.

    yarn create next-app --example with-opentelemetry with-opentelemetry-app
    cd with-opentelemetry-app
    yarn dev

  3. Open localhost:3000 in a browser.

  4. Check Jaeger in the browser.

YunosukeY commented 8 months ago

I resolved the problem by calling registerOTel as follows.

import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";
import { SimpleSpanProcessor } from "@opentelemetry/sdk-trace-node";
import { registerOTel } from "@vercel/otel";

export function register() {
  registerOTel({
    serviceName: "next-app",
    spanProcessors: [new SimpleSpanProcessor(new OTLPTraceExporter())],
  });
}

Is this a @vercel/otel issue? Or is it a documentation/example issue?

sendmenas commented 8 months ago

I can confirm that I had the same issue; it does not work as described in the setup documentation. @YunosukeY's solution worked.

iaraknes commented 8 months ago

Setting the following env vars worked for me.

    OTEL_EXPORTER_OTLP_TRACES_ENDPOINT: "http://jaeger-hostname:4318/v1/traces"
    OTEL_EXPORTER_OTLP_TRACES_PROTOCOL: "http/protobuf"

I also found the following env-variable to be immensely useful:

OTEL_LOG_LEVEL=debug

lingyan commented 8 months ago

Thanks @iaraknes! In my case, I was able to add the following env var and that worked for me:

OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318

I'm surprised this is not used by default, as documented in: https://opentelemetry.io/docs/specs/otel/protocol/exporter/#configuration-options

Also, manual instrumentation with @opentelemetry libraries works fine without this env variable. It's only @vercel/otel requiring this env var to be set.

tansanDOTeth commented 7 months ago

> Setting the following env vars worked for me.
>
>     OTEL_EXPORTER_OTLP_TRACES_ENDPOINT: "http://jaeger-hostname:4318/v1/traces"
>     OTEL_EXPORTER_OTLP_TRACES_PROTOCOL: "http/protobuf"
>
> I also found the following env-variable to be immensely useful:
>
> OTEL_LOG_LEVEL=debug

Which project did you set this in? The collector or the Next.js app?

tansanDOTeth commented 7 months ago

> I resolved the problem by calling registerOTel as follows.
>
> import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";
> import { SimpleSpanProcessor } from "@opentelemetry/sdk-trace-node";
> import { registerOTel } from "@vercel/otel";
>
> export function register() {
>   registerOTel({
>     serviceName: "next-app",
>     spanProcessors: [new SimpleSpanProcessor(new OTLPTraceExporter())],
>   });
> }
>
> Is this a @vercel/otel issue? Or is it a documentation/example issue?

Do you have a screenshot of what to expect in the Jaeger UI? I'm not seeing anything with the serviceName, so I'm unsure whether I just set it up incorrectly.

iaraknes commented 7 months ago

> > Setting the following env vars worked for me.
> >
> >     OTEL_EXPORTER_OTLP_TRACES_ENDPOINT: "http://jaeger-hostname:4318/v1/traces"
> >     OTEL_EXPORTER_OTLP_TRACES_PROTOCOL: "http/protobuf"
> >
> > I also found the following env-variable to be immensely useful:
> >
> > OTEL_LOG_LEVEL=debug
>
> Which project did you set this in? the collector or the nextjs app?

I set it in the nextjs app.

tansanDOTeth commented 7 months ago

I was following the official documentation here: https://nextjs.org/docs/pages/building-your-application/optimizing/open-telemetry#testing-your-instrumentation

It didn't work with any newer versions. Only @vercel/otel: "^0.3.0" has correctly sent traces to the docker app. Every other version has not worked for me. I'm unsure why, but I hope this saves everyone else their time and sanity.

tansanDOTeth commented 7 months ago

> > > Setting the following env vars worked for me.
> > >
> > >     OTEL_EXPORTER_OTLP_TRACES_ENDPOINT: "http://jaeger-hostname:4318/v1/traces"
> > >     OTEL_EXPORTER_OTLP_TRACES_PROTOCOL: "http/protobuf"
> > >
> > > I also found the following env-variable to be immensely useful:
> > >
> > > OTEL_LOG_LEVEL=debug
> >
> > Which project did you set this in? the collector or the nextjs app?
>
> I set it in the nextjs app.

Ah okay! I figured it out, so I was using this docker-compose from Vercel (https://github.com/vercel/opentelemetry-collector-dev-setup) and it exposes it on localhost. I changed it to these values and the newer version worked! Thank you!

OTEL_EXPORTER_OTLP_TRACES_ENDPOINT="http://localhost:4318/v1/traces"
OTEL_EXPORTER_OTLP_TRACES_PROTOCOL="http/protobuf"
OTEL_LOG_LEVEL=debug

Gawdfrey commented 7 months ago

I was expecting more data reported in jaeger ui, but I am a noob at instrumentation.

image

Are you getting more data reported in your setup? I get one span every time I boot up the application, but no more data is added even though I click around in my app.

tansanDOTeth commented 7 months ago

> I was expecting more data reported in jaeger ui, but I am a noob at instrumentation. image
>
> Are you getting more data reported in your setup? I get one span everytime i boot up the application, but no more data is added even though I click around in my app.

Which version of Next.js are you using? 13.4+ has a lot more spans.

Gawdfrey commented 7 months ago

I am on 14.1.4 😅 I will try to see if I am able to send some custom spans. EDIT: That worked instantly, but not the other default spans that nextjs says they have 🤔

tansanDOTeth commented 7 months ago

> I am on 14.1.4 😅 I will try to see if I am able to send some custom spans. EDIT: That worked instantly, but not the other default spans that nextjs says they have 🤔

I should note: the default spans I experienced were in the page router

Gawdfrey commented 7 months ago

import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";
import { Resource } from "@opentelemetry/resources";
import { NodeSDK } from "@opentelemetry/sdk-node";
import { SimpleSpanProcessor } from "@opentelemetry/sdk-trace-node";
import { SEMRESATTRS_SERVICE_NAME } from "@opentelemetry/semantic-conventions";

const sdk = new NodeSDK({
  resource: new Resource({
    [SEMRESATTRS_SERVICE_NAME]: "my-app-name",
  }),
  spanProcessors: [new SimpleSpanProcessor(new OTLPTraceExporter())],
});

// Start the SDK so that the span processor is actually registered.
sdk.start();

Annoyingly using the normal SDK does not report more than 1 span either 😢 And yes, I am using the app router, so there probably is some underlying issue.

sudowoodo200 commented 7 months ago

Tried to follow @YunosukeY's solution, but got this. Does anyone else experience this?

 ⨯ ../node_modules/.pnpm/@opentelemetry+otlp-exporter-base@0.49.1_@opentelemetry+api@1.8.0/node_modules/@opentelemetry/otlp-exporter-base/build/esm/platform/browser/OTLPExporterBrowserBase.js (49:0) @ OTLPTraceExporter.OTLPExporterBrowserBase [as constructor]
 ⨯ An error occurred while loading instrumentation hook: navigator is not defined

sudowoodo200 commented 7 months ago

> Tried to follow @YunosukeY's solution, but got this. Does anyone else experience this?
>
>  ⨯ ../node_modules/.pnpm/@opentelemetry+otlp-exporter-base@0.49.1_@opentelemetry+api@1.8.0/node_modules/@opentelemetry/otlp-exporter-base/build/esm/platform/browser/OTLPExporterBrowserBase.js (49:0) @ OTLPTraceExporter.OTLPExporterBrowserBase [as constructor]
>  ⨯ An error occurred while loading instrumentation hook: navigator is not defined

It's being caused by the middleware.

plogistik commented 7 months ago

> > Tried to follow @YunosukeY's solution, but got this. Does anyone else experience this?
> >
> >  ⨯ ../node_modules/.pnpm/@opentelemetry+otlp-exporter-base@0.49.1_@opentelemetry+api@1.8.0/node_modules/@opentelemetry/otlp-exporter-base/build/esm/platform/browser/OTLPExporterBrowserBase.js (49:0) @ OTLPTraceExporter.OTLPExporterBrowserBase [as constructor]
> >  ⨯ An error occurred while loading instrumentation hook: navigator is not defined
>
> It's being caused by the middleware

Yes I'm experiencing the same behaviour. Have you found a solution @sudowoodo200?

Juulaps commented 7 months ago

> > > Tried to follow @YunosukeY's solution, but got this. Does anyone else experience this?
> > >
> > >  ⨯ ../node_modules/.pnpm/@opentelemetry+otlp-exporter-base@0.49.1_@opentelemetry+api@1.8.0/node_modules/@opentelemetry/otlp-exporter-base/build/esm/platform/browser/OTLPExporterBrowserBase.js (49:0) @ OTLPTraceExporter.OTLPExporterBrowserBase [as constructor]
> > >  ⨯ An error occurred while loading instrumentation hook: navigator is not defined
> >
> > It's being caused by the middleware
>
> Yes I'm experiencing the same behaviour. Have you found a solution @sudowoodo200?

I assume this is the case because the instrumentation is running as middleware in the edge runtime, which doesn't have access to these Node APIs. It is also discussed in this issue: https://github.com/vercel/next.js/issues/59413#issuecomment-1877744304.
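
The runtime check implied above can be sketched as follows. This is a minimal sketch, assuming that Next.js exposes the current runtime as `process.env.NEXT_RUNTIME` with the values `"nodejs"` and `"edge"`; the helper name `shouldLoadNodeInstrumentation` and the module name `instrumentation.node` are hypothetical and not taken from this thread.

```typescript
// Hypothetical helper: decide whether the Node-only OpenTelemetry setup
// (OTLP exporter, span processors) may be loaded in the current runtime.
// Assumption: Next.js reports the runtime as "nodejs" or "edge".
export function shouldLoadNodeInstrumentation(
  runtime: string | undefined,
): boolean {
  return runtime === "nodejs";
}

// Intended use inside instrumentation.ts, shown as a comment because it
// depends on the project layout. "./instrumentation.node" is a hypothetical
// module that would hold the registerOTel call with the Node exporter:
//
//   export async function register() {
//     if (shouldLoadNodeInstrumentation(process.env.NEXT_RUNTIME)) {
//       await import("./instrumentation.node");
//     }
//   }
```

Keeping the exporter behind a dynamic import means the edge bundle never evaluates the exporter module, which is where the `navigator is not defined` error is raised in the trace above.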

GabeInDevOps commented 6 months ago

> > > > Setting the following env vars worked for me.
> > > >
> > > >     OTEL_EXPORTER_OTLP_TRACES_ENDPOINT: "http://jaeger-hostname:4318/v1/traces"
> > > >     OTEL_EXPORTER_OTLP_TRACES_PROTOCOL: "http/protobuf"
> > > >
> > > > I also found the following env-variable to be immensely useful:
> > > >
> > > > OTEL_LOG_LEVEL=debug
> > >
> > > Which project did you set this in? the collector or the nextjs app?
> >
> > I set it in the nextjs app.
>
> Ah okay! I figured it out, so I was using this docker-compose from Vercel (https://github.com/vercel/opentelemetry-collector-dev-setup) and it exposes it on localhost. I changed it to these values and the newer version worked! Thank you!
>
> OTEL_EXPORTER_OTLP_TRACES_ENDPOINT="http://localhost:4318/v1/traces"
> OTEL_EXPORTER_OTLP_TRACES_PROTOCOL="http/protobuf"
> OTEL_LOG_LEVEL=debug

I can also confirm this fixed the issue for me. Thank you!

What I don't understand is why this works when running the Next app outside of Docker. When I run it with next dev in my terminal, I don't need to set these environment variables.

foot commented 2 months ago

We had issues sending custom spans, which seemed to be resolved by aligning the @opentelemetry/api version across the project in package.json

package.json (using overrides for npm)

  "overrides": {
    "@opentelemetry/api": "^1.9.0"
  }

instrumentation.ts

import { registerOTel } from "@vercel/otel";

export async function register() {
    registerOTel({
        serviceName: "my-app",
    });
}

Then custom spans started appearing alongside the base ones using trace.getTracer('nextjs-example').startActiveSpan etc
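
A custom span along these lines might look like the sketch below. It assumes `@opentelemetry/api` is installed at the single version pinned above and that `registerOTel` has already run; the helper name `withSpan` is hypothetical. Without a registered SDK the API falls back to a no-op tracer, so the code still runs but exports nothing.

```typescript
import { SpanStatusCode, trace } from "@opentelemetry/api";

// Tracer name taken from the comment above; any stable name works.
const tracer = trace.getTracer("nextjs-example");

// Hypothetical helper: run `work` inside an active span and always end it.
export async function withSpan<T>(
  name: string,
  work: () => Promise<T>,
): Promise<T> {
  return tracer.startActiveSpan(name, async (span) => {
    try {
      return await work();
    } catch (err) {
      // Mark the span as failed before rethrowing to the caller.
      span.recordException(err as Error);
      span.setStatus({ code: SpanStatusCode.ERROR });
      throw err;
    } finally {
      // A span that is never ended is never exported.
      span.end();
    }
  });
}
```

A route handler could then wrap its work as `await withSpan("load-user", () => loadUser(id))`, where `loadUser` stands in for the application's own function.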

suniastar commented 2 months ago

So I have the same issue as everyone here. My register looks like this:

import { registerOTel } from '@vercel/otel'

export function register() {
  registerOTel({
    serviceName: 'rh-micro-apps',
    instrumentationConfig: {
      fetch: {
        propagateContextUrls: ['example.com'],
      },
    },
    propagators: ['tracecontext', 'baggage'],
    traceSampler: 'always_on',
  })
}

and I used these environment variables for local development and testing:

OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317"
OTEL_EXPORTER_OTLP_PROTOCOL="grpc"

This setup works with a locally running jaeger all in one and it seems the env vars are registered correctly because I can see inside the jaeger logs that the spans/traces come in via GRPC. I was also able to send them to a remote machine so they definitely work during development.

But once I build a production container I will not receive any traces under any circumstances whatsoever.

Thanks to @iaraknes I found out about the OTEL_LOG_LEVEL flag and I noticed that the extension seems to ignore some otel environment variables in production containers.

They work for local next dev and even next build & next start but once running in the production container both env vars seem to be ignored.

The debug log seems to indicate that spans are still sent to localhost, even though I have set the correct endpoint and protocol via env vars at container runtime and, to investigate further, also during build:

_spanProcessor: t {
  _spanProcessors: [
  oJ {
  processors: [
  <ref *17> e {
  _exporter: ie {
  impl: <ref *1> ir {
  _sendingPromises: [],
  url: 'http://localhost:4318/v1/traces',
  shutdown: [Function: bound shutdown],
  _shutdownOnce: r {
  _callback: [Function: _shutdown],
  _that: [Circular *1],
  _isCalled: false,
  _deferred: e {
  _resolve: [Function],
  _reject: [Function],
  _promise: Promise {
  [Symbol(async_id_symbol)]: 124,
  [Symbol(trigger_async_id_symbol)]: 117,
  [Symbol(kResourceStore)]: <ref *16> e {
  _currentContext: Map(2) {
  Symbol(next.rootSpanId) => 0,
  Symbol(OpenTelemetry Context Key SPAN) => <ref *3> o {
  attributes: {
  next.span_name: 'GET /redacted/url',
  next.span_type: 'BaseServer.handleRequest',
  http.method: 'GET',
  http.target: '/redacted/url',
  next.rsc: false,
  http.status_code: 200,
  next.bubble: true,
  operation.name: 'next_js.BaseServer.handleRequest'
},

The Dockerfile looks like this (which is heavily based on https://github.com/vercel/next.js/blob/main/examples/with-docker/Dockerfile):

FROM node:lts-alpine AS base

FROM base AS builder
WORKDIR /app
COPY . .
RUN corepack enable && yarn install
RUN yarn build

FROM base AS runner
WORKDIR /app
ENV NODE_ENV production
RUN addgroup --system --gid 1001 nodejs
RUN adduser --system --uid 1001 nextjs
COPY --from=builder /app/public ./public
RUN mkdir .next
RUN chown nextjs:nodejs .next
COPY --from=builder --chown=nextjs:nodejs /app/.next/standalone ./
COPY --from=builder --chown=nextjs:nodejs /app/.next/static ./.next/static
USER nextjs
EXPOSE 3000
ENV PORT 3000
ENV HOSTNAME "0.0.0.0"
CMD ["node", "server.js"]

suniastar commented 2 months ago

So after a bit more digging around I found out that gRPC is actually not supported. If you use a slightly invalid configuration, it will fall back to the default, which is only indicated if you turn on OTEL_LOG_LEVEL=debug:

@opentelemetry/api: Registered a global for diag v1.9.0.
@vercel/otel: Configure propagator: tracecontext
@vercel/otel: Configure propagator: baggage
@vercel/otel: Configure sampler:  always_on
@vercel/otel: Configure trace exporter:  grpc http://localhost:4317/v1/traces headers: <none>
@vercel/otel: Unsupported OTLP traces protocol: grpc. Using http/protobuf.
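
The fallback visible in this log can be pictured with the sketch below. It is not the @vercel/otel source: the function name `resolveTracesProtocol` is hypothetical, and the set of accepted protocols (`http/protobuf` and `http/json`) is an assumption. Only the fallback to `http/protobuf` for `grpc` is taken from the log above.

```typescript
type OtlpHttpProtocol = "http/protobuf" | "http/json";

// Hypothetical re-creation of the behaviour seen in the debug log:
// an unsupported protocol such as "grpc" is replaced by "http/protobuf".
// Assumption: only the two HTTP-based OTLP protocols are accepted.
export function resolveTracesProtocol(
  requested: string | undefined,
  warn: (message: string) => void = console.warn,
): OtlpHttpProtocol {
  if (requested === undefined || requested === "") {
    return "http/protobuf"; // nothing configured, use the default
  }
  if (requested === "http/protobuf" || requested === "http/json") {
    return requested;
  }
  // Surface the fallback as a warning instead of a debug-only message.
  warn(`Unsupported OTLP traces protocol: ${requested}. Using http/protobuf.`);
  return "http/protobuf";
}
```

With this sketch, `resolveTracesProtocol("grpc")` returns `"http/protobuf"` and emits one warning, so an exporter pointed at the gRPC port 4317 would be speaking HTTP to it.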

Also, do not use the full path for OTEL_EXPORTER_OTLP_ENDPOINT, like http://localhost:4318/v1/traces. Only specify the endpoint up to the port, like http://localhost:4318. The path will be appended automatically even if it is already specified, resulting in a wrong config and missing traces.

All of this is not visible with the default log level, so maybe it would be a good idea to log something like this as a warning message instead of debug.
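
The endpoint rule described above matches how the OpenTelemetry exporter specification treats the two variables for OTLP/HTTP: the signal-specific `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT` is used as-is, while the generic `OTEL_EXPORTER_OTLP_ENDPOINT` is a base URL to which `/v1/traces` is appended. A minimal sketch of that rule follows; the names `OtlpEnv`, `resolveTracesUrl`, and `hasDuplicatedTracesPath` are hypothetical, and the code is not taken from @vercel/otel.

```typescript
interface OtlpEnv {
  OTEL_EXPORTER_OTLP_ENDPOINT?: string;
  OTEL_EXPORTER_OTLP_TRACES_ENDPOINT?: string;
}

// The signal-specific variable wins and is used unchanged; otherwise the
// base endpoint (default http://localhost:4318) gets "/v1/traces" appended.
export function resolveTracesUrl(env: OtlpEnv): string {
  const specific = env.OTEL_EXPORTER_OTLP_TRACES_ENDPOINT;
  if (specific) return specific;
  const base = env.OTEL_EXPORTER_OTLP_ENDPOINT || "http://localhost:4318";
  return base.replace(/\/+$/, "") + "/v1/traces";
}

// Hypothetical preflight check: true when the base endpoint already ends
// in the traces path, which would produce a doubled path after appending.
export function hasDuplicatedTracesPath(env: OtlpEnv): boolean {
  const base = env.OTEL_EXPORTER_OTLP_ENDPOINT;
  return base !== undefined && /\/v1\/traces\/?$/.test(base);
}
```

For example, a base of `http://localhost:4318` resolves to `http://localhost:4318/v1/traces`, whereas a base of `http://localhost:4318/v1/traces` resolves to `http://localhost:4318/v1/traces/v1/traces`, which no collector listens on.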

fadomire commented 2 weeks ago

Is it possible to add support for gRPC in the Node.js environment?