open-telemetry / opentelemetry-js

OpenTelemetry JavaScript Client
https://opentelemetry.io
Apache License 2.0

Service Worker Spec #1214

Open BraunreutherA opened 4 years ago

BraunreutherA commented 4 years ago

Is your feature request related to a problem? Please describe.

Hi there, we're using service workers and Cloudflare Workers (which implement the service worker spec). Sadly, the browser build accesses APIs that are not available in workers, for example window.crypto or the Performance API. Would it be possible to create a platform bundle for workers?

Describe the solution you'd like

I would like to use a worker-specific bundle that doesn't use browser- or Node.js-specific APIs.

Describe alternatives you've considered

I tried the WebTracer and the Node.js tracer, and neither works. That wouldn't be the biggest problem, as I can use the BasicTracerProvider, trace manually, and build instrumentation for workers later on. But the core library still relies on Node.js- and browser-specific libraries, so no version is ready to use.

I can help with developing the bundle - but I first need to know if it's in the scope of the library.

bernielomax commented 3 years ago

I too just encountered this. It would be amazing if support could be added for service workers.

dyladan commented 3 years ago

As far as I'm aware, the window.crypto dependency has been removed in the latest build. We still depend on the Performance API to gather accurate timing data. Maybe we could factor out a Clock interface with a now method and use a performance-based implementation as the default. This would solve the issue without the need to publish an entirely new build for service workers.

@obecny any thoughts on this?

BraunreutherA commented 3 years ago

That'd be awesome! Cloudflare Workers also don't support the Performance API, sadly, but maybe there could be a fallback to Date.now() as a last resort. It's better to know what happens at millisecond resolution than not to know anything! :)
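
The Clock idea and the Date.now() fallback discussed above could look roughly like the sketch below. It is only an illustration: `createPerformanceClock`, `createDateClock`, and `selectClock` are hypothetical names, not part of any OpenTelemetry package, and the sketch assumes a clock is simply an object whose `now()` returns milliseconds since the Unix epoch.

```javascript
// Hypothetical Clock abstraction; none of these names exist in @opentelemetry/core.
// A clock is any object with a now() method returning milliseconds since the Unix epoch.

// Clock backed by the Performance API (higher resolution where the runtime allows it).
function createPerformanceClock(perf) {
  return { now: () => perf.timeOrigin + perf.now() };
}

// Last-resort clock with millisecond resolution.
function createDateClock() {
  return { now: () => Date.now() };
}

// Pick the best clock the given global object offers.
function selectClock(globalObj = globalThis) {
  const perf = globalObj.performance;
  if (perf && typeof perf.now === 'function' && typeof perf.timeOrigin === 'number') {
    return createPerformanceClock(perf);
  }
  return createDateClock();
}
```

An SDK built this way would call `selectClock()` once at startup and read timestamps only through the returned object, so a runtime without the Performance API would fall back to Date.now() without needing a separate build.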

sodabrew commented 3 years ago

Note that Cloudflare's Date.now() returns the same value for all calls servicing a given query in order to prevent timing attacks. Blog post with more information: https://blog.cloudflare.com/mitigating-spectre-and-other-security-threats-the-cloudflare-workers-security-model/

sodabrew commented 3 years ago

window.crypto is implemented a bit differently, maybe close enough to be compatible with a small shim? https://developers.cloudflare.com/workers/runtime-apis/web-crypto

jpettit commented 3 years ago

Any update on this?

dyladan commented 3 years ago

No update. The tight constraints of service workers, particularly their security considerations, make it very difficult to implement tracing. The performance API is removed specifically to disallow accurate timings. If someone was willing to contribute a solution or even a proposal I would be happy to consider it.

Grunet commented 2 years ago

Fwiw I've learned that Deno Deploy added support for the Performance API and the other usual timing mechanisms, but limited their resolution to mitigate the security concerns (more discussion around that in this issue). By contrast, Cloudflare's reasoning behind their opposite choice also lives here, in addition to the blog post shared above.

And perhaps tangential, but I was able to (sort of) do a proof of viability with otel-js on Deno Deploy (more details in this comment) that at least gave me more confidence that worker runtimes outside of the browser have the potential to work with OpenTelemetry.

dyladan commented 2 years ago

We've done some work recently to enable web workers, which I suspect improved this situation. I'm not sure whether the remaining work is documented anywhere.

legendecas commented 2 years ago

The sdk-trace-base and sdk-trace-web packages on the main branch (not yet published) should work in worker environments since https://github.com/open-telemetry/opentelemetry-js/pull/2719 landed. However, the web instrumentations (like fetch and xhr) and the exporters are still focused on the main-frame context (they use document-based APIs), so they may not work as expected. These will need a load of work.

jamesarosen commented 2 years ago

Is there a list of things left to do for service-worker support?

legendecas commented 2 years ago

> Is there a list of things left to do for service-worker support?

We don't have a concrete to-do list yet.

Please note that since environments like Cloudflare Workers disable precise clocks (even Date.now() is frozen), timing may still not work as expected even though the OpenTelemetry SDKs are compatible with the Worker API. This may affect span creation times, end times, and durations.

RichiCoder1 commented 2 years ago

I'm interested in this, and it's worth noting that on Cloudflare Workers Date.now() is more accurate than you'd think: it's frozen during code execution but still progresses normally across I/O operations (e.g. subrequests).

So it's still possible to capture a request's full execution time as long as a significant portion of the worker's execution time is spent doing I/O, which I'd argue is the normal case for bounded workers.
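
A minimal sketch of that approach, assuming a hypothetical helper named `timeAsync` (not an OpenTelemetry API): it reads the clock before and after an awaited operation, so on a runtime where Date.now() only advances across I/O the measured duration reflects the time spent waiting on that I/O.

```javascript
// Hypothetical helper, not part of any OpenTelemetry package.
// `now` is injectable so the helper can be exercised with a fake clock.
async function timeAsync(operation, now = Date.now) {
  const start = now();              // clock reading before the operation starts
  const result = await operation(); // awaited I/O, e.g. a subrequest
  return { result, durationMs: now() - start };
}
```

In a worker this might be used as `const { result, durationMs } = await timeAsync(() => fetch(url));`. Based on the behavior described above, purely CPU-bound work measured this way on Cloudflare Workers would be expected to report a duration close to zero.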

askoufis commented 2 years ago

We've hacked around the fact that performance isn't available within Cloudflare Workers with a little esbuild plugin. It replaces the built version of this file with a shim that seems to give decent duration measurements for I/O operations.

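// Replacement module for the browser performance platform file in @opentelemetry/core.
const performanceShim = `
  export const otperformance = {
    now: () => Date.now(),
    timeOrigin: Date.now()
  }
`;

// esbuild plugin that swaps the built performance module for the shim above.
const perfFixEsbuildPlugin = {
  name: 'fix-opentel-timing',
  setup(build) {
    build.onLoad(
      {
        // The dot is escaped so the filter matches the literal file name.
        filter:
          /@opentelemetry\/core\/build\/esm\/platform\/browser\/performance\.js/,
      },
      () => ({
        contents: performanceShim,
      }),
    );
  },
};

RichiCoder1 commented 2 years ago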

Is the above something that otel would take as an official contribution? I'd love to get otel working in edge environments.

legendecas commented 2 years ago

@RichiCoder1 you are welcome to contribute to the project. Please refer to the CONTRIBUTING doc for how to submit patches to the project.

RichiCoder1 commented 2 years ago

Just about every "edge" runtime is planning to provide a perf API except Cloudflare Workers, so I think the route I'm actually going to take here is to polyfill the Performance API for workers, as well as create an (initially external, PoC) CF Workers Trace SDK. There's enough of a difference between CF Workers and "web" that it doesn't make sense to overload the web-trace-sdk for that need.

RichiCoder1 commented 1 year ago

This is much belated, but would this project be amenable to adding a workerd (e.g. Cloudflare) export platform implementation?

With a recent change they've made, it should be possible to create an implementation of certain core primitives that works with Cloudflare (workerd).

More details on the export key: https://github.com/cloudflare/wrangler2/releases/tag/wrangler%402.9.0

The only downside is that this would add potentially redundant service-specific code to a core package, and I'm not sure if this project is fully set up to handle aspected builds.

Would be a great way to eliminate the hacks used above and here

More context: https://runtime-keys.proposal.wintercg.org/#workerd