open-telemetry / opentelemetry-js

OpenTelemetry JavaScript Client
https://opentelemetry.io
Apache License 2.0
2.73k stars 794 forks source link

Tests failing in tip of master #1776

Closed dradetsky closed 3 years ago

dradetsky commented 3 years ago

What version of OpenTelemetry are you using?

the tip of master. Or a2304c9a82b77c762ae5df0f8f8621b4e2ab5ecb

What version of Node are you using?

% node -v
v15.4.0

Please provide the code you used to setup the OpenTelemetry SDK

npm install && npm run compile && npm test

What did you do?

What did you expect to see?

The tests passing.

What did you see instead?

  awsEksDetector
    on successful request
      ✓ should return an aws_eks_instance_resource
      ✓ should return a resource with clusterName attribute without cgroup file
      ✓ should return a resource with container ID attribute without a clusterName
      ✓ should return a resource with clusterName attribute when cgroup file does not contain valid Container ID
      ✓ should return an empty resource when not running on Eks
      ✓ should return an empty resource when k8s token file does not exist
      ✓ should return an empty resource when containerId and clusterName are invalid
    on unsuccesful request
      ✓ should throw when receiving error response code
      ✓ should return an empty resource when timed out
      1) should return an empty resource when timed out

  25 passing (1s)
  1 failing

  1) awsEksDetector
       on unsuccesful request
         should return an empty resource when timed out:
     Uncaught Error: Failed to load page, status code: 404
      at IncomingMessage.<anonymous> (src/detectors/AwsEksDetector.ts:79:55)
      at IncomingMessage.emit (node:events:376:20)
      at endReadableNT (node:internal/streams/readable:1295:12)
      at processTicksAndRejections (node:internal/process/task_queues:80:21)
      at runNextTicks (node:internal/process/task_queues:62:3)
      at processImmediate (node:internal/timers:436:9)

Additional context

  1. the "error" in the test is exactly the expected error we see in the test file
const expectedError = new Error('Failed to load page, status code: 404');

which suggests to me that either:

0.a. this is not how you're actually supposed to do assert-exception-is-thrown in this test framework, or

0.b. the tests have some non-automatic & undocumented dependency that i don't have (e.g. the author of the package has a more recent version of mocha on his personal machine, and that's what runs for him rather than the version the package explicitly depends on).

  1. If I attempt to work around this by doing
-    it('should return an empty resource when timed out', async () => {
+    xit('should return an empty resource when timed out', async () => {

In spite of the fact that I had only 1 error above, I now get

  awsEksDetector
    on successful request
      ✓ should return an aws_eks_instance_resource
      ✓ should return a resource with clusterName attribute without cgroup file
      ✓ should return a resource with container ID attribute without a clusterName
      ✓ should return a resource with clusterName attribute when cgroup file does not contain valid Container ID
      ✓ should return an empty resource when not running on Eks
      ✓ should return an empty resource when k8s token file does not exist
      ✓ should return an empty resource when containerId and clusterName are invalid
    on unsuccesful request
      ✓ should throw when receiving error response code
      - should return an empty resource when timed out

  24 passing (1s)
  1 pending

----------|---------|----------|---------|---------|-------------------
File      | % Stmts | % Branch | % Funcs | % Lines | Uncovered Line #s 
----------|---------|----------|---------|---------|-------------------
All files |       0 |        0 |       0 |       0 |                   
 index.ts |       0 |        0 |       0 |       0 |                   
----------|---------|----------|---------|---------|-------------------

lerna ERR! npm run test stderr:
/home/dmr/job/repo/bukkitz/vendor/opentelemetry-js/packages/opentelemetry-resource-detector-aws/node_modules/mocha/lib/runner.js:911
    throw err;
    ^

Error: EKS metadata api request timed out.

which is again the error we were trying to expect, not to trigger.

dyladan commented 3 years ago

We currently only test and officially support LTS versions of node. We also test on 8. Do you have a requirement to use node 15?

We should definitely fix this issue though and the test is actually broken. On my machine the detection never throws:

    it('should return an empty resource when timed out', async () => {
      const expectedError = new Error('Failed to load page, status code: 404');
      fileStub = sandbox
        .stub(AwsEksDetector, 'fileAccessAsync' as any)
        .resolves();
      readStub = sandbox
        .stub(AwsEksDetector, 'readFileAsync' as any)
        .resolves(correctCgroupData);
      getCredStub = sandbox
        .stub(awsEksDetector, '_getK8sCredHeader' as any)
        .resolves(k8s_token);
      const scope = nock('https://' + K8S_SVC_URL)
        .persist()
        .get(AUTH_CONFIGMAP_PATH)
        .matchHeader('Authorization', k8s_token)
        .reply(404, () => new Error());

      let thrown = false;
      try {
        await awsEksDetector.detect({
          logger: new NoopLogger(),
        });
      } catch (err) {
        thrown = true;
        assert.deepStrictEqual(err, expectedError);
      }

      assert.ok(thrown); // <= this assertion fails on my machine (v14)

      scope.done();
    });
dyladan commented 3 years ago

The name also mentions timeouts but the test doesn't appear to deal with any timing

dradetsky commented 3 years ago

@dyladan strictly speaking, we only have a hard requirement to use a version of node that supports async_hooks, so LTS should be okay. However, the package in question (and all the other packages I've seen) have

  "engines": {
    "node": ">=8.0.0"
  }

I studied math in college & I can confirm that 15.4.0 >= 8.0.0.

dradetsky commented 3 years ago

I'm not saying that's a bug or that you need to change your engines declarations. But a user could be forgiven for assuming you support 15.x.x releases.

dradetsky commented 3 years ago

@dyladan you said

The name also mentions timeouts but the test doesn't appear to deal with any timing

I'm guessing what's going on is that the above error is what occurs when there's a timeout in eks. It's not that the test involves timing, he's just having it trigger the error which would be triggered by a timeout immediately.

What (I'm guessing) the test is about is making sure that if this error occurs, the resource which we detect isn't in some incorrect garbage state, like if it got half the info & then timed out, the result should be empty, not with half the fields filled out. Or something like that (I don't really know the resource detector specifically).