digitalbazaar / jsonld.js

A JSON-LD Processor and API implementation in JavaScript
https://json-ld.org/
Other
1.66k stars 195 forks source link

Resolving remote context: "Dereferencing a URL" error [Regression >= 5.0.0] #451

Open gregid opened 3 years ago

gregid commented 3 years ago

After upgrading to jsonld >= 5.0.0 (away from using request library anywhere) I've got the following error:

    jsonld.InvalidUrl: Dereferencing a URL did not result in a valid JSON-LD object. Possible causes are an inaccessible URL perhaps due to a same-origin policy (ensure the server uses CORS if you are using client-side JavaScript), too many redirects, a non-JSON response, or more than one HTTP Link Header was provided for a remote context.

      at ContextResolver._fetchContext (node_modules/jsonld/lib/ContextResolver.js:173:13)
      at ContextResolver._resolveRemoteContext (node_modules/jsonld/lib/ContextResolver.js:117:34)
      at ContextResolver.resolve (node_modules/jsonld/lib/ContextResolver.js:50:22)
      at Object.<anonymous>.api.process (node_modules/jsonld/lib/context.js:65:20)
      at Function.jsonld.compact (node_modules/jsonld/lib/jsonld.js:172:21)

for the following jest test:

const jsonld = require('jsonld')

const context = [
  'https://www.w3.org/ns/activitystreams',
  'https://w3id.org/security/v1'
]
const obj = { 
  '@id': 'https://example.org/personID',
  '@type': 'https://www.w3.org/ns/activitystreams#Person',
  'https://w3id.org/security#publicKey': {
    '@id': 'https://example.org/personID#main-key',
    'https://w3id.org/security#owner': 'https://example.org/personID'
  }
}
const opts = {
  compactArrays: false,
  documentLoader: jsonld.documentLoaders.node()
}

it('jsonld.compact activitystreams+security [promise]', function () {
  expect.assertions(1);

  return jsonld.compact(obj, context, opts).then((result) => {
    // console.log('result:', JSON.stringify(result, null, ' '))
    expect(result).toStrictEqual({
      '@context': [
        'https://www.w3.org/ns/activitystreams',
        'https://w3id.org/security/v1'
      ],
      '@graph': [
        {
          id: 'https://example.org/personID',
          type: 'Person',
          publicKey: [
            {
              id: "https://example.org/personID#main-key",
              "sec:owner": [ "https://example.org/personID",],
            },
          ],
        }
      ]
    })
  }).catch((error) => {
    console.error(error)
    throw error
  })
})

This test passes in jsonld versions 3.x and 4.x *

The comment in https://github.com/digitalbazaar/jsonld.js/blob/master/lib/documentLoaders/node.js#L180 suggests that it now considers redirects as errors. If that's the case it limits the usability of the jsonld.documentLoaders.node() It seems this is just a comment on ky not the loader itself as the rest of the code suggest it should deal with redirects.

* with some caveats related to the multiple use of request in jest: immers-space/activitypub-express/issues/40 which was the main motivator for upgrade

Nebulis commented 3 years ago

Hi,

I faced the same issue and dug into the code to find that the following error is thrown:

TypeError: httpClient.get is not a function

I noticed that httpClient is a node proxy and for some reason the proxy does not work with jest >= 25. Jest 24.9.0 is fine.

Nebulis commented 3 years ago

There is probably a compatibility issue somewhere with esm library

dmitrizagidulin commented 3 years ago

Hi @gregid, @Nebulis -- you're absolutely right, the issue turns out to be with the esm library; it has problems working with Jest (and some React Native bundlers). We're investigating a way to move away from it for this reason.

gregid commented 3 years ago

@Nebulis @dmitrizagidulin thanks for looking into this!

Nebulis commented 2 years ago

Hi,

Would you consider moving back and not use an ESM only library? So far this change broke almost everywhere in our codebase:

I know that many people in the community (me included) are excited about ESM. I even tried to update our libraries and support ESM only. But turns out it was a mistake and the community is not ready for that.

I wrote about every issue I encountered with ESM in our libraries => https://github.com/Open-Attestation/open-attestation/pull/197

The community is simply not ready for ESM only libraries.

Right now, I consider forking your library and get rid of the issue. But to be fair if I can avoid the burden of maintaining a fork, I would be more than happy.

inukshuk commented 2 years ago

I'm running into the same error, but in my case this is definitely caused because of redirects and not an issue related to ESM or packaging. I'm posting this here first, just in case, but if the OP is indeed caused by packaging, this is a separate issue.

Having said that, I'm running into the error when the node documentLoader encounters a redirect. This is because ky, or ultimately fetch, is called with redirect set to manual mode. This causes an opaqueredirect response when a redirect is encountered; notably, this response has a status of zero and no body or headers. Specifically, there is no Location header, which the redirect handling in documentLoader seems to depend on. I can only speculate, but to me it looks like a wrong assumption in the document loader's redirect handling about the 'redirect manual' parameter.

Edit: Just to confirm, if I set redirect to follow the error is fixed (but bypassing the redirect code in document loader, since ky/fetch follows the redirect).

fatadel commented 1 year ago

Is there any update for this issue? Any workarounds or fixes? jsonld versions < 5 have vulnerabilities related to xmldom dependencies. At the same time, we cannot update jsonld to versions >= 5 because of this issue.

Steps to reproduce in our case:

  1. Take this W3C Thing Description (which is essentially a JSON-LD document)
const td = {
  "id":"urn:simple",
  "@context":"https://www.w3.org/2022/wot/td/v1.1",
  "title":"MyLampThing",
  "description":"Valid TD copied from the spec's first example",
  "securityDefinitions":{
    "basic_sc":{
      "scheme":"basic",
      "in":"header"
    },
    "basic_scd":{
      "scheme":"basic",
      "in":"header"
    }
  },
  "security":[
    "basic_sc"
  ],
  "properties":{
    "status":{
      "type":"string",
      "forms":[
        {
          "href":"https://mylamp.example.com/status"
        }
      ]
    }
  },
  "actions":{
    "toggle":{
      "forms":[
        {
          "href":"https://mylamp.example.com/toggle"
        }
      ]
    },
    "toggled":{
      "forms":[
        {
          "href":"https://mylamp.example.com/toggle"
        }
      ]
    }
  },
  "events":{
    "overheating":{
      "data":{
        "type":"string"
      },
      "forms":[
        {
          "href":"https://mylamp.example.com/oh",
          "subprotocol":"longpoll"
        }
      ]
    }
  }
}
  1. Run jsonld.toRDF(td, {format: 'application/nquads'})

  2. And now the interesting thing - step 2 runs successfully in a "usual" node environment but fails with jest (we use version 28.1.3). The error we get is

> jsonld.InvalidUrl: Dereferencing a URL did not result in a valid JSON-LD object. Possible causes are an inaccessible URL perhaps due to a same-origin policy (ensure the server uses CORS if you are using client-side JavaScript), too many redirects, a non-JSON response, or more than one HTTP Link Header was provided for a remote context.
fatadel commented 1 year ago

Is there any update for this issue? Any workarounds or fixes? jsonld versions < 5 have vulnerabilities related to xmldom dependencies. At the same time, we cannot update jsonld to versions >= 5 because of this issue.

Steps to reproduce in our case:

  1. Take this W3C Thing Description (which is essentially a JSON-LD document)
const td = {
  "id":"urn:simple",
  "@context":"https://www.w3.org/2022/wot/td/v1.1",
  "title":"MyLampThing",
  "description":"Valid TD copied from the spec's first example",
  "securityDefinitions":{
    "basic_sc":{
      "scheme":"basic",
      "in":"header"
    },
    "basic_scd":{
      "scheme":"basic",
      "in":"header"
    }
  },
  "security":[
    "basic_sc"
  ],
  "properties":{
    "status":{
      "type":"string",
      "forms":[
        {
          "href":"https://mylamp.example.com/status"
        }
      ]
    }
  },
  "actions":{
    "toggle":{
      "forms":[
        {
          "href":"https://mylamp.example.com/toggle"
        }
      ]
    },
    "toggled":{
      "forms":[
        {
          "href":"https://mylamp.example.com/toggle"
        }
      ]
    }
  },
  "events":{
    "overheating":{
      "data":{
        "type":"string"
      },
      "forms":[
        {
          "href":"https://mylamp.example.com/oh",
          "subprotocol":"longpoll"
        }
      ]
    }
  }
}
  1. Run jsonld.toRDF(td, {format: 'application/nquads'})
  2. And now the interesting thing - step 2 runs successfully in a "usual" node environment but fails with jest (we use version 28.1.3). The error we get is
> jsonld.InvalidUrl: Dereferencing a URL did not result in a valid JSON-LD object. Possible causes are an inaccessible URL perhaps due to a same-origin policy (ensure the server uses CORS if you are using client-side JavaScript), too many redirects, a non-JSON response, or more than one HTTP Link Header was provided for a remote context.

I didn't notice that xmldom-related vulnerability issues were already solved in jsonld version 4, so we've updated to that one (which works just fine). However, this jest-related issue still refrains us from updating to versions >= 5.

davidlehn commented 1 year ago

One problem seems to be in lib/documentLoaders/node.js, at the bottom where _fetch() catches errors and sets a cause: e. That error is somehow dropped when it pops up the stack. So you don't even see the real problem, just the wrapper. I didn't figure that issue out yet, but if you just print the error there, then Jest tells you what is wrong:

Error: You need to run with a version of node that supports ES Modules in the VM API. See https://jestjs.io/docs/ecmascript-modules  
        at invariant (/.../node_modules/jest-runtime/build/index.js:2448:11)
        at importModuleDynamically (/.../node_modules/jest-runtime/build/index.js:1683:11)
        ...

I didn't read https://jestjs.io/docs/ecmascript-modules beyond the suggestion to try a node option, and with that the original test code above worked fine (adjust to your local testing setup):

NODE_OPTIONS=--experimental-vm-modules npm t
davidlehn commented 1 year ago

Seems the error chain is too deep and just prints out some deep details as [Object]. You can see the error in the test catch handler by using something like this instead:

    console.error(require('util').inspect(error, {depth: 100}));                 
marcvanandel commented 1 year ago

This can be reproduced by altering the node-document-loader-tests.js:83 and add .status(304) to the response:

_app.get('/data/:id', _store, (req, res) => {
  res.status(304).json(_data[req.params.id].content);
});

To me the issue is with the handling of http status codes in lib/documentLoaders/node.js:127 where a 304 is handled similarly as a 302 ... but a 304 means 'not modified on the server side' (see mozilla http 304 documentation)

This 'bug' (feature?) leads to an issue with loading the json-ld context of the ed25519-signature-2020 because this returns a 304 as well 😏

davidlehn commented 1 year ago

@marcvanandel I think your comment may be mixing issues? The other chat here is more related to ESM usage. You might want to create a new issue just for redirection issues.

It does look like the default node document loader could use some improvements to better handle different result codes. I'm not sure how 304 would ever be returned since the loader is not sending cache related headers.

inukshuk commented 1 year ago

I think my comment above might have been lost in the noise here, but there is definitely an issue with the document loader that is not related to ESM usage. As is, redirects will simply not work.

marcvanandel commented 1 year ago

I've created #499 for the handling of 304s better

tobiasschweizer commented 1 year ago

One problem seems to be in lib/documentLoaders/node.js, at the bottom where _fetch() catches errors and sets a cause: e. That error is somehow dropped when it pops up the stack. So you don't even see the real problem, just the wrapper. I didn't figure that issue out yet, but if you just print the error there, then Jest tells you what is wrong:

Error: You need to run with a version of node that supports ES Modules in the VM API. See https://jestjs.io/docs/ecmascript-modules  
        at invariant (/.../node_modules/jest-runtime/build/index.js:2448:11)
        at importModuleDynamically (/.../node_modules/jest-runtime/build/index.js:1683:11)
        ...

I didn't read https://jestjs.io/docs/ecmascript-modules beyond the suggestion to try a node option, and with that the original test code above worked fine (adjust to your local testing setup):

NODE_OPTIONS=--experimental-vm-modules npm t

Hi there,

Any updates on this? Were you able to find a working setup with jest and jsonld? We started having this issue (jest and ESM) when switching from jsonld 5.2.0 to 8.1.0. We had another issue with jest when using node v18.14.2 (because of the jsonld 5.2.0 dependency node-fetch 3.0.0-beta.9).

I remember that I faced an issue with jsonld before but I am not sure if this could be related.

Any updates or explanations are highly appreciated. Thanks a lot!