w3c / json-ld-syntax

JSON-LD 1.1 Specification
https://w3c.github.io/json-ld-syntax/

JSON-LD Context similarity to XML External Entity attack #421

Open hmottestad opened 8 months ago

hmottestad commented 8 months ago

An XML External Entity (XXE) attack is a type of attack against an application that parses XML input. It occurs when XML input containing a reference to an external entity is processed by a weakly configured XML parser. It may lead to the disclosure of confidential data, denial of service, server-side request forgery, port scanning from the perspective of the machine where the parser is located, and other system impacts.

JSON-LD supports referencing a context stored online or on the local file system. This seems to allow for the same attack surface as with XML External Entity.
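To make the attack surface concrete, here is a minimal sketch (not taken from any processor) that lists every URL a JSON-LD processor would be asked to dereference for a given document. The helper names `iter_context_refs` and `_refs_from_context` and the sample document are hypothetical; the sketch only looks at string-valued `@context` and `@import` entries.

```python
from typing import Any, Iterator


def _refs_from_context(ctx: Any) -> Iterator[str]:
    """Yield URL references from a single @context or @import value."""
    if isinstance(ctx, str):
        # A string context is a reference the processor must dereference.
        yield ctx
    elif isinstance(ctx, list):
        for item in ctx:
            yield from _refs_from_context(item)
    elif isinstance(ctx, dict):
        # An embedded context may itself contain @import or scoped contexts.
        yield from iter_context_refs(ctx)


def iter_context_refs(node: Any) -> Iterator[str]:
    """Walk a parsed JSON-LD document and yield every context reference."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key in ("@context", "@import"):
                yield from _refs_from_context(value)
            else:
                yield from iter_context_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_context_refs(item)


# Hypothetical input mixing an ordinary context with two suspicious ones.
doc = {
    "@context": [
        "https://example.org/ctx.jsonld",
        "file:///etc/passwd",
        {"@import": "http://169.254.169.254/latest/meta-data/"},
    ],
    "name": "example",
}

refs = list(iter_context_refs(doc))
```

Whoever supplies the document controls every entry in `refs`, which is the parallel with an external entity declaration in XML.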

See: https://owasp.org/www-community/vulnerabilities/XML_External_Entity_(XXE)_Processing

hmottestad commented 8 months ago

Also posted here: https://github.com/json-ld/json-ld.org/issues/825

davidlehn commented 8 months ago

What is the request here? Did you want more spec text about this sort of issue? Or some sort of test coverage? If so, can you open a PR with a suggestion?

Practically speaking, it may be difficult to construct a real attack with @context or @import. You'd have to know the implementation was vulnerable to this issue and point to a resource that could be parsed as JSON; that JSON would have to be interpreted as a context, and the processor would then somehow have to process data in a nefarious way and return it. I had trouble coming up with a theoretical situation where this was more than a DoS attack when I wrote up https://github.com/digitalbazaar/jsonld-cli#security-considerations. I did add some related scheme restrictions to that tool just in case. I'm not saying this isn't an issue, but I'd like to see an example of a real attack, even if it is movie-plot style and not realistic.
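A scheme restriction of the kind mentioned above could look roughly like the following. This is a sketch under assumptions, not the check that jsonld-cli performs: the name `check_scheme` and the choice to allow only `https` are illustrative.

```python
from urllib.parse import urlsplit

# Assumption: only HTTPS contexts are acceptable. A deployment might also
# allow "http", but "file" and other local schemes stay excluded.
ALLOWED_SCHEMES = frozenset({"https"})


def check_scheme(url: str) -> str:
    """Return the URL unchanged if its scheme is allowed, else raise."""
    scheme = urlsplit(url).scheme.lower()
    if scheme not in ALLOWED_SCHEMES:
        # Report only the scheme, never the contents of whatever was fetched.
        raise ValueError(f"refusing to load context with scheme {scheme!r}")
    return url
```

With this in place, `check_scheme("file:///etc/passwd")` raises `ValueError` before any file is opened, and a bare path such as `/etc/passwd` is rejected too because its scheme is empty.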

hmottestad commented 8 months ago

I was just thinking about this while implementing support for JSON-LD 1.1 in RDF4J. I know it's an issue when processing XML files and that the recommendation is to disable XXE altogether.

The only way I know of that an XXE attack can disclose confidential data is when the contents of the data are included in an error message. I would assume that the same would be the case for (some) JSON-LD parsers.

For RDF4J I was considering disabling fetching of remote/local contexts and imports as the default behaviour, but I would assume that would not make us compliant with the specification.
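As an illustration of that "no fetching by default" idea (in Python rather than RDF4J's Java, and not RDF4J's actual design), a loader can serve contexts only from a bundle shipped with the application. The names `make_static_loader` and `ContextNotBundled` are hypothetical, and the returned dictionary is only loosely modeled on the RemoteDocument structure in the JSON-LD API specification.

```python
import json
from typing import Callable, Mapping


class ContextNotBundled(Exception):
    """Raised when a document references a context that was not pre-approved."""


def make_static_loader(bundled: Mapping[str, str]) -> Callable[[str], dict]:
    """Build a loader that never touches the network or the file system.

    `bundled` maps context URLs to their JSON text, shipped with the app.
    """

    def load(url: str) -> dict:
        if url not in bundled:
            # Deliberately generic message: nothing fetched, nothing leaked.
            raise ContextNotBundled("context is not in the local bundle")
        return {
            "documentUrl": url,
            "document": json.loads(bundled[url]),
            "contextUrl": None,
        }

    return load


# Hypothetical bundle with a single approved context.
loader = make_static_loader(
    {"https://example.org/ctx.jsonld": '{"@context": {"name": "https://schema.org/name"}}'}
)
result = loader("https://example.org/ctx.jsonld")
```

Users who need remote contexts would then opt in explicitly by supplying a different loader.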

gkellogg commented 8 months ago

This may relate to #108. Most of the issues considered so far around remote context access concern reducing network bandwidth and being able to verify that the result conforms to expectations. The availability of a document loader mitigates the danger of accessing inappropriate resources or of getting back injected results, which is important in some contexts.

hmottestad commented 8 months ago

I've been pointed towards the DocumentLoader interface as the place to implement the security controls.

VladimirAlexiev commented 8 months ago

@davidlehn I've been talking to Havard. Initially I was complacent and said "this would be the fault of the JSON-LD processor", "that would be the fault of a sysadmin", etc. But now I think his concerns are quite real. Some bad URLs as described in XXE can cause damage, even if they don't return JSON.

@gkellogg I guess the issue is to come up with some best practices (good options and defaults) to be used by a security-conscious documentLoader, and if such arise, to add wording to the spec to point to them.
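One candidate default for such a loader is to resolve the host first and refuse loopback, private, and link-local addresses, since those are the usual targets of server-side request forgery. The sketch below is an assumption about what a best practice might look like, not wording from the spec; `is_public_ip`, `check_host`, and the injectable `resolve` parameter are hypothetical names.

```python
import ipaddress
import socket
from typing import Callable, List
from urllib.parse import urlsplit


def is_public_ip(address: str) -> bool:
    """True unless the address is loopback, private, link-local, or similar."""
    ip = ipaddress.ip_address(address.split("%")[0])  # drop any IPv6 zone id
    return not (
        ip.is_loopback
        or ip.is_private
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def _system_resolve(host: str) -> List[str]:
    """Resolve a host name with the system resolver."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def check_host(url: str, resolve: Callable[[str], List[str]] = _system_resolve) -> List[str]:
    """Return the resolved addresses if all are public, else raise ValueError."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError("context URL has no host")
    addresses = resolve(host)
    if not addresses or not all(is_public_ip(a) for a in addresses):
        raise ValueError("context host resolves to a non-public address")
    return addresses
```

A check like this is only a first line of defence: the loader should connect to the address it validated, since a second DNS lookup could return a different answer, and it would normally be combined with a scheme restriction, a timeout, and a response size limit.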

When I'm home, I'll cite the relevant sections from the Syntax and API specs.