APIDevTools / swagger-parser

Swagger 2.0 and OpenAPI 3.0 parser/validator
https://apitools.dev/swagger-parser
MIT License
1.11k stars 156 forks source link

Insecure Default Resolver Behavior #222

Open keymandll opened 2 years ago

keymandll commented 2 years ago

I have noticed that by default the library (tested with version 10.1.0) is willing to resolve just any files pointed to by $ref. This can be a problem if the library is used by something that:

  1. Is a backend service
  2. Processes OpenAPI documents from untrusted sources, AND
  3. Performs actual requests based on the processed OpenAPI document

For example, given the below OpenAPI document

openapi: 3.0.2
info:
  title: Swagger Petstore - OpenAPI 3.0
  version: 1.0.11
servers:
  - url: /api/v3
paths:
  /pet:
    put:
      summary: Update an existing pet
      description: CHECK THE PET OBJECT leak PROPERTY!
      operationId: updatePet
      requestBody:
        description: Update an existent pet in the store
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/Pet'
        required: true
components:
  schemas:
    Pet:
      required:
        - name
        - photoUrls
      type: object
      properties:
        id:
          type: integer
          format: int64
          example: 10
        leak:
          type: string
          default: 
            $ref: '/etc/passwd'
        name:
          type: string
          example: doggie
      xml:
        name: pet

If we just do the below

import SwaggerParser from '@apidevtools/swagger-parser';

const documentSource = './document-shown-above.yml';
const doc = await SwaggerParser.dereference(documentSource);
console.log(doc.paths['/pet'].put.requestBody.content['application/json'].schema);

We will get (showing just a snippet of the processed document to keep it short):

{
  required: [ 'name', 'photoUrls' ],
  type: 'object',
  properties: {
    id: { type: 'integer', format: 'int64', example: 10 },
    leak: {
      type: 'string',
      default: 'nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false root:*:0:0:System Administrator:/var/root:/bin/sh daemon:*:1:1:System Services:/var/root:/usr/bin/false _uucp:*:4:4:Unix to Unix Copy Protocol:/var/spool/uucp:/usr/sbin/uucico _taskgated:*:13:13:Task Gate Daemon:/var/empty:/usr/bin/false _networkd:*:24:24:Network Services:/var/networkd:/usr/bin/false _installassistant:*:25:25:Install Assistant:/var/empty:/usr/bin/false _lp:*

I have checked the different configuration options available and I found that things can be somewhat mitigated by configuring the file resolver as shown below.

const doc = await SwaggerParser.dereference(documentSource, {
    resolve: {
        file: {
            canRead: ['.yml', '.json']
        }
    }
});
console.log(doc.paths['/pet'].put.requestBody.content['application/json'].schema);

In this case, the library is only willing to work with yml and json files, giving this result:

SyntaxError: Unable to resolve $ref pointer "/etc/passwd"
    at onError (/Users/keymandll/DEVEL/node_modules/@apidevtools/json-schema-ref-parser/lib/parse.js:85:20) {
  toJSON: [Function: toJSON],
  [Symbol(nodejs.util.inspect.custom)]: [Function: inspect]
}

At the same time, it is still possible to get any JSON or YAML files opened on the file system. Someone malicious could use this to gain access to sensitive data (e.g. credentials) in JSON or YAML files if:

As can be seen, there are plenty of ifs, but depending on the use case things can get problematic.

My recommendation would be (without intimately being familiar with the library) to:

  1. Have the options.resolve.file.canRead set to ['.yml', '.yaml', '.json'] by default. I mean, this should be the default behavior.
  2. Would be nice to have something like options.resolve.file.basePaththat allows defining a path from which the library is willing to load/resolve files. The library would refuse to load/resolve any files outside of the specified directory or directories.
philsturgeon commented 1 year ago

Anyone got a pull request? This library has been going for donkeys years without this, so that suggests people are sanetizing their own inputs, but if you need the library to make it safe by default I'd really appreciate a pull request.

I make absolutely nothing from this and its taking time out of tree planting or paid work, time I don't have.

keymandll commented 1 year ago

The way we mitigated it (canRead: ['.yml', '.json'] mentioned above) works perfectly fine for us, given there are no other yml or json documents with sensitive info around, just the ones we feed to the library for processing. Similarly to you, I do not have much time, so I thought I'd only put together a pull request if there was significant demand. It seems so far no one cared. One thing I could do in the upcoming days is to submit a PR with an updated README to let people know about this behaviour.

P.s.: Thanks a lot for this library.

philsturgeon commented 1 year ago

That would be great! Letting people know they can do that is handy, and it avoids the issue of disabling it by default and causing issues for people who $ref URLs that might not have extensions in (I've seen that done).

keymandll commented 1 year ago

The related pull request: https://github.com/APIDevTools/swagger-parser/pull/236