mickhansen / graphql-sequelize

GraphQL & Relay for MySQL & Postgres via Sequelize
MIT License
1.9k stars 172 forks source link

Improve where #506

Closed Alexsey closed 7 years ago

Alexsey commented 7 years ago

This pull request addresses two issues:

  1. It was impossible to pass "where" as query variable. Now supported in form of JSON string
  2. It was impossible to pass parmeters inside "where" as query variables https://github.com/mickhansen/graphql-sequelize/issues/468. Now it can be done
mickhansen commented 7 years ago

I'll take a solid look at this but i'm generally learning towards less query magic as i definitely feel graphql should be more explicit in arguments accepted - graphql-sequelize will likely move towards being purely resolver helpers (i think attributeFields is slightly harmfull in the long term also, atleast it should be refactored so that fields must be defined)

codecov-io commented 7 years ago

Codecov Report

Merging #506 into master will increase coverage by 2.31%. The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #506      +/-   ##
==========================================
+ Coverage   92.72%   95.03%   +2.31%     
==========================================
  Files          11       11              
  Lines         371      383      +12     
==========================================
+ Hits          344      364      +20     
+ Misses         27       19       -8
Impacted Files Coverage Δ
src/resolver.js 98.03% <100%> (+0.47%) :arrow_up:
src/types/jsonType.js 61.9% <100%> (+40.85%) :arrow_up:
src/argsToFindOptions.js 94.73% <0%> (+5.26%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 911da47...77bb7f5. Read the comment docs.

janmeier commented 7 years ago

Adding official support makes users think that this is the correct way to do it (which I think it generally isn't). It would open users up to a plethora of data leaks.

In my opinion, this is similar to accepting SQL in a query parameter and using it without any sanitation

Alexsey commented 7 years ago

@janmeier but it doesn't adds anything that was not there before: "where" already could accept any values. I have just add support of query variables for it

mickhansen commented 7 years ago

Fair point @Alexsey, discussions regarding removing this functionality will be for the next major

mickhansen commented 7 years ago

v5.4.0

RichAyotte commented 7 years ago

In my opinion, this is similar to accepting SQL in a query parameter and using it without any sanitation

Ouch, I hope not. I thought Sequelize sanitized the parameters. Can you show an example injection using where in this way?

janmeier commented 7 years ago

Not in the "traditional" sense of injecting DROP TABLE users, but in the sense of leaking more data than the user might think.

Alice whats to allow Bob to look for users matching either idA or idB, so she exposes the following interface:

query($where: SequelizeJSON) {
  user(where: $where) {
    name
  }
}

Which Bob uses

user({ where: { id: { $or: [idA, idB] }}})

However, Eve comes along

user({ where: { id: { $gt: 0 }}})

And suddenly she loaded all users in the database, even though the endpoint only ever legimately needed to be used to load two users.

This is not a direct leak on information (hopefully Alice wrote some more checks to ensure that Bob and Eve are only allowed to load users they can actually access).

Explicitly adding arguments (in this case ids, perhaps with a limit of how many you can load) forces you to think about what your frontend actually needs to query on, and taking the corresponding precautions.

This is also slightly similar to

router.get('/user', () => {
  return User.findAll({
    where: {
      id: req.query.id
    }
  });
});

// /user?id=42 Gives one user
// /user?id={ $gte: 0 } - Gives all users, if json parsing is enabled for request params
RichAyotte commented 7 years ago

Access control is definitely more difficult with this kind of flexibility.

To guarantee that the permissions for each table have been defined and are applied to each query, this should probably be done.

new Sequelize(
    ...
    , {
        define: {
            hooks: {
                beforeFind
            }
        }
    }
)

function beforeFind(options) {
    if (!options) {
        throw new Error(`Options missing from query.`)
    }

    if (!options.user) {
        throw new Error(`User has insufficient privileges.`)
    }

    if (!this.permissions) {
        throw new Error(`Permissions for ${this.name} are not defined. Access Denied.`)
    }

    // Get the user role and reduce query scope if required.
    const scopedOptions = optionsToScopedOptions({
        permission: scopeOptionToPermission(options.scope)
        , options
    })
    return scopedOptions
}
janmeier commented 7 years ago

That seems like a good pattern for solving this pattern generally, and making sure it's applied to all queries

RichAyotte commented 7 years ago

One last thing. I think that this type of guarding should be documented.

I ran into a out of memory issue today.

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory

<--- Last few GCs --->

[31179:0x4c8226f590]   271959 ms: Mark-sweep 1410.2 (1475.5) -> 1410.1 (1459.5) MB, 5952.4 / 0.0 ms  (+ 0.0 ms in 0 steps since start of marking, biggest step 0.0 ms, walltime since start of marking 5953 ms) last resort 
[31179:0x4c8226f590]   277766 ms: Mark-sweep 1410.1 (1459.5) -> 1410.1 (1459.5) MB, 5806.6 / 0.0 ms  last resort 

<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 0x160fc859bbd9 <JS Object>
    1: _then [/home/rich/Projects/omn/backend/node_modules/bluebird/js/release/promise.js:~219] [pc=0x384d43e742ff](this=0x11a9b8b68669 <a Promise with map 0x32ce334e49c1>,didFulfill=0x1a66609d08a1 <JS Function (SharedFunctionInfo 0x2a56be6fde1)>,didReject=0x160fc8582241 <undefined>,_=0x160fc8582241 <undefined>,receiver=0x160fc8582241 <undefined>,internalData=0x160fc8582241 <undefined>)

Had a bunch of queries with 40,000 and 50,000 digits in the IN() clause.

SELECT *
FROM `ProjectServiceAddresses` AS `ProjectServiceAddresses`
WHERE `ProjectServiceAddresses`.`project_service_id`
IN (101, 102 ... (40,000 INTs) ... 55338, 55339)
ORDER BY `ProjectServiceAddresses`.`id` ASC;

I ended up adding this guard in the beforeFind hook.

// Enforce a limit of 1000 rows and log when it's exceeded.
if (!options.limit || options.limit > 1000) {
  options.limit = 1000
  log.warn(`SQL limit clause missing or exceepds ${options.limit}. Forcing to ${options.limit}.`)
  log.debug(options)
}

It's the same defense strategy as the malicious where clause.