neo4j / graphql

A GraphQL to Cypher query execution layer for Neo4j and JavaScript GraphQL implementations.
https://neo4j.com/docs/graphql-manual/current/
Apache License 2.0
504 stars 149 forks source link

Query/Mutation params: null vs undefined #236

Open DomVinyard opened 3 years ago

DomVinyard commented 3 years ago

Behaviour is different when passing null or undefined to a query or mutation.

query($id: ID) {
  user(where:{id: $id}) {
    name
  }
}

In this example, if $id is null then no users are returned. if $id is undefined then all users are returned.

This is especially dangerous when dealing with mutations:

mutation ($name: String, $teamID: ID) {
  createUsers(
    input: {
      name: $name
      team: { connect: { where: { id: $teamID } } }
    }
  ) {
    users {
      id
    }
  }
}

If $teamID is null, this user will be connected to no teams (as expected). If $team ID is undefined this user will be connected to EVERY team! I guess if the database included millions of teams this would be enough to cause serious problems.

darrellwarde commented 3 years ago

Out of interest, what should the behaviour be here in your opinion?

I ask because null and undefined are obviously two very different argument values - the former usually being intentional, and the latter usually due to lack of assignment.

When we see a null argument value, we do some very intentional Cypher translation which is to replace any of our normal operators with an IS NULL or equivalent operator.

Based on the behaviour here, it sounds like any undefined argument values are essentially being stripped out of the Cypher, essentially leaving:

query($id: ID) {
  user(where:{ }) {
    name
  }
}

This would behave as you describe, returning all User values in the database... And honestly I'm on the fence as to whether or not this is even the incorrect behaviour.

Treating null and undefined argument values the same doesn't really make sense in my opinion, so curious to see what you had in mind for this.

DomVinyard commented 3 years ago

I would lean towards under-fetching here, assuming that the act of specifying where in the query necessarily implies the desire to filter (unless a condition is given which is explicitly all-inclusive). Especially with the mutation case. It's very to make a mistake with a wayward variable during development and throw loads of junk relationships into the database (I just did this myself).

mutation ($teamID: ID) {
  createUsers(
    input: {
      team: { connect: { where: { id: $teamID } } }
    }
  )
}

// team object is unintentionally undefined

// dangerous
const variables = { teamID: team?.id }

// safe
const variables = { teamID: team?.id || null }

Treating null and undefined argument values the same doesn't really make sense in my opinion

In pure semantic terms, I suppose "give me all records where id is null" would return all records with the id property as IS NULL, and "give me all records where id is undefined" would return all records without an id property present at all?.

In this case though, my initial instinct says the safer solution is to treat undefined as an invalid argument.

darrellwarde commented 3 years ago

I agree that the current behaviour is open to mistakes being made - an inconvenience in development, but a potential nightmare in production. I would definitely say two options here:

Curious to see is @danstarns has any thoughts on the above, or any alternative suggestions?

As a slight side note, was doing a bit of general research on this and came across this comment: https://github.com/graphql/graphql-js/issues/731#issuecomment-284978418

null is the undefined or None of GraphQL

This observation definitely fits into the second option above I think.

danstarns commented 3 years ago

Wouldnt it be a lovely world if all languages and runtimes had only one bottom value? Generally, I think null is the appropriate value to be used for filtering and undefined has no effect. We cannot actually do much about undefined as my example below shows:

const { makeExecutableSchema } = require("@graphql-tools/schema");
const { graphqlSync } = require("graphql");

const typeDefs = `
  input Input {
    here: String
    notHere: String
  }

  type Query {
    test(input: Input!): String
  }
`;

const resolvers = {
  Query: {
    test(root, { input }) {
      console.log(input);
      /*
        { here: 'I am here' } <----------------------------------------------------------------- 👀
      */
      return "test";
    },
  },
};

const schema = makeExecutableSchema({
  typeDefs,
  resolvers,
});

const query = `
  query($input: Input!) {
    test(input: $input)
  }
`;

const variableValues = {
  input: {
    here: "I am here",
    notHere: undefined,
  },
};

const response = graphqlSync({ schema, source: query, variableValues });

console.log(response);

Notice how the logged input object does not contain the notHere key. Meaning the undefined values are filtered out before it reaches our translation layer.

Finally, on the worry about a potential nightmare in production, I will suggest securing your API with the @auth directive to significantly reduce the surface area of attack.

darrellwarde commented 3 years ago

That's certainly an interesting observation that arguments with value undefined don't even reach the resolver - I wonder if they're in the ResolveInfo object? Nice example. 🏆

Perhaps there is something we can do with "empty where" arguments, and instead have the absence of that argument meaning "match everything" and throw and error if it's empty?

DomVinyard commented 3 years ago

That seems sensible at first blush. Is it as simple as "An empty input is an invalid input." Are there any instances where an empty input would be desirable?

darrellwarde commented 3 years ago

That seems sensible at first blush. Is it as simple as "An empty input is an invalid input." Are there any instances where an empty input would be desirable?

There are... This would be quite an API redesign, and it would change the expectations of a lot of users.

It's gonna require a bit of thought, and probably bigger fish to fry in the meantime.

jbhurruth commented 3 years ago

I just had the issue of accidentally updating hundreds of nodes with the same value after I failed to include a variable with a mutation client side.

At first I was annoyed that neither Apollo Client nor Server had caught the missing variable. Subsequent reading suggests this is avoidable by making the argument type required! in the mutation but unfortunately for me that means modifying Zeus to force require arguments in our type definitions since it provides no way to differentiate without hardcoding the mutation in the schema.

Still, it was interesting and I think this scenario reinforces the value of undefined being treated as missing since GraphQL is seemingly designed around the possibility.

Ideally for me this library would expose an override flag for enforcing variable existence when passing a where condition but I'm not sure of this is actually possible.

rcbevans commented 3 years ago

In my schema I resorted to writing my mutations manually, and created <Type>Value and Maybe<Type>Value input types where the input itself is optional, expressing omission, with a single value field which is either required, or optional of type Type, depending if the input is Maybe or not.

So a MaybeStringValue is

input MaybeStringValue {
    value: String
}

In my schema builder I just iterate over the scalar types I want to support, and generate required, maybe, and array variants.

That way the caller can express the semantic difference between wanting a value set to null, vs not specifying a value to change.