Less strict validation of input objects

glan commented 8 years ago

I'm wondering if we really need raise an error wen additional field is passed as part of an input object.

see: https://github.com/graphql/graphql-js/blob/master/src/utilities/isValidJSValue.js#L67-L72

Any additional fields could just be ignored or dropped and warnings rather than errors produced.

leebyron commented 8 years ago

The primary case we're looking to highlight here is accidental typos, which happen time to time and can be hard to track down, especially if they're used for a variable or input field that's only occasionally used.

If you have the query query ($foo: String) { field(arg: $foo) } and the variables { "fooo": "abc" }, we currently flag this as an error, but we could potentially miss this typo if we did not raise errors.

glan commented 8 years ago

@leebyron Sorry I think you misunderstood my suggestion. I'll try and explain in more detail. This error is raised when type checking the contents of a var with input schema types, for example if the following schema is created:

new GQL.GraphQLInputObjectType({
  name: 'CoordsInput',
  fields: {
    lat: { type: new GQL.GraphQLNonNull(GQL.GraphQLFloat) },
    long: { type: new GQL.GraphQLNonNull(GQL.GraphQLFloat) }
  }
});

with this query: query ($coords: CoordsInput) { field(coords: $coords) }

and the input variable:

{
  "coords": {
    "lat": 37.5,
    "long": -122.4,
    "alt": 105.7
  }
}

Then that results in an Unknown field error since alt is not in the schema. I believe that we shouldn't throw an error here, or at least have some kind of flag which suppresses this error.

The reason I highlight this issue is that I think it's a case the robustness principle:

Be conservative in what you send, be liberal in what you accept.

https://en.wikipedia.org/wiki/Robustness_principle

Which seems to me what GraphQL embraces.

Cheers

leebyron commented 8 years ago

I'm open to making a change like you suggest, but I want to make sure we cover all the ramifications.

The robustness principle is a good one, however the competing issue is avoiding mistakes.

The example you provide might be a good example of this, a possible misinterpretation by the client that the server supports an "alt" property. The client might write code which assumes that this property is being used, and then express surprise when they find the results seem to disregard it and assume it's a bug from the server when in fact it wasn't supported in the first place.

Another possible mistake that I think is far more likely to occur involves a slight modification to your example:

new GraphQLInputObjectType({
  name: 'CoordsInput',
  fields: {
    lat: { type: GraphQLFloat, defaultValue: 48.8584 },
    long: { type: GraphQLFloat, defaultValue: 2.2945 }
  }
});

{
  "coords": {
    "lat": 37.5,
    "lon": -122.4,
  }
}

In this case (a bit contrived since I only modestly modified your example) a input object field is misspelled, that input object field is nullable so it uses it's default value as it wasn't provided and the user finds themselves in the middle of the Mediterranean due to their typo.

leebyron commented 8 years ago

A different example in support of the robustness principle is "push safety". Consider the case that a server is adding a field to an input object:

Server at revision R:

new GraphQLInputObjectType({
  name: 'CoordsInput',
  fields: {
    lat: { type: GraphQLFloat },
    lon: { type: GraphQLFloat }
  }
});

Server at revision R+1:

new GraphQLInputObjectType({
  name: 'CoordsInput',
  fields: {
    lat: { type: GraphQLFloat },
    lon: { type: GraphQLFloat },
    alt:  { type: GraphQLFloat },
  }
});

If a client is on a different deploy cycle (ex. an iOS client), when should/can it deploy code which uses the alt input object field? In the current spec, it must wait until the R+1 server has been fully deployed. In this way, the client is not "backwards compatible" in that it cannot send this newer query to a server that still has revision R. If we allow the runtime execution of submitted values to include fields which are not defined, then an iOS client could deploy this new change before the R+1 server is deployed, knowing that the change would simply be ignored by server R but that the new behavior would be immediately enabled when it connects to a server with rev R+1.

glan commented 8 years ago

Funny that lon has sometimes gotten used instead of long (since we have quite a few parts in our system). In our case we are using GQL.GraphQLNonNull to enforce that those two fields are present.

leebyron commented 8 years ago

I'm curious to get @dschafer's opinion on this.

glan commented 8 years ago

To give you a little more background: We have complex object (named: interpretation) which created by one system (a) and then passed into the our GraphQL query as variable. Which in turn gets passed on to another system (b) for use.

We have a schema in place that describes what this interpretation input object should look like. However if a new property gets added to the object by system (a) we don't what the GraphQL query to fail.

leebyron commented 8 years ago

Thanks, that's good info to know. Definitely supports the desire to be more robust to input.

dschafer commented 8 years ago

Hm, I'm still not convinced we should abandon the strictness principle here. I can't think of a case where a client was forced to violate strictness; because we have introspection, a client can always guarantee that it forms the right input shape, and I think it should: the client knows better how to convert an input to the correct shape than the server does.

To give you a little more background: We have complex object (named: interpretation) which created by one system (a) and then passed into the our GraphQL query as variable. Which in turn gets passed on to another system (b) for use.

Can you go into more detail here? If (a) and (b) disagree on what the shape of the interpretation variable is, then there needs to be some source of truth about how to convert the output of (a) to the input of (b).

In this scenario, the variable is passed to (b), so I would expect the GraphQL input type system to expose the requirements for the input of (b). On the client, then, where we have the output of (a), we would now be responsible for converting to the (b) format.

It's possible that the right behavior is to drop the extra fields, but I don't think that's always categorically the case. If it is the case here, then I think we should do that on the client and send a valid (b) input up. Because we have introspection, the client can always know what the expected shape of the input is; assuming the server doesn't make a breaking change, that expected shape of the input is always valid, and so the client should coerce whatever it is given into the input shape as it sees fit.

On the other hand, the right behavior might not be to drop the field. If our input on the client has height, width, scale, and the server doesn't know about scale, then the behavior on the client might not be to ignore scale, it might be to multiply height and width by scale, and then remove it.

In either case, the client knew what the server was expecting from introspection; I think it should be responsible for coercing things into the shape, rather than the server making a best guess.

glan commented 8 years ago

Here is an overview of the architecture:

System (a) is responsible for creating the interpretation object. It can in theory send whatever it likes. (In our case (a) is a service written in go).
The GraphQL layer defines a schema for the interpretation input, but that may only be as subset of what (a) is sending.
GraphQL should throw an error if anything which is defined in the schema incorrectly typed or missing when typed as GraphQLNonNull from the input. Basically it's enforcing the supported known properties.
System (b) can safely use the schema defined properties, but should also be able unsafely use anything else that got passed though that wasn't in the schema.
System (a), the GraphQL layer and system (b) could all be running on different revisions.

As an alternative approach we did think about the following options:

a) Serialize the contents of the interpretation as a string and make it opaque to the GraphQL layer. b) Pre-filtering the output of (a) against the defined GraphQL interpretation schema before passing in as variable.

helfer commented 8 years ago

@glan: This discussion seems to be going in the direction of #324, so you might be interested in reading that, and the corresponding changes to the spec.

I can really see it go both ways:

On the one hand, there are advantages for clients to be able to send queries they are guessing might be valid, and get back as much as can be unambiguously interpreted. This has the advantage that the client doesn't need to know the schema and doesn't need to do validation.
On the other hand, it can be beneficial to have a server that strictly validates the schema and doesn't let you run queries that don't make any sense given the schema, because being too lenient could lead to a situation similar to what happened in the early browser wars, where browsers started to accept malformed HTML and tried their best to render it the way it was intended, which lead to fragmentation.

Maybe in the end the answer depends on the particular use case. Until there's a use-case for which there is no better solution than to relax the rules, it probably makes sense to keep them strict.

glan commented 8 years ago

I've opened this PR: https://github.com/graphql/graphql-js/pull/343

which introduces tolerateAdditionalProps: true on the input type schema which relaxes the checking slightly.

JeffRMoore commented 8 years ago

In the past I built an API with this behavior and came to regret it. I received complaints from developers when they encountered the typo problem. I could not change the behavior after clients came to rely on it.

If the extra fields are not stripped BEFORE resolving (pass through), that opens a new attack surface that schema developers must become aware of in their resolve functions.

I wouldn't consider this behavior a good practice because of the error message issue, unnecessary transmission of data over the network, and increased attack surface.

JeffRMoore commented 8 years ago

Let me elaborate on the change case. Client sends 'alt' field which makes sense in their interpretation of the object. Server ignores it. All is good. Later, server adds an 'alt' field, which makes sense in their interpretation of the object. The definitions differ. Now the server has broken the client. Better to resolve this up front when the client is developed. If not, any important client that sends extra data has now "reserved" a name on the server. One now has to perform analytics to add a new field to avoid breaking clients on a field add, even for an optional field.

glan commented 8 years ago

@JeffRMoore Thanks for the feedback you make a good point. I think the use case for this is a little atypical. The system which creates the input object isn't a client, it's a service which I'd like to to own what the full specification of the object is. We could think of the GraphQL schema for the input object as more like an interface.

glan commented 8 years ago

@leebyron

I was just looking at alternative approach by creating an Any object type in my schema. @mlp5ab is this similar to what you guys were doing?

const GraphQLAny = new GQL.GraphQLScalarType({
  name: 'Any',
  description: 'Any object',
  serialize(x) {
    return JSON.stringify(x);
  },
  parseValue(x) {
    return JSON.parse(x);
  },
  parseLiteral(ast) {
    let obj = {};
    ast.fields.forEach(x => obj[x.name.value] = x.value.value);
    return obj;
  }
});

lisbakke commented 7 years ago

It looks like the PR has gone silent. Is there still a plan to address this issue?

Thanks!

wasauce commented 7 years ago

I'd love to have this type of functionality for my project -- are there any updates on if this PR might land? Or other suggested work arounds? Thank you!

leebyron commented 7 years ago

This is being tracked in the specification since the change would break spec compatibility - https://github.com/facebook/graphql/issues/235

IvanGoncharov commented 6 years ago

As Lee pointed out it required changes to the Specification and should be discussed in facebook/graphql#235

pankaj-crypto commented 4 years ago

Was this ever taken care of?

jakwuh commented 3 years ago

For those who is looking for solution:

new ApolloClient({
        link: concat(cleanVariablesMiddleware, ...),

const cleanVariablesMiddleware = new ApolloLink((operation, forward) => {
    operation.variables = cleanVariables(operation.query, operation.variables)

    return forward(operation);
  })

import { getOperationDefinitionOrDie } from 'apollo-utilities'
import { DocumentNode, IntrospectionType } from 'graphql'
import { isObjectLike, reduce } from 'lodash'

import {
  getIntrospectionTypeFromInputTypeRef,
  getIntrospectionTypeFromTypeNode,
  isIntrospectionInputObjectType,
} from './utils'

type Variables = Record<string, any>

const reduceVariables = <T>(
  items: readonly T[],
  getVarName: (item: T) => string,
  getIntrospectionType: (item: T) => [IntrospectionType, boolean],
  variables: any,
) => {
  if (!variables || !isObjectLike(variables)) return variables

  return reduce(items, (vars: Variables, item: T) => {
    const varName = getVarName(item)
    const [type, isList] = getIntrospectionType(item)

    if (!(varName in variables)) return vars

    return {
      ...vars,
      [varName]: extractInputVariables(type, variables[varName], isList),
    }
  }, {})
}

const extractInputVariables = (type: IntrospectionType, variables: any, isList: boolean): Variables => {
  if (isList) {
    if (Array.isArray(variables)) return variables.map(v => extractInputVariables(type, v, false))
    else return variables
  }

  if (!isIntrospectionInputObjectType(type)) return variables

  return reduceVariables(
    type.inputFields,
    inputField => inputField.name,
    inputField => getIntrospectionTypeFromInputTypeRef(inputField.type),
    variables,
  )
}

export const cleanVariables = (query: DocumentNode, variables?: Variables) => {
  const { variableDefinitions = [] } = getOperationDefinitionOrDie(query)

  return reduceVariables(
    variableDefinitions,
    variableDefinition => variableDefinition.variable.name.value,
    variableDefinition => getIntrospectionTypeFromTypeNode(variableDefinition.type),
    variables,
  )
}

import { NamedTypeNode, TypeNode, VariableDefinitionNode } from 'graphql'
import { IntrospectionInputObjectType, IntrospectionType } from 'graphql'
import {
  IntrospectionInputType,
  IntrospectionInputTypeRef,
  IntrospectionNamedTypeRef,
} from 'graphql'
import { get } from 'lodash'

export const isNamedTypeNode = (definition: TypeNode | VariableDefinitionNode): definition is NamedTypeNode => {
  return definition.kind === 'NamedType';
}

export const isListTypeNode = (definition: TypeNode | VariableDefinitionNode): definition is NamedTypeNode => {
  return definition.kind === 'ListType';
}

export const isIntrospectionNamedTypeRef = (inputTypeRef: IntrospectionInputTypeRef): inputTypeRef is IntrospectionNamedTypeRef<IntrospectionInputType> => {
  return Boolean(get(inputTypeRef, 'name'))
}

export const isIntrospectionListTypeRef = (inputTypeRef: IntrospectionInputTypeRef): inputTypeRef is IntrospectionNamedTypeRef<IntrospectionInputType> => {
  return inputTypeRef.kind === 'LIST';
}

export const isIntrospectionInputObjectType = (type: IntrospectionType): type is IntrospectionInputObjectType => {
  return type.kind === 'INPUT_OBJECT'
}

import { IntrospectionType, TypeNode } from 'graphql'
import { IntrospectionInputTypeRef } from 'graphql/utilities/getIntrospectionQuery'
import { keyBy } from 'lodash'

import introspection from '~/generated/introspection.json'

import { isIntrospectionListTypeRef, isIntrospectionNamedTypeRef, isListTypeNode, isNamedTypeNode } from './validators'

const types: Record<string, IntrospectionType> = keyBy(introspection.__schema.types as IntrospectionType[], 'name')

export const getIntrospectionTypeFromTypeNode = (definition: TypeNode, isList = false): [IntrospectionType, boolean] => {
  if (isNamedTypeNode(definition)) return [types[definition.name.value], isList]

  return getIntrospectionTypeFromTypeNode(definition.type, isList || isListTypeNode(definition))
}

export const getIntrospectionTypeFromInputTypeRef = (inputTypeRef: IntrospectionInputTypeRef, isList = false): [IntrospectionType, boolean] => {
  if (isIntrospectionNamedTypeRef(inputTypeRef)) return [types[inputTypeRef.name], isList]

  return getIntrospectionTypeFromInputTypeRef(inputTypeRef.ofType, isList || isIntrospectionListTypeRef(inputTypeRef))
}

This code automatically removes all excess variables from the query/mutation input.

In order to build introspection.json use graphql-codegen

  frontend/src/generated/introspection.json:
    plugins:
      - introspection

graphql / graphql-js

Less strict validation of input objects #303