apollographql / apollo-rs

Spec compliant GraphQL Tools in Rust.
Apache License 2.0
574 stars 45 forks source link

compiler: two separate DirectiveList types are inconvenient #851

Open SimonSapin opened 6 months ago

SimonSapin commented 6 months ago

In apollo-compiler version 1.0.0-beta.15 we have a schema::DirectiveList type used for the directive applications on the schema definition or of a type definition, and a different ast::DirectiveList type for everything else. This difference is often unexpected, and an obstacle e.g. to write a function taking a directive list as a parameter.

Background

apollo_compiler::ast contains a GraphQL representation that closely follows syntax. A Document contains a Vec of Definitions, where e.g. a ObjectTypeExtension and its corresponding ObjectTypeDefinition are separate and can appear at any positions.

apollo_compiler::schema provides a higher-level representation of a schema / type system. The main differences from AST are:

However there’s at least one case where a user does care. The Federation directive @shareable applies to field definitions, but as a shortcut can be put on a type declaration to implicitly apply to all of its fields. But that implicit "propagation" only happens for the syntactic block where it’s used:

type A {
    field1: Int
}
type B @shareable {
   field1: Int
}
extend type A @shareable {
    field2: Int
}
extend type B {
   field2: Int
}

Is equivalent to:

type A {
    field1: Int
    field2: Int @shareable
}
type B {
   field1: Int @shareable
   field2: Int
}

Status quo

As of beta 15 most things are wrapped in a Node<_> smart pointer, but things that can be contributed by an extension are instead wrapped in Component<_> which adds a ComponentOrigin. A type-level @shareable applies to a given field iff their origins compare equal:

const SHAREABLE: &str = "shareable";

let schema = Schema::parse(schema, "schema.graphql").unwrap();
for ty in schema.types.values() {
    match ty {
        ExtendedType::Object(ty) => for_ty(&ty.name, &ty.directives, &ty.fields),
        ExtendedType::Interface(ty) => for_ty(&ty.name, &ty.directives, &ty.fields),
        _ => {}
    }
}

fn for_ty(
    ty_name: &schema::Name,
    ty_directives: &schema::DirectiveList,
    fields: &IndexMap<schema::Name, schema::Component<schema::FieldDefinition>>,
) {
    let shareable_origin = ty_directives.get(SHAREABLE).map(|d| d.origin.clone());
    for field in fields.values() {
        if field.directives.has(SHAREABLE)
            || shareable_origin.as_ref().is_some_and(|o| *o == field.origin)
        {
            println!("Shareable: {}.{}", ty_name, field.name)
        }
    }
}

Type directives and field definitions are components, but field directives are not. So we end up with two DirectiveList types wrapping either Vec<Component<Directive>> or Vec<Node<Directive>>. The majority of cases that don’t care about origins still have to deal with that separation.

It’s possible to write generic code, either:

Can we improve on this?

Alternative 1: add a trait

We could have a trait implemented by both DirectiveList types that has .get(name) and other methods they have in common. But what should this trait be named, and what module should it be in?

A trait needs to be imported specifically to be used, so it’s less discoverable than inherent methods

Alternative 2: directives are always components

Have a single DirectiveList type that contains Vec<Component<Directive>>. In a lot of cases (including AST and ExecutableDocument) we end up with a ComponentOrigin that is part of the data structure but not meaningful.

Alternative 3: directives are never components

Have a single DirectiveList type that contains Vec<Node<Directive>>. On type/schema definitions where directive origins are meaningful, store them out of line in a separate Vec<ComponentOrigin> that should be zipped with the directive list.

As most use cases don’t care about origins, when a directive list is mutated it is likely that origins will be left out of sync. If directives are only added we (e.g. in serialization code) can manage by using zip_longest instead of zip, but if a directive is removed the position correspondence is broken for any directive that comes after.

There is some precedent for allowing users to create something inconsistent: in https://github.com/apollographql/apollo-rs/issues/708 we added a name struct field redundant with map keys. Omitting it would make the compiler enforce this consistency, but at the cost of inconvenience.

Alternative 4: remove Component and ComponentOrigin altogether

A more radical change would be to decide that the high-level Schema representation does not preserve extensions at all. The specific case of @shareable can be dealt with at the AST level by moving relevant directives to field definitions:

let mut doc = ast::Document::parse(schema, "schema.graphql").unwrap();
for def in &mut doc.definitions {
    match def {
        Definition::ObjectTypeDefinition(def) => {
            let ty = def.make_mut();
            for_def(&ty.directives, &mut ty.fields)
        }
        Definition::InterfaceTypeDefinition(def) => {
            let ty = def.make_mut();
            for_def(&ty.directives, &mut ty.fields)
        }
        Definition::ObjectTypeExtension(def) => {
            let ty = def.make_mut();
            for_def(&ty.directives, &mut ty.fields)
        }
        Definition::InterfaceTypeExtension(def) => {
            let ty = def.make_mut();
            for_def(&ty.directives, &mut ty.fields)
        }
        _ => {}
    }
}

fn for_def(ty_directives: &ast::DirectiveList, fields: &mut [Node<ast::FieldDefinition>]) {
    if let Some(shareable) = ty_directives.get(SHAREABLE) {
        for field in fields {
            if !field.directives.has(SHAREABLE) {
                field.make_mut().directives.push(shareable.clone())
            }
        }
    }
}

let schema = doc.to_schema().unwrap();