gramps-graphql / gramps--legacy

The core data source combination engine of GrAMPS.
https://gramps.js.org
MIT License
197 stars 13 forks source link

Should we add a way to pass in external executable schemas? #47

Open jlengstorf opened 6 years ago

jlengstorf commented 6 years ago

Porting the discussion on external executable schemas to its own issue so we don't lose track of it. Original discussion is copy-pasted below:

@mfix22 https://github.com/gramps-graphql/gramps/issues/26#issuecomment-352119917

@jlengstorf I think this is a very neat way to merge data sources. Unless I am reading this wrong, the only use case we have that this does not cover is for linking in a remote executable schema:

makeRemoteExecutableSchema({
  schema: builtSchema,
  link: HttpLink, // which points at my remote schema
});

Technically we could stitch this remote schema in after gramps() has been called but that seems messy.

@jlengstorf https://github.com/gramps-graphql/gramps/issues/26#issuecomment-352122456

@mfix22 Yeah, that's a limitation of GrAMPS — in order to keep the data sources simple, we need to run all data sources through GrAMPS to get an executable schema. I suppose it would be possible to add an extra option to allow additional executable schemas to be passed in.

Right now, we create an executable schema right before returning.

We could modify this so you'd call something like:

const external = makeRemoteExecutableSchema({
  schema: builtSchema,
  link: HttpLink, // which points at my remote schema
});

const GraphQLOptions = gramps({
  dataSources: [XKCD],
  externalSchemas: [external],
});

And then we'd modify the gramps() function to spread that new prop as well:

  const schema = mergeSchemas({
    schemas: [...schemas, ...externalSchemas, ...linkTypeDefs],
    resolvers,
  });

Theoretically this would Just Work™ and would even allow schema stitching between GrAMPS data sources and remote executable schemas.

However, I haven't worked with remote schemas yet, so I'm not sure if this would do what I think it'll do. Let me know if this looks like it'll work and I'll push a new beta package so you can try it out.

@mfix22 https://github.com/gramps-graphql/gramps/issues/26#issuecomment-352225241

@jlengstorf This is exactly how I am currently handling the use case! I am pretty sure it will Just Work™ 😄 I think that option would fill in the gap perfectly while keeping data sources simple.

@kbrandwijk https://github.com/gramps-graphql/gramps/issues/26#issuecomment-352256297

@mfix22 @jlengstorf An alternative approach would be to keep the gramps part contained to a single schema, and leave the stitching of other schemas to other emerging tooling (separation of concerns). I don't think this is messy at all.

@jlengstorf https://github.com/gramps-graphql/gramps/issues/26#issuecomment-352257476

@kbrandwijk I agree, but I wonder if it's worth including this for simple use cases since it's really just two lines of code. I may end up eating these words later, but I can't really see how this could cause problems later on.

I do agree that advanced use cases should probably be handled outside of GrAMPS, but I see a good use case in something like this:

  • I have a gateway that's built of my own custom data sources
  • I'm stitching them together with GrAMPS
  • I have a new requirement that means I need to include GitHub repos for users
  • I add the GitHub schema as a remote schema and stitch it together with my User type via GrAMPS

If it went too far beyond that, other tools would be easier/recommended, but I can see this being valuable.

Any objections to including it?

@kbrandwijk https://github.com/gramps-graphql/gramps/issues/26#issuecomment-352258460

@jlengstorf Well, I guess it wouldn't hurt anyone, but I would strongly advise against making this a best practice. Mainly because in most use cases, it will involve more than just calling makeRemoteExecutableSchema. For most endpoints, there's authentication, there's .graphqlconfig.yml for defining your endpoints, there's not wanting the entire remote GraphQL schema to become part of your final schema, etc.

@jlengstorf https://github.com/gramps-graphql/gramps/issues/26#issuecomment-352258843

Hmmm... that's a pretty solid argument for not including it. I haven't worked on remote schemas, so I'm pretty much going with the experience of the people using it.

@mfix22, do you have a counterargument here?

If @kbrandwijk's point holds true for all but the most trivial use cases, it does seem smart to leave out the feature and instead add an example of how to do this outside of GrAMPS. Otherwise I'd worry we're inadvertently introducing something we'd have to either deprecate later or add a big warning recommending people don't use it. (And if either of those is true, we may as well support the proper path out of the gate.)

Thoughts?

@mfix22 https://github.com/gramps-graphql/gramps/issues/26#issuecomment-352622460

I do see @kbrandwijk's point about not wanting to expose the entire 3rd party's schema as a definite concern here. That was a question I still want to ask one of the Apollo guys: what is the best way to expose part of a schema, especially a remote one?

As for the other concerns, GrAMPs already supports adding context and you can still add headers to handle authorization use-cases (more Express middleware for example).

For initial release, i don't think GrAMPS needs to support remote executable schemas. The escape hatch to include them might help some users though. What is nice is that including them doesn't hurt the users that don't.

Maybe the best argument for not including them at first is that once they are included, it would be a breaking change to remove them, but that is not true the other way around (once we know more about exposing a subset of a schema).

@jlengstorf https://github.com/gramps-graphql/gramps/issues/26#issuecomment-352641566

@mfix22 The Graphcool team just released a pretty interesting article on exposing subsets of underlying GraphQL schemas. There's a video that shows an approach to controlling data access. Check it out: https://blog.graph.cool/graphql-databases-a-preview-into-the-future-of-graphcool-c1d4981383d9

Maybe we can pull in some Apollo folks to weigh in? @stubailo, @peggyrayzis, or @jbaxleyiii — any thoughts on best practices for managing remote schemas as laid out in @kbrandwijk and @mfix22's comments above? (Also, 👋.)

lfades commented 6 years ago

I would love to see this because I want to create a project using multiple microservices with micro-js. are PRs welcome ?

jlengstorf commented 6 years ago

@Goluis Absolutely PRs are welcome!

Could you provide some pseudo-code or a use case for the external schemas? Specifically, how are you putting this together that doesn't work with the typical GrAMPS approach of putting the data sources together into a single executable schema?

(I'm asking so I can understand how other devs are structuring their code, not to try and shoot you down.)

kbrandwijk commented 6 years ago

@jlengstorf Can you explain to me how the stitching part works exactly. I assume that is for stitching multiple datasources together? But I'm wondering why the stitching part is defined on a single datasource. Maybe I'm missing something. The reason I'm asking it here is because of its relationship to merging with other schemas (like the external schemas discussed here).

lfades commented 6 years ago

@jlengstorf of course!

Currently data-sources are supposed to be merged with a gateway as shown here and they can also run their own gateway, but in production is the app gateway the one who merges the schemas and deploys.

Using remote schemas two things change:

  1. data-sources can be deployed too
  2. the gateway merges a remote schema

Some benefits of this approach:

Note: some benefits from microservices came from zeit docs, which offers a very easily implementation of microservices at scale

jlengstorf commented 6 years ago

@kbrandwijk The idea behind stitching in GrAMPS is, in a nutshell:

  1. I have two data sources that need to share data: User and Post
  2. I set Post as a peer dependency of User (and vice versa)
  3. The stitching prop of the User data source defines how the User type will expose data from the Post data source (and vice versa)

The idea behind this is to allow complex schemas to be broken up into data sources without forcing them to become silos.

I don't believe this would interfere with merging into other schemas, but we haven't tested that scenario yet.

@Goluis I think I see what you're going for, but I do want to clarify that the data sources are not independently deployable. The CLI will stand up a simple gateway for development, but the intention is that those data sources will ultimately be consumed by a GrAMPS gateway.

That being said, you could certainly put multiple GrAMPS gateways into production as independent µ-services and merge them via remote schemas into a main gateway.

One thing that GrAMPS is set up to do, though, is allow data sources to be included in multiple places.

So let's take those zeit docs as an example:

To start, we have the UI µ-service and a GraphQL endpoint:

The GraphQL endpoint has three data sources:

As the product scales, you decide to split into two GraphQL µ-services:

Let’s assume both Post and Register rely on User for normal operation. Since GrAMPS data sources can be installed on multiple gateways, there's nothing to stop you from including the User data source on both µ-services:

These wouldn't need to be stitched together at all, and you should get the same end result.

Am I understanding your use case correctly, or is there something I'm missing?

lfades commented 6 years ago

the difference is that if the data-source for User is changed it will need to be updated in graphql.example.org and graphql-register.example.org, but if User were a microservice with a remote schema (graphql-user.example.org for example) and the other gateways are using a link to its schema then only graphql-user.example.org will need to be updated

jlengstorf commented 6 years ago

Ah, good point.

I don't have any solid argument for not adding support for external schemas. And @kbrandwijk is thumbs-upping, which I assume means he's onboard.

@Goluis Does the API addition laid out in https://github.com/gramps-graphql/gramps/issues/47#issue-283307728 seem like it's going to work for your use case? If so, want to submit the PR? 😄

kbrandwijk commented 6 years ago

@jlengstorf Well, onboard with the reasoning from @Goluis that you don't want to include it in both, and having to update in multiple places. I'm still not 100% sure about the external schemas. But I think it comes down to the positioning of GrAMPS. Do you see it as a completely self-sufficient tool, then you need this as well. Do you see it as part of a toolchain, than I would focus on the main responsibilities of the package.

jlengstorf commented 6 years ago

@kbrandwijk I guess I'd make the argument that the maintainers of GrAMPS can't really make that decision for the devs using it. 😕

I can see it working in both contexts, depending on the structure of a given team. Maybe you could help us put together some architectural considerations for using external schemas (e.g. all the things that make you uneasy, so other devs can be aware of potential pitfalls)? Maybe we could work with @schickling to put this on https://www.advancedgraphql.com?

kbrandwijk commented 6 years ago

I think we should do a boilerplate for GrAMPS, to be used by graphql create, so we can figure out how these tools fit together. That worked wonders to communicate the concept with the existing boilerplates, and offers a great starting point for devs. Maybe we can work on that together? Feel free to ping me on Slack.

schickling commented 6 years ago

Maybe we could work with @schickling to put this on https://www.advancedgraphql.com?

This would be awesome! @nikolasburk is planning to give Advanced GraphQL a full rehaul in January. Would be great if you'd like to get involved!

Also feel free to create PRs for it before then ofc!

lfades commented 6 years ago

Here's the apollo example of how to add multiple remote schemas working: https://launchpad.graphql.com/q5kq9z15p

I already tried to add one of the example endpoints in top of gramps without success (I don't know why it didn't work, just that the schemas weren't merged)

// useRemoteSchemas.js
import { mergeSchemas } from 'graphql-tools'

export default (remoteSchemas, grampsOptions) => async req => {
  const options = grampsOptions(req)
  const remotes = remoteSchemas.map(schema => schema())
  const schema = mergeSchemas({
    // The remote schemas are fine but no merge occurs
    schemas: [options.schema, ...await Promise.all(remotes)]
  })

  return Object.assign(options, { schema })
}

// server.js
// Note: Only the relevant code is included
import { introspectSchema, makeRemoteExecutableSchema } from 'graphql-tools'
import { HttpLink } from 'apollo-link-http'
import fetch from 'node-fetch'
import XKCD from '@gramps/data-source-xkcd'
import useRemoteSchemas from './useRemoteSchemas'

const RemoteSchema = async () => {
  const link = new HttpLink({
    uri: 'https://v7l45qkw3.lp.gql.zone/graphql',
    fetch
  })

  return makeRemoteExecutableSchema({
    schema: await introspectSchema(link),
    link
  })
}

const GraphQLOptions = useRemoteSchemas(
  [RemoteSchema],
  gramps({
    dataSources: [XKCD]
  })
)

app.use('/graphql', graphqlExpress(GraphQLOptions))

Including remote schemas right now will require changes to the api and probably context, btw check out this current discussion about possible incoming changes to stitching schemas

kbrandwijk commented 6 years ago

Adding remote schemas will be a lot easier with the changes proposed in #62 and #63.