Borduhh opened this issue 2 years ago
Hi @Borduhh 👋 Regarding `IntrospectAndCompose` and central caching of schema files: we generally recommend shifting composition left, out of each gateway's runtime in production and into your build pipeline, so that you generate a single static supergraph schema that can be deployed to each gateway. This helps with a variety of things, as described below.
In general, once a given subgraph (fleet) is available to serve an updated schema, it's published to the schema registry using `rover subgraph publish`, which can accept the output of `rover subgraph introspect`. This can be done with a pipeline of the form `rover subgraph introspect | rover subgraph publish`.
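For example, a CI step along these lines (the graph ref, subgraph name, and URLs are placeholders, not taken from this issue):

```sh
# Introspect the running subgraph and pipe its SDL straight into a publish.
# Requires registry credentials, e.g. an APOLLO_KEY environment variable.
rover subgraph introspect http://products.internal:4001/graphql \
  | rover subgraph publish my-graph@production \
      --name products \
      --routing-url http://products.internal:4001/graphql \
      --schema -
```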
See the Federation docs for details:
> ⚠️ We strongly recommend against using `IntrospectAndCompose` in production. For details, see Limitations of `IntrospectAndCompose`.
>
> The `IntrospectAndCompose` option can sometimes be helpful for local development, but it's strongly discouraged for any other environment. Here are some reasons why:
>
> - **Composition might fail.** With `IntrospectAndCompose`, your gateway performs composition dynamically on startup, which requires network communication with each subgraph. If composition fails, your gateway throws errors and experiences unplanned downtime. With the static or dynamic `supergraphSdl` configuration, you instead provide a supergraph schema that has already been composed successfully. This prevents composition errors and enables faster startup.
> - **Gateway instances might differ.** If you deploy multiple instances of your gateway while deploying updates to your subgraphs, your gateway instances might fetch different schemas from the same subgraph. This can result in sporadic composition failures or inconsistent supergraph schemas between instances. When you deploy multiple instances with `supergraphSdl`, you provide the exact same static artifact to each instance, enabling more predictable behavior.
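For reference, a minimal sketch of the static `supergraphSdl` setup those docs recommend, assuming Apollo Server 3 and a supergraph file produced in the build pipeline (the file path is a placeholder):

```ts
import { readFileSync } from 'fs';
import { ApolloServer } from 'apollo-server';
import { ApolloGateway } from '@apollo/gateway';

// The statically composed supergraph (e.g. produced by `rover supergraph compose`
// or fetched from the registry at build time). The path is a placeholder.
const supergraphSdl = readFileSync('./supergraph.graphql', 'utf-8');

// No runtime introspection or composition: the gateway starts from a
// known-good artifact, so every instance serves the same schema.
const gateway = new ApolloGateway({ supergraphSdl });

const server = new ApolloServer({ gateway });
server.listen().then(({ url }) => console.log(`Gateway ready at ${url}`));
```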
What you have with the Gateway `willSendRequest` looks right, and you could do something similar in your pipeline deployment script, generating the AWS SigV4 headers and passing them to `rover subgraph introspect --header`; that should work for your scenario today.
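For instance, with header values produced by the signing step in your deploy script (the URL and values below are placeholders):

```sh
# The AppSync URL and SigV4 header values are placeholders generated by
# whatever signs the request in your pipeline.
rover subgraph introspect https://<api-id>.appsync-api.us-east-1.amazonaws.com/graphql \
  --header "Authorization: <sigv4-authorization-header>" \
  --header "X-Amz-Date: <timestamp>" \
  --header "X-Amz-Security-Token: <session-token>"
```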
Expansion of issue https://github.com/apollographql/federation/issues/349#issuecomment-1104128473
We are trying to use Apollo Federation with AWS services (e.g., AppSync) and have the following constraints, which might apply to a lot of other companies.
IAM Support for Apollo Studio
We cannot use Apollo Studio because all of our services are created and authenticated using AWS IAM. It would be nice if we could give Apollo Studio an access key ID and secret from an IAM role that would be used to authenticate all of our requests. Right now we do that manually like so:
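A rough sketch of the shape (not the original snippet; it assumes AppSync subgraphs, the AWS SDK v3 signing packages, and that the signed body matches exactly what the gateway serializes):

```ts
import { RemoteGraphQLDataSource } from '@apollo/gateway';
import { SignatureV4 } from '@aws-sdk/signature-v4';
import { HttpRequest } from '@aws-sdk/protocol-http';
import { Sha256 } from '@aws-crypto/sha256-js';
import { defaultProvider } from '@aws-sdk/credential-provider-node';

// Illustrative only: signs each outgoing subgraph request with IAM credentials (SigV4).
// The service name, region handling, and body serialization are assumptions.
class IamSignedDataSource extends RemoteGraphQLDataSource {
  private signer = new SignatureV4({
    service: 'appsync',
    region: process.env.AWS_REGION ?? 'us-east-1',
    credentials: defaultProvider(),
    sha256: Sha256,
  });

  async willSendRequest({ request }: { request: any }) {
    const { hostname, pathname } = new URL(this.url!);
    // The body signed here must be byte-for-byte what the gateway actually sends.
    const body = JSON.stringify({
      query: request.query,
      variables: request.variables,
      operationName: request.operationName,
    });
    const signed = await this.signer.sign(
      new HttpRequest({
        method: 'POST',
        hostname,
        path: pathname,
        headers: { host: hostname, 'content-type': 'application/json' },
        body,
      }),
    );
    // Copy the SigV4 headers (Authorization, X-Amz-Date, ...) onto the request.
    for (const [name, value] of Object.entries(signed.headers)) {
      request.http?.headers.set(name, value);
    }
  }
}
```

In a gateway this would be wired up through the `buildService` option, e.g. `buildService: ({ url }) => new IamSignedDataSource({ url })`.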
`IntrospectAndCompose` is all or nothing

Right now, the `IntrospectAndCompose.initialize()` method fails completely if even one service has a network timeout, which makes it almost impossible to use in production scenarios. Each service we add to our gateway increases the likelihood of a network error that cancels the entire process, inevitably causing downtime or CI/CD failures.

To solve this, it would be rather easy to have `loadServicesFromRemoteEndpoint()` process schema fetching on a per-service basis. This could be done efficiently by wrapping `dataSource.process()` with a retry counter and retrying `5xx` errors. That way the user can choose how many times to retry before `IntrospectAndCompose` fails altogether and rolls back.

Right now we are manually adding retries around the entirety of `IntrospectAndCompose`, but as we add more services this becomes really inefficient (e.g., if we have 150 services and service 148 fails, we still need to re-fetch services 1 through 147 on the next attempt).
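A rough sketch of the per-service retry idea, as an illustration only (nothing below is part of `@apollo/gateway`; the helper names, the retry policy, and the use of the built-in `fetch` are assumptions):

```ts
type Subgraph = { name: string; url: string };

// The federation `_service { sdl }` query that introspection-based composition relies on.
const SDL_QUERY = JSON.stringify({ query: '{ _service { sdl } }' });

// Fetch one subgraph's SDL, retrying transient 5xx responses a few times.
// Backoff between attempts is omitted for brevity.
async function fetchSdlWithRetry(subgraph: Subgraph, maxAttempts = 3): Promise<string> {
  for (let attempt = 1; ; attempt++) {
    const response = await fetch(subgraph.url, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: SDL_QUERY,
    });
    if (response.status >= 500 && attempt < maxAttempts) continue; // retry 5xx only
    if (!response.ok) {
      throw new Error(`${subgraph.name}: schema fetch failed (${response.status})`);
    }
    const { data } = await response.json();
    return data._service.sdl as string;
  }
}

// Each subgraph is fetched and retried independently, so one flaky service
// no longer forces re-fetching every other schema on the next attempt.
async function fetchAllSdls(subgraphs: Subgraph[]) {
  return Promise.all(
    subgraphs.map(async (s) => ({ name: s.name, sdl: await fetchSdlWithRetry(s) })),
  );
}
```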
Central Caching Schema Files

This isn't something that necessarily needs to be done by Apollo, but it is something that microservice setups require. Our team currently uses S3 to cache a schema file, since in our case we can be relatively confident that it won't change without the services being redeployed. The first (and sometimes second) ECS container that comes online builds its own schema using `IntrospectAndCompose` and then stores the cached file with a unique per-deployment ID that the other containers can use to fetch the cached schema when they scale.
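A sketch of that pattern (the bucket name, key scheme, `DEPLOY_ID` variable, subgraph list, and a recent AWS SDK v3 are illustrative assumptions):

```ts
import { S3Client, GetObjectCommand, PutObjectCommand } from '@aws-sdk/client-s3';
import { ApolloGateway, IntrospectAndCompose } from '@apollo/gateway';

// Placeholders: bucket, key scheme, and DEPLOY_ID are illustrative only.
const s3 = new S3Client({});
const Bucket = 'my-schema-cache';
const Key = `supergraph-${process.env.DEPLOY_ID}.graphql`;

async function loadCachedSupergraph(): Promise<string | undefined> {
  try {
    const { Body } = await s3.send(new GetObjectCommand({ Bucket, Key }));
    return await Body?.transformToString();
  } catch {
    return undefined; // No schema cached for this deployment yet.
  }
}

export async function buildGateway(): Promise<ApolloGateway> {
  const cached = await loadCachedSupergraph();
  if (cached) {
    // Later containers in the same deployment reuse the cached schema.
    return new ApolloGateway({ supergraphSdl: cached });
  }
  // The first container composes via introspection, then caches the result
  // so the instances that scale up afterwards don't have to.
  const gateway = new ApolloGateway({
    supergraphSdl: new IntrospectAndCompose({
      subgraphs: [{ name: 'products', url: 'http://products.internal/graphql' }],
    }),
  });
  gateway.onSchemaLoadOrUpdate(({ coreSupergraphSdl }) => {
    void s3.send(new PutObjectCommand({ Bucket, Key, Body: coreSupergraphSdl }));
  });
  return gateway;
}
```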