apollographql / federation

🌐  Build and scale a single data graph across multiple services with Apollo's federation gateway.
https://apollographql.com/docs/federation/
Other
667 stars 254 forks source link

Federated field migration results in a full composition failure #750

Open AndrewRayCode opened 3 years ago

AndrewRayCode commented 3 years ago

Apollo-server 2.14.3

reproducible example

Clone this repository https://github.com/AndrewRayCode/federation-test which is based off the official https://github.com/apollographql/federation-demo

Using Node 16/latest

Install

npm install

Then start services

npm run start-services

Then start the gateway in another terminal

npm run start-gateway

Then open http://localhost:4000

Verify you can run this query:

query Me {
  me {
    id
    name
  }
}

Now open services/new/index.js and update the typedefs to add:

  extend type Member @key(fields: "id") {
    id: ID! @external
    name: String
  }

According to the documentation the service that defines a field last should continue to resolve it. Instead, the gateway raises a composition error:

This data graph is missing a valid configuration. [new] Member.name -> Field "Member.name" already exists in the schema. It cannot also be defined in this type extension. If this is meant to be an external field, add the @external directive.

If you run the same query again in the playground, the whole query errors out:

"message": "This data graph is missing a valid configuration. More details may be available in the server logs."

Using this local demo I am unable to validate the claim the docs make that apollo server gracefully handles field conflicts

AndrewRayCode commented 3 years ago

This may be a dupe of https://github.com/apollographql/federation/issues/420 ?

glasser commented 3 years ago

I'm not sure if it's a dupe, but this belongs in the federation repo; I'll move it there now.

estevaolucas commented 3 years ago

I have a similar issue.

on service A I have:

extend type Member @key(fields: "id") {
  id: ID! @external
  name: String
}

and service B:

type Member @key(fields: "id") {
  id: ID! @external
  name: String
}

so I get [new] Member.name -> Field "Member.name" already exists in the schema. It cannot also be defined in this type extension. If this is meant to be an external field, add the @external directive.

Then I add @external on name

type Member @key(fields: "id") {
  id: ID! @external
  name: String @external
}

By then I get Member.name -> is marked as @external but is not used by a @requires, @key, or @provides directive.

the spec doc doesn't mention anything like that https://www.apollographql.com/docs/federation/federation-spec/#external

noglitchyo commented 2 years ago

I encounter the exact same issue. @estevaolucas did you solve your problem?

estevaolucas commented 2 years ago

I encounter the exact same issue. @estevaolucas did you solve your problem?

No! I think Apollo federation 2 is going to solve that for us.

pcmanus commented 2 years ago

Apologies for the prior lack of responses.

Field migration is definitively an issue with how current federation works, and that is one of the things federation 2 will indeed improve upon. If you try the example of this issue against federation 2 alpha, you'll see that it composes without errors and works (though if you do try it, I'll note that during the period of time where the name migrated field is defined in both subgraphs ("old" and "new"), the query planner may resolve name using either version; which ones it picks depends on the exact details of the query (and we do plan on adding https://github.com/apollographql/federation/issues/1177 in federation 2 to allow controlling this more precisely but it's not implemented yet)).

Regarding the current (0.x) federation and its documentation: the steps of the documentation should work more of less as described if you use managed federation, and if you don't want to use managed federation, it would be easier to use a "supergraph" schema in the gateway rather than using the serviceList argument. I'll note that using a "supergraph" schema is actually the setup now recommended in https://www.apollographql.com/docs/federation/gateway (and this by opposition to "composing with serviceList" described on the same page).

If you were to use a "supergraph" schema, it'll allow to decorelate updating subgraphs and updating the "view" of the subgraphs used by the gateway. To sum it up (and largely rephrasing the doc), you'd have to:

  1. modify the example to use a supergraph schema (which you'd have to first generate with rover, details in https://www.apollographql.com/docs/federation/gateway).
  2. once that runs, you can add the name field to the "new" subgraph and restart that subgraph. At that point you would not update the gateway: trying to compose the updated schema will fail with the error you've been seeing. So essentially, the gateway will continue running assuming the "new" subgraph does not have name even though it does. And in particular, name will still be resolved against the "old" subgraph.
  3. now you'd need to remove name from the schema of the "old" subgraph but without restarting the old subgraph to use that updated schema (because the gateway still resolve name from there). But what you can do now is compose the schema again with rover, and you can "deploy" the updated supergraph (restarting the gateway to use it).
  4. only then can you restart the "old" subgraph to use its new schema (with name removed) and complete the migration.

I'll note that managed federation works very similarily as it uses a supergraph schema as well.

But what if you want to use serviceList? Well, the only way I'm able to make this work is by using the experimental_pollInterval option of the gateway so that it polls subgraphs for updates. But that will earn you a warning on startup as using polling with serviceList is actually discouraged: while it happens to help here, it also make it easy to shoot yourself in the foot (and so I don't really recommend that option; switching to a supergraph schema is probably overall safer, if a tad more effort).

Anyway, if you do use experimental_pollInterval, then you can roughly use the steps of the documentation (add name to "new", wait for the poll interval, remove name from "old") but you should not restart the gateway while doing so. I mention this because the example linked in the description uses nodemon for the gateway and that silently restart the gateway, getting in the way. I'll note that after having added name to "new" you will get an error from the gateway but the gateway will essentially ignore it and continue running with the prior schema: this is why polling help here, if you restart the gateway, it fails composing the now updated schema but has nothing to fall back into.

Again, all this is far from ideal and that's why it's been reconsidered for federation 2 so that it works more "out of the box".

AndrewRayCode commented 2 years ago

@pcmanus I'm not sure what you're referring to with supergraphs and polling, https://www.apollographql.com/docs/federation/v2/entities/ doesn't mention experimental_pollInterval and still says that you can have "graceful" field migration without federation, which you can't. Could you review https://github.com/apollographql/federation/pull/751/files to update the incorrect documentation to show this has never been supported without managed federation?

From looking at the billing page, it doesn't look like the incorrect docs are directly related to users who purchase Apollo tools, which is good. It still seems like a gray area to have docs claiming you can do something that doesn't work, with incorrect docs for (months? years?) a system that you can buy a paid product on top of.

pcmanus commented 2 years ago

@AndrewRayCode : I understand the frustration and yes, the document is essentially incorrect on that front, and it's a shame it's been so for so long.

And to be clear, I'm not trying to suggest that the documentation isn't broken, it is, and it should be fixed.

But I am also saying that you can have field migrations even without managed federation, it just requires a different setup than the one currently described by the documentation, and that setup is even the one currently recommended by https://www.apollographql.com/docs/federation/gateway/ (which again, is not a managed setup).

I also mentioned that even in a setup where serviceList is used, you can technically do a field migration "if" you enable polling. Yes the polling part isn't mentioned in the doc, and it is not even a recommended setup (so even once the documentation is updated, this part may not get documented), but it "might" serve as a short-term workaround if used carefully, so wanted to at least mention it.

To sum up, totally agree the doc needs fixing, and I'm trying to help make that happen (but it may take a few days due to tanksgiving), but changing it to say that field migration cannot be done without managed federation is also not technically correct, so I'd rather get it accurate this time around.

As an aside, and just to make sure we avoid any confusion, you pointed to the documentation for federation 2 (the v2 part in the url), and that doc is definitively not up to date, and this because federation 2 is in alpha and most parts of the doc haven't been updated yet.

StephenBarlow commented 2 years ago

Hello, apologies for this long-standing error in the federation docs. The content has now been updated to reflect migration processes for different composition methods in Federation 1. (The Federation 2 alpha docs will receive a future update, but as @pcmanus mentions above, the migration process is generally much more friendly.)