Closed rezoled closed 5 months ago
A 10Mb supergraph schema doesn't sound like a lot. Suspect something else is going on. Are you using managed federation?
Not managed, running the router binary on my machine
Does increasing the log level give any clues? Otherwise I could see us adding some debug statements to try and narrow this down.
Are you running a stock binary or something that is customized?
Stock, I tried running with trace logs and didn’t see anything unusual. I’ll run it again and post the logs.
{"timestamp":"2023-08-04T10:11:23.613674Z","level":"INFO","message":"Apollo Router v1.26.0 // (c) Apollo Graph, Inc. // Licensed as ELv2 (https://go.apollo.dev/elv2)","target":"apollo_router::executable"}
{"timestamp":"2023-08-04T10:11:23.615220Z","level":"INFO","message":"Anonymous usage data is gathered to inform Apollo product development. See https://go.apollo.dev/o/privacy for details.","target":"apollo_router::executable"}
{"timestamp":"2023-08-04T10:11:23.645797Z","level":"INFO","message":"Running with *development* mode settings which facilitate development experience (e.g., introspection enabled)","target":"apollo_router::configuration::expansion"}
{"timestamp":"2023-08-04T10:11:23.659379Z","level":"DEBUG","message":"creating plugin factory","plugin_factory_name":"apollo.include_subgraph_errors","target":"apollo_router::plugin"}
{"timestamp":"2023-08-04T10:11:23.659448Z","level":"DEBUG","message":"creating plugin factory","plugin_factory_name":"apollo.rhai","target":"apollo_router::plugin"}
{"timestamp":"2023-08-04T10:11:23.659465Z","level":"DEBUG","message":"creating plugin factory","plugin_factory_name":"experimental.restricted","target":"apollo_router::plugin"}
{"timestamp":"2023-08-04T10:11:23.659469Z","level":"DEBUG","message":"creating plugin factory","plugin_factory_name":"experimental.broken","target":"apollo_router::plugin"}
{"timestamp":"2023-08-04T10:11:23.659644Z","level":"DEBUG","message":"creating plugin factory","plugin_factory_name":"apollo.headers","target":"apollo_router::plugin"}
{"timestamp":"2023-08-04T10:11:23.659657Z","level":"DEBUG","message":"creating plugin factory","plugin_factory_name":"apollo.coprocessor","target":"apollo_router::plugin"}
{"timestamp":"2023-08-04T10:11:23.659837Z","level":"DEBUG","message":"creating plugin factory","plugin_factory_name":"apollo.subscription","target":"apollo_router::plugin"}
{"timestamp":"2023-08-04T10:11:23.660013Z","level":"DEBUG","message":"creating plugin factory","plugin_factory_name":"apollo.forbid_mutations","target":"apollo_router::plugin"}
{"timestamp":"2023-08-04T10:11:23.660036Z","level":"DEBUG","message":"creating plugin factory","plugin_factory_name":"apollo.csrf","target":"apollo_router::plugin"}
{"timestamp":"2023-08-04T10:11:23.660043Z","level":"DEBUG","message":"creating plugin factory","plugin_factory_name":"apollo.authentication","target":"apollo_router::plugin"}
{"timestamp":"2023-08-04T10:11:23.660074Z","level":"DEBUG","message":"creating plugin factory","plugin_factory_name":"apollo.telemetry","target":"apollo_router::plugin"}
{"timestamp":"2023-08-04T10:11:23.660079Z","level":"DEBUG","message":"creating plugin factory","plugin_factory_name":"apollo.override_subgraph_url","target":"apollo_router::plugin"}
{"timestamp":"2023-08-04T10:11:23.660083Z","level":"DEBUG","message":"creating plugin factory","plugin_factory_name":"experimental.expose_query_plan","target":"apollo_router::plugin"}
{"timestamp":"2023-08-04T10:11:23.660102Z","level":"DEBUG","message":"creating plugin factory","plugin_factory_name":"apollo.traffic_shaping","target":"apollo_router::plugin"}
{"timestamp":"2023-08-04T10:11:23.660107Z","level":"DEBUG","message":"creating plugin factory","plugin_factory_name":"apollo.authorization","target":"apollo_router::plugin"}
{"timestamp":"2023-08-04T10:11:23.712806Z","level":"DEBUG","message":"starting","target":"apollo_router::state_machine"}
{"timestamp":"2023-08-04T10:11:23.714150Z","level":"DEBUG","message":"state machine event: UpdateLicense(Unlicensed), transitioned from: Startup to: Startup","target":"apollo_router::state_machine"}
{"timestamp":"2023-08-04T10:11:23.714819Z","level":"DEBUG","message":"state machine event: NoMoreLicense, transitioned from: Startup to: Startup","target":"apollo_router::state_machine"}
{"timestamp":"2023-08-04T10:11:23.714988Z","level":"INFO","message":"Running with *development* mode settings which facilitate development experience (e.g., introspection enabled)","target":"apollo_router::configuration::expansion"}
{"timestamp":"2023-08-04T10:11:23.726974Z","level":"DEBUG","message":"state machine event: UpdateConfiguration(<redacted>), transitioned from: Startup to: Startup","target":"apollo_router::state_machine"}
{"timestamp":"2023-08-04T10:11:24.116748Z","level":"TRACE","message":"recursion limit data","recursion_limit":"recursion limit: 4096, high: 0","target":"apollo_router::spec::schema"}
{"timestamp":"2023-08-04T10:11:24.540073Z","level":"DEBUG","message":"A valid Apollo license was not detected. However, no restricted features are in use.","target":"apollo_router::state_machine"}
{"timestamp":"2023-08-04T10:11:24.917883Z","level":"TRACE","message":"recursion limit data","recursion_limit":"recursion limit: 4096, high: 0","target":"apollo_router::spec::schema"}
{"timestamp":"2023-08-04T10:17:05.099384Z","level":"TRACE","message":"recursion limit data","recursion_limit":"recursion limit: 4096, high: 0","target":"apollo_router::spec::schema"}
{"timestamp":"2023-08-04T10:17:05.441648Z","level":"DEBUG","message":"configuring Apollo metrics","target":"apollo_router::plugins::telemetry::metrics::apollo"}
{"timestamp":"2023-08-04T10:17:05.443025Z","level":"DEBUG","message":"configuring Apollo tracing","target":"apollo_router::plugins::telemetry::tracing::apollo"}
{"timestamp":"2023-08-04T10:17:05.444802Z","level":"DEBUG","message":"plugins list: [\"apollo.include_subgraph_errors\", \"apollo.csrf\", \"apollo.headers\", \"apollo.telemetry\", \"apollo.traffic_shaping\", \"experimental.expose_query_plan\"]","target":"apollo_router::router_factory"}
{"timestamp":"2023-08-04T10:17:09.724676Z","level":"INFO","message":"Health check endpoint exposed at http://127.0.0.1:8088/health"}
{"timestamp":"2023-08-04T10:17:09.729015Z","level":"INFO","message":"GraphQL endpoint exposed at http://127.0.0.1:3333/graphql 🚀"}
{"timestamp":"2023-08-04T10:17:09.729477Z","level":"DEBUG","message":"extra endpoints the router listens to","tracing_endpoints":"http://127.0.0.1:8088"}
{"timestamp":"2023-08-04T10:17:09.730556Z","level":"INFO","message":"You're using some \"experimental\" features of the Apollo Router (those which have their configuration prefixed by \"experimental_\").\nWe may make breaking changes in future releases. To help us design the stable version we need your feedback.\nHere is a list of links where you can give your opinion:\n\n\t- experimental_response_trace_id: https://github.com/apollographql/router/discussions/2147\n\nFor more information about launch stages, please see the documentation here: https://www.apollographql.com/docs/resources/product-launch-stages/"}
{"timestamp":"2023-08-04T10:17:09.891406Z","level":"DEBUG","message":"state machine event: UpdateSchema(<redacted>), transitioned from: Startup to: Running"}
{"timestamp":"2023-08-04T10:17:10.134615Z","level":"TRACE","message":"recursion limit data","recursion_limit":"recursion limit: 4096, high: 0"}
{"timestamp":"2023-08-04T10:17:10.558514Z","level":"DEBUG","message":"sending anonymous usage data to Apollo"}
{"timestamp":"2023-08-04T10:17:10.629829Z","level":"DEBUG","message":"transmitting anonymous analytics: {\n \"session_id\": \"9d99f55a-ab4f-46cf-8db6-759c933ef043\",\n \"version\": \"1.26.0\",\n \"platform\": {\n \"os\": \"macos\",\n \"continuous_integration\": null\n },\n \"usage\": {\n \"args.log_level.<redacted>\": 1,\n \"configuration.headers.all.request.propagate.len\": 1,\n \"args.supergraph_path.<redacted>\": 1,\n \"configuration.telemetry.tracing.experimental_response_trace_id.enabled.true\": 1,\n \"configuration.plugins.len\": 0,\n \"configuration.supergraph.listen.len\": 2,\n \"configuration.supergraph.listen.<redacted>\": 3,\n \"args.dev.true\": 1,\n \"configuration.headers.all.request.propagate.matching.<redacted>\": 1,\n \"configuration.homepage.enabled.false\": 1,\n \"configuration.supergraph.introspection.true\": 1,\n \"configuration.supergraph.path.<redacted>\": 1,\n \"configuration.include_subgraph_errors.all.true\": 1,\n \"args.config_path.<redacted>\": 1,\n \"configuration.sandbox.enabled.true\": 1,\n \"configuration.cors.allow_any_origin.true\": 1\n }\n}"}
6 mins again
@rezoled would it be ok to share that schema with us so we can look into it?
This is our company's IP, we can only share it under NDA agreement... Any chance of getting a custom binary with added logs? You can see that between these 2 log messages:
{"timestamp":"2023-08-04T10:11:24.917883Z","level":"TRACE","message":"recursion limit data","recursion_limit":"recursion limit: 4096, high: 0","target":"apollo_router::spec::schema"}
{"timestamp":"2023-08-04T10:17:05.099384Z","level":"TRACE","message":"recursion limit data","recursion_limit":"recursion limit: 4096, high: 0","target":"apollo_router::spec::schema"}
There are 6 minutes of log silence, when obviously something is begin done in the background
This is our company's IP, we can only share it under NDA agreement... Right, we can avoid this for now Any chance of getting a custom binary with added logs? maybe not right now, first we'd have to narrow it down
So looking at the logs:
{"timestamp":"2023-08-04T10:11:23.726974Z","level":"DEBUG","message":"state machine event: UpdateConfiguration(<redacted>), transitioned from: Startup to: Startup","target":"apollo_router::state_machine"}
{"timestamp":"2023-08-04T10:11:24.116748Z","level":"TRACE","message":"recursion limit data","recursion_limit":"recursion limit: 4096, high: 0","target":"apollo_router::spec::schema"}
{"timestamp":"2023-08-04T10:11:24.540073Z","level":"DEBUG","message":"A valid Apollo license was not detected. However, no restricted features are in use.","target":"apollo_router::state_machine"}
{"timestamp":"2023-08-04T10:11:24.917883Z","level":"TRACE","message":"recursion limit data","recursion_limit":"recursion limit: 4096, high: 0","target":"apollo_router::spec::schema"}
{"timestamp":"2023-08-04T10:17:05.099384Z","level":"TRACE","message":"recursion limit data","recursion_limit":"recursion limit: 4096, high: 0","target":"apollo_router::spec::schema"}
{"timestamp":"2023-08-04T10:17:05.441648Z","level":"DEBUG","message":"configuring Apollo metrics","target":"apollo_router::plugins::telemetry::metrics::apollo"}
The first mention of "recursion limit data" comes from here: https://github.com/apollographql/router/blob/37daffdac4ebcba46460d779aa04b097db58bdf9/apollo-router/src/state_machine.rs#L316-L318 There's no 6mn wait right after that. And this is all happening before loading the plugins, so most likely the wait happens in here:
So either when creating the query planner or when generating the API schema. Could you give us some info about the schema's shape? How many subgraphs, types, fields? Do you use Contracts?
oh, also, which router and gateway versions are you testing this with?
Sure. Subgraphs - 14 Not using contracts. Interfaces - 128 Types - 7,647 Inputs - 7,086 Enums - 702
I'm not entirely sure how to count fields, but there are more than 10K.
I downloaded Router from this command: curl -sSL https://router.apollo.dev/download/nix/latest | sh
, I'm assuming 1.26.
any idea?
that's a lot of input types, that might be the cause of inefficiencies there
We have 100s of entities which can be queried, paged, and filtered
could you try to add this to your configuration:
experimental_graphql_validation_mode: new
@rezoled friendly ping on this! Did the new validation mode help for your issue?
@rezoled I will close this now, feel free to reopen if you see it happen again
Describe the bug We have an almost 10MB supergraph schema in our application, running Router with minimal configuration takes 6 mins until it starts accepting requests.
To Reproduce Kind of difficult to reproduce without supplying my org's schema, and I can't do that.
Expected behavior The JS based federated gateway works much faster, I expected the rust based router to be faster than the JS implementation.
Output
Desktop (please complete the following information):
Additional context log: