wundergraph / cosmo

The open-source solution to building, maintaining, and collaborating on GraphQL Federation at Scale. The alternative to Apollo Studio and GraphOS.
https://cosmo-docs.wundergraph.com/
Apache License 2.0

Running the Router without any other components #243

Closed Tehnix closed 11 months ago

Tehnix commented 1 year ago

Component(s)

router

Is your feature request related to a problem? Please describe.

I'm currently building up an entirely serverless GraphQL stack, and am looking for a Federation-compatible router that fits with this infrastructure.

I'd love to give Cosmo Router a try, but it seems to be inextricably coupled with the other Cosmo components, making it impossible to run as a standalone thing (I might be wrong here). E.g. it needs to have a Control Plane to function, which would require both more services running (Control Plane doesn't seem like a thing you'd want to run in Lambda) and also means there is more stuff happening on the Router startup (registering to the Control Plane etc I guess).

I've done some examples here of various other Routers in AWS Lambda (Apollo Router needed some wrangling), and the story is generally not great when it comes to Cold Starts (1-1.5 seconds in general).

Describe the solution you'd like

I would love to be able to use the Cosmo Router completely on its own. I can already create the composed Supergraph Schema and provide that statically to the Router, but it still requires other services and e.g. a GRAPH_API_TOKEN.

Since it's a binary built with Go it would fit great in a serverless environment, with fast startup times and good performance.

Ideally, it would also expose a Lambda handler, but I don't mind getting creative and either taking a similar approach as with the Apollo Router in Lambda PoC or something else.

Describe alternatives you've considered

Alternative would be to not run it in a serverless environment, but that's a key part of the goal so not really an alternative 🤔

Additional context

N/A

github-actions[bot] commented 1 year ago

WunderGraph commits fully to Open Source and we want to make sure that we can help you as fast as possible. The roadmap is driven by our customers and we have to prioritize issues that are important to them. You can influence the priority by becoming a customer. Please contact us here.

Slickstef11 commented 12 months ago

Hey @Tehnix just sent you over an email. would love to hop on a call and chat.

Akos-T commented 12 months ago

Hi,

Could you please share the outcome of that discussion here when it happens?

As a newcomer, the project seems really interesting, but also super complicated to start with. For example, if I'm just doing some experimentation and don't need observability, I don't want to start up ClickHouse, telemetry collectors, etc. I just want to compose the supergraph from my subgraphs and make them available through the gateway (router), either by using a package in Go or by running a binary/Docker container with a configuration. Then, if the project gets bigger, incrementally add observability and other features.

Thank you!

Slickstef11 commented 12 months ago

Hey @Akos-T my apologies! I should put it here for everyone to see.

So you actually can run the router by itself without any of the other components.

See here: https://cosmo-docs.wundergraph.com/tutorial/mastering-local-development-for-graphql-federation

We also have this new zero to federation guide that could be of use: https://cosmo-docs.wundergraph.com/tutorial/from-zero-to-federation-in-5-steps-using-cosmo

Let me know how it goes.

Akos-T commented 11 months ago

Hi,

Looks promising, I missed those before. Thank you!

All the best,

Aenimus commented 11 months ago

@Tehnix @Akos-T Please kindly let me know how you get on so that I can close this if it is resolved by the documentation linked.

StarpTech commented 11 months ago

Hi @Akos-T

As a newcomer, the project seems really interesting, but also super complicated to start with. For example, if I'm just doing some experimentation and don't need observability, I don't want to start up ClickHouse, telemetry collectors, etc. I just want to compose the supergraph from my subgraphs and make them available through the gateway (router), either by using a package in Go or by running a binary/Docker container with a configuration. Then, if the project gets bigger, incrementally add observability and other features.

I understand. You can compose the router config as outlined before here or you can use our SaaS offering https://cosmo.wundergraph.com/login, which offers a generous free tier of 10 million requests per month. We would be happy to hear back from you regarding feedback on the onboarding experience.

Tehnix commented 11 months ago

@Aenimus it indeed worked, thanks for the help on this!

To test it out without needing too many changes I've currently got a PoC here which is a little Rust program that:

A better solution would probably be to build a custom router a bit in the same way, but starting up the server directly in the Go code and then use the Go Lambda runtime :)

I am able to get decent cold start times, although I'd still like to squeeze a bit more out of it:

| Measurement (ms) | 1024 MB |
|---|---|
| Average warm start response time | 20.8 ms |
| Average cold start response time | 639.4 ms |
| Fastest warm response time | 18.9 ms |
| Slowest warm response time | 23.9 ms |
| Fastest cold response time | 628 ms |
| Slowest cold response time | 1000 ms |

jensneuse commented 11 months ago

Hey @Tehnix, these stats don't sound too bad. That said, we've never tried to make Cosmo Router start fast and I could definitely see some potential improvements to pre-load some things, defer others, or maybe even move some heavy things from startup time to build time. However, this is currently not a focus of ours as usually, people keep the Router running. If you want to collaborate a bit on investigating cold start times and improve this further, please let me know. Are you going to add Cosmo Router to the results page where you currently have Apollo Gateway and Mesh?

Tehnix commented 11 months ago

@jensneuse

Are you going to add Cosmo Router to the results page where you currently have Apollo Gateway and Mesh?

Yup! I've just updated the benchmarks both with a comparison of various memory sizes in the intro and its own section. It would be my recommended option atm, given how few hacks I need for the performance it has.

The Lambda itself doesn't spend terribly long (200-300ms) waiting for the Cosmo Router to spin up, so the startup is actually quite fast already (I can of course always dream of more).

The size of the binary (~30 MB) does have an impact. An interesting thing I found in the Apollo variants was that there was a significant benefit from optimizing for size (~40MB) over speed (~70MB). Cosmo has a much better starting point for that.

That said, we've never tried to make Cosmo Router start fast and I could definitely see some potential improvements to pre-load some things, defer others, or maybe even move some heavy things from startup time to build time. [..] If you want to collaborate a bit on investigating cold start times and improve this further, please let me know.

My plan was to try out building a custom router as the next PoC, which will allow me to rely on the AWS Lambda Go Runtime instead of the generic AL2023 one, as well as avoid the hacks I'm doing of starting the router using the binary.

I was curious if you export any way to feed a GraphQL request directly into the Router and get the response? That is, calling it as a function instead of spinning up a whole server that one communicates with

For the Apollo variant I abused their TestHarness which is used in their own tests. A bit hacky, but everything will be hacky when it's not natively supported.

However, this is currently not a focus of ours as usually, people keep the Router running.

Totally understandable! I do imagine I'm walking a bit of an uncommon path, so I don't mind jumping through hoops a bit 😀

jensneuse commented 11 months ago

@Tehnix I think what might improve things is if we used the Lambda Go SDK and pulled out some of the internals of Cosmo Router and directly attached them to a Lambda event handler in the most optimal way. Currently, there's probably a lot of overhead that's not beneficial in a serverless environment, since the Router was not designed to be cold-start friendly. That said, it's not currently a priority for us. ;)

Tehnix commented 11 months ago

Just wanted to update in this thread as well: I ported my Rust PoC to a Go solution that pulls in Cosmo Router as a library and keeps everything in Go https://github.com/codetalkio/apollo-router-lambda/tree/main/lambda-cosmo-custom

Super great results! Cold Starts are now at 300ms - 580ms which is much better than I had hoped for. For reference, the fastest a Cold Start will ever be is ~200ms plus/minus a bit, so this is not far off :)

Overall measurements (using ARM Lambdas):

| Measurement (ms) | 512 MB Memory | 1024 MB Memory | 2048 MB Memory |
|---|---|---|---|
| Average warm start response time | 10.7 ms | 10 ms | 9.8 ms |
| Average cold start response time | 442.9 ms | 464.7 ms | 427.7 ms |
| Fastest warm response time | 6.9 ms | 7.9 ms | 7.9 ms |
| Slowest warm response time | 19 ms | 11.9 ms | 10.9 ms |
| Fastest cold response time | 328 ms | 328 ms | 328 ms |
| Slowest cold response time | 581 ms | 531 ms | 505 ms |