randy3k / shiny-cloudrun-demo

Running Shiny app on Google Cloud Run
35 stars 13 forks

does this work with horizontal scaling? #1

Open maxheld83 opened 4 years ago

maxheld83 commented 4 years ago

great boilerplate, thanks so much 🙏!

Can you confirm whether this setup works with horizontal scaling (i.e. more than one container running the app at the same time)? Are requests from the same session reliably routed to the same R process in the same container?

I ask because @jchen5 brought up the issue of statefulness over in https://github.com/rstudio/shiny/issues/2455:

Wait, is GCR akin to Amazon's Lambda? If so, I imagine this won't be a good fit for Shiny, no matter what software you put in the middle. These services are designed for stateless HTTP servers, and Shiny is inherently stateful. I bet you'd end up with errors under load as requests that can only be served out of container A (where its session lives in memory) end up being routed to container B instead

(@MarkEdmondson1234 seems to wonder about the same thing in https://github.com/MarkEdmondson1234/googleCloudRunner/issues/35)

randy3k commented 4 years ago

Actually, I didn't test it. I assumed it wouldn't work, which is why I added `--max-instances 1`.
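For reference, a deployment command along these lines (a sketch; the service name, image path, and region are placeholder values, not taken from this repo) would pin the app to a single container:

```shell
# Deploy the image and cap autoscaling at one container instance,
# so every request for the app is served by the same R process.
# "shiny-cloudrun" and "us-central1" are illustrative placeholders.
gcloud run deploy shiny-cloudrun \
  --image gcr.io/$PROJECT_ID/shiny-cloudrun \
  --region us-central1 \
  --platform managed \
  --max-instances 1
```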

maxheld83 commented 4 years ago

thanks!

I found two relevant sections in the GCR container runtime contract:

Computation is scoped to a request

You should only expect to be able to do computation within the scope of a request: a container instance does not have any CPU available if it is not processing a request.

This would seem to render async/background computations in Shiny inoperative, because they would not be allocated enough vCPU. I am guessing that even with `--max-instances 1` this could affect background jobs.

Stateless services

Your service should be stateless: you should not rely on the state of a service across requests because a container instance can be started or stopped at any time.

This would seem to confirm @jchen5's concern.

randy3k commented 4 years ago

I might be wrong, but I think that what a Shiny app does 99% of the time is process requests. Does Shiny run any background jobs when there is no active user session?

Anyway, I think this deployment option is only good for simple apps with small numbers of users.

MarkEdmondson1234 commented 4 years ago

Thanks for the ping, I will test this Docker config in googleCloudRunner. From what I understand, as long as a user stays within the same container, their Shiny session will work OK; as new connections come in from new users, it would be best to launch a new container via `--concurrency 1`. The Cloud Run timeouts were also recently increased from 15 minutes to 60+ minutes, which is better for those Shiny sessions.

I run similar Shiny containers on Kubernetes, and that has no problems with sessions.

On Cloud Run you can also assign a larger instance, which may let you have more connections; up to 80 are supported per container. I'd like to load test this with this image's repo. The tests I ran before with googleAuthR's default container didn't work; perhaps it needed one of the Shiny services that is shut off in this example, which would be annoying, since only a subset of Shiny apps would work on Cloud Run.

None of those sessions should share state (writing to a backend etc.), but I would trigger other stateless operations such as a Cloud Build call or similar if I needed to.
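A concurrency-limited deployment along those lines might look like this (a sketch; the service name, image path, and timeout value are placeholders):

```shell
# Sketch: one user session per container instance, with a longer
# request timeout to keep the Shiny connection alive.
gcloud run deploy shiny-cloudrun \
  --image gcr.io/$PROJECT_ID/shiny-cloudrun \
  --platform managed \
  --concurrency 1 \
  --timeout 3600
```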

maxheld83 commented 4 years ago

@MarkEdmondson1234 what are your thoughts on this part of the container runtime contract:

Computation is scoped to a request

You should only expect to be able to do computation within the scope of a request: a container instance does not have any CPU available if it is not processing a request.

I was worried above that this might choke off async/background computations in Shiny, since there wouldn't be a request being fulfilled at that time. @randy3k sorry I was unclear above; an example would be updating a moderately expensive model in the Shiny runtime via promises.
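For concreteness, the kind of background work I mean might look like this in a Shiny server function (a hypothetical sketch using the future/promises packages; `fit_model()` stands in for the expensive step and is not a real function):

```r
library(shiny)
library(promises)
library(future)
plan(multisession)

server <- function(input, output, session) {
  output$summary <- renderPrint({
    # The expensive refit runs in a background R process. If Cloud Run
    # throttles CPU once the triggering request has been answered, this
    # future may be starved before it ever resolves.
    future({ fit_model(input$n) }) %...>% summary()
  })
}
```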

maxheld83 commented 4 years ago

anyway, I really like Google Cloud Run for everything else R, but it does seem like it might cause more problems than it solves for mainstream Shiny use (as @randy3k warned).

I wish there were a fully-managed solution that scaled down to 0 instances. My Shiny apps often do not see any use for weeks on end.

randy3k commented 4 years ago

A Shiny web app constantly exchanges requests with the server as long as the session is active, so I am not worried about the contract.

My shiny apps often do not see any use for weeks on end.

Then you don't need horizontal scaling.

randy3k commented 4 years ago

A bigger concern is whether google routes requests to the same container for the same user if multiple containers are spawned. For this, I have no answer.

roomjan commented 4 years ago

Then you don't need horizontal scaling.

I don't agree entirely. This scenario could benefit from scaling to 0, because that would save costs during these zero-activity time windows. The big question is: is Shiny stateful? If so, can we move that state to an external volume/datastore? If not, then GCR is simply not the right fit for this.

randy3k commented 4 years ago

Then you don't need horizontal scaling.

I don't agree entirely. This scenario could benefit from scaling to 0, because that would save costs during these zero-activity time windows. The big question is: is Shiny stateful? If so, can we move that state to an external volume/datastore? If not, then GCR is simply not the right fit for this.

We might have different definitions of horizontal scaling: I was referring to running two or more instances. Of course you are right about scaling to 0; it is also the reason I used Google Cloud Run.

MarkEdmondson1234 commented 4 years ago

I tried it with more than one instance and it errored, so for Shiny I think we are limited to 1 max instance on Cloud Run, but with up to 80 concurrent connections. It does scale to 0 though, and I think it suits a lot of use cases within those limits.

But you could also launch lots of Cloud Run instances with different names and place some kind of randomised load balancer in front, so you get hacky horizontal scaling that way.