dotnet / aspire

An opinionated, cloud ready stack for building observable, production ready, distributed applications in .NET
https://learn.microsoft.com/dotnet/aspire
MIT License

VS hangs until Docker is manually resumed from sleep state when launching AppHost project #2075

Open DamianEdwards opened 7 months ago

DamianEdwards commented 7 months ago

Docker for Windows is automatically configured to suspend the Docker engine after a period of no use. When launching an Aspire AppHost project from Visual Studio and selecting "OK" when prompted by VS to start Docker Desktop, VS indefinitely waits for Docker to "start" but Docker remains in the suspended state. Going to the Docker system tray icon and selecting the item to resume Docker engine actually resumes Docker and unsticks VS.

We should investigate if VS can alter how it starts Docker to include resuming it properly from its sleep state.

joperezr commented 7 months ago

@vijayrkn will take another look, check if this is a regression and then determine the milestone for this one.

bradygaster commented 7 months ago

also fyi @spboyer

savannahostrowski commented 7 months ago

@DamianEdwards You might be running into a bug on the Docker side. That said, can you share more about what doesn't work when the engine is in a suspended state to help us debug?

karolz-ms commented 7 months ago

@savannahostrowski there are two reasons/use cases why we need to ensure that Docker is installed and running.

First, if Docker is not installed and the Aspire orchestrator tries to run a program that uses containers without any additional checking by the tooling, the user will get an error to the effect of The term "docker" is not recognized as a name of a cmdlet, function, script file, or executable program. This is a pretty bad user experience.

Second, in the past, it was a somewhat regular occurrence that the machine went into hibernate mode and, when it came back, Docker appeared (or reported) to be running but was actually not responding at all: any CLI command would just hang indefinitely.

That is why VS needs a way to check that Docker is installed and responsive. @BillHiebert or @NCarlsonMSFT can probably provide details on the Docker liveness check that Visual Studio performs.

In the Aspire host code we run docker ps --latest --quiet and assume Docker is missing/unhealthy if this command does not return without error within a generous time limit (25 seconds currently). If this is not the recommended way to check Docker responsiveness, please advise what is. We need a command that interrogates the whole stack E2E, including the Docker daemon and VM, so please, no "clever", cached/fake results.
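The check described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual Aspire host code: the function name docker_responds and its parameters are hypothetical, and only the command line and the 25-second limit come from the comment above.

```python
import subprocess


def docker_responds(command=("docker", "ps", "--latest", "--quiet"),
                    timeout_seconds=25.0):
    """Return True only if the command exits with code 0 within the time limit.

    Hypothetical helper: the default command and timeout mirror the values
    quoted in the thread; everything else is an assumption for illustration.
    """
    try:
        completed = subprocess.run(
            list(command),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
        )
    except OSError:
        # The executable is missing or cannot be started:
        # treat Docker as not installed.
        return False
    except subprocess.TimeoutExpired:
        # The CLI hung, for example because the daemon or VM is unresponsive.
        return False
    return completed.returncode == 0
```

Calling docker_responds() with no arguments runs the probe with the defaults. Note that a probe like this only tells you the CLI answered in time; whether the answer came from the engine or from a cache is a separate question.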

ctalledo commented 7 months ago

Hi @karolz-ms,

This is Cesar from Docker; I helped a bit on the resource saver feature.

In Aspire host code we run docker ps --latest --quiet and assume Docker is missing/unhealthy if this command does not return without error within generous time limit (25 seconds currently)

I think that command is fine to verify that Docker is present and running.

We need a command that interrogates the whole stack E2E, including the Docker daemon and VM, so please, no "clever", cached/fake results.

If I may ask, why is docker ps --latest --quiet not sufficient?

For background, Docker Desktop's resource saver mode (which puts the Docker Desktop VM to sleep) is supposed to be transparent to users. That is, Docker automatically enters and exits the mode as needed, without any special actions from users.

It enters the mode when there are no containers running for a user-configurable period of time (default 5 minutes); and it exits the mode when a Docker command that requires action from the engine is executed (e.g., docker run, docker volume create, docker network create, etc). As an optimization, while in resource saver mode, Docker commands that query information from the engine generally do not trigger an exit from resource saver mode; rather the information is cached before entry into resource saver mode and returned from the cache. This avoids unnecessarily waking up the Docker Desktop VM.

But having said all of this, again, the entry/exit from resource saver mode is supposed to be transparent to users, so not sure why it needs to be explicitly woken up.

Hope that helps!

karolz-ms commented 7 months ago

@ctalledo thanks for the explanation, that helps quite a bit.

To answer your question: whatever command we use to assess Docker health, if that command succeeds, I would like to have high confidence that subsequent Docker CLI commands will succeed, or, if they fail, that they do so because the problem lies with the Aspire app. In other words, it is a Docker health check. If Aspire tooling can tell the difference and warn the user about Docker being in a bad state, our developers will not go on a wild goose chase wondering why their services do not work, only to eventually discover that (containerized) service dependencies did not initialize properly because they did not have the Docker infra ready to support them.

At the end of the day, if you tell me that Docker resource saver mode is an implementation detail that will not affect the result of the health check command, you are the expert and I am going to trust you. But the more of the Docker infra is excluded from the health check, the more nervous I will be about it.

Also, please note that in the Aspire use case, delaying the VM wake-up during the health check command is not going to improve performance, because if the "health check" succeeds, a series of docker run and docker events commands is sure to follow very soon after. But the engine does not know that, so I am thinking we might consider being more explicit about the health check command, especially if you share some of my reservations about a health check command using cached data. I can think of several options. For example, we could introduce a new docker system subcommand that represents a health check request.

Please let me know your thoughts on the above. Thanks again.

ctalledo commented 7 months ago

Hi @karolz-ms,

Thanks for the extra context, very helpful too.

So the official answer is "Docker Desktop resource saver mode is meant to be transparent for users, and therefore docker ps or any other command (except client-side-only commands such as docker context or docker login) should be sufficient to indicate that the engine is healthy".

Unofficially, however, if you want a command that checks that as much of Docker as possible is up (meaning that it causes Docker Desktop to exit resource saver mode and therefore wake up the VM and boot the engine inside), but without creating any resources (containers, networks, etc.), I am thinking docker system df would be a good alternative. There's no guarantee, however, that this command won't be cached in the future.

For example, we could introduce a new docker system subcommand that represents a health check request.

Yes, something we could consider; I'll bring that idea in-house to see what the team says.

Hope that helps. Thanks!

karolz-ms commented 7 months ago

Super, thank you @ctalledo ! We will experiment with docker system df and let you know what we learn.
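One way such an experiment could be run is to time both candidate commands side by side. The sketch below is only an illustration under stated assumptions: time_command, compare_probes and PROBES are hypothetical names, and the idea that a noticeably slower docker system df indicates the VM being woken is an inference from the discussion above, not documented Docker behavior.

```python
import subprocess
import time

# Candidate health-check commands mentioned in the thread.
PROBES = {
    "docker ps": ("docker", "ps", "--latest", "--quiet"),
    "docker system df": ("docker", "system", "df"),
}


def time_command(command, timeout_seconds=25.0):
    """Run a command and return (succeeded, elapsed_seconds)."""
    started = time.monotonic()
    try:
        completed = subprocess.run(
            list(command),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
        )
        succeeded = completed.returncode == 0
    except (OSError, subprocess.TimeoutExpired):
        # Missing executable or a hang both count as a failed probe.
        succeeded = False
    return succeeded, time.monotonic() - started


def compare_probes(probes, timeout_seconds=25.0):
    """Time each named probe once; return {name: (succeeded, elapsed_seconds)}."""
    return {
        name: time_command(command, timeout_seconds)
        for name, command in probes.items()
    }
```

Running compare_probes(PROBES) once while Docker Desktop is in resource saver mode and once while it is fully awake would give a rough comparison of how the two commands behave in each state.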

jjhayter commented 7 months ago

Honestly I don't agree that Visual Studio should be doing this at all. Docker Desktop made the feature, which I promptly disabled. This docker feature was one I never wanted. Just my 2 cents.

karolz-ms commented 6 months ago

@ctalledo we did test docker system df and it works well. We are going to use it for the purpose of verifying whether Docker is in good shape.

@DamianEdwards can you still reproduce the issue?

philliphoff commented 1 week ago

@karolz-ms Just ran into this myself (or, rather, just made the correlation between Aspire start errors and Docker's "resource saver" mode). Had me puzzled for a while.