microsoft / DockerTools

Tools For Docker, including Visual Studio Provisioning and Publishing
Other
175 stars 26 forks source link

Tooling generated dev cert needs a better cleanup story #362

Open adityamandaleeka opened 2 years ago

adityamandaleeka commented 2 years ago

The discussion at https://github.com/dotnet/aspnetcore/issues/44685 provides context.

Essentially, we (ASP.NET Core) have had multiple reports from users who faced an issue with certs after using the docker tooling. The tooling created a certificate for them and added a user secret (Kestrel:Certificates:Development:Path) which causes Kestrel to attempt to load the cert from the specified location.

The issue is that if the user later tries to run that app outside of Docker and the cert has become invalid, they have no way of knowing what to do. The dotnet dev-certs https --clean command doesn't know how to clean up these certs, so essentially they're stuck unless they search online and stumble upon that GitHub issue.

It would be great to improve this experience so people are not stuck. Is there any guidance for people who want to run apps which were formerly in Docker outside of Docker? We were also wondering if this user secret is needed at all.

adityamandaleeka commented 2 years ago

@NCarlsonMSFT

NCarlsonMSFT commented 1 year ago

@adityamandaleeka We've been thinking on how to do this differently w/o degrading the current user experience (for instance being able to use the cert with docker-compose up even outside our tools) and we landed on the smallest change being checking for an invalid certificate when considering if the file should be used. I have a proposed change, but I'm not sure how to test it, does this seem reasonable?

DamianEdwards commented 1 year ago

I think I just hit this issue with a slightly different flavor, in that I have a project that was launched in WSL from VS recently, which seemingly resulted in the addition of the Kestrel:Certificates:Development:Password config value in the project's user secrets file, which then changes Kestrel dev-cert loading logic even when trying to launch the project without WSL. In my case, Kestrel started picking an expired cert in the certificate store resulting in client failures. There is a valid cert right next to the expired cert, but for some reason Kestrel uses the expired cert when the password config value is present. Perhaps it's actually loading the cert from the .pfx file named after the project in %APPDATA%\ASP.NET\https?

NCarlsonMSFT commented 1 year ago

@DamianEdwards that is exactly what it is doing. The presence of the secret/config Kestrel:Certificates:Development:Password triggers attempting to use %APPDATA%\ASP.NET\https\projectname.pfx

If that certificate is not working, you can delete the certificate file or the secret, to switch back to using the global development certificate.

DamianEdwards commented 1 year ago

Thanks. Yeah we really need to look at improving this experience then as it seems far too easy to get into this state.

dazinator commented 1 year ago

I raised the original linked issue, and dismayed that it has just happened to me again. Had to re-look up this issue to figure out where the PFX was to delete it.

Not sure what the long term fix is, but perhaps having the docker tooling include an msbuild target that checks for and deletes / recreates this PFX file periodically (even if not launching in docker) based on its creation date. I think the PFX is created once to last for a long period of time, so perhaps you could add some logic to check the creation time of the file and delete it when its over X months old?

Here is mine that expired last week, creation date showing 07/07/2022

image

This won't address the issue of arguably the wrong cert being used by kestrel when running locally, but just an avenue sugested to limit some of the disruption whilst the full fix is ongoing.

NCarlsonMSFT commented 1 year ago

@dazinator The container and WSL launch targets should both be updating this certificate as needed. Are you seeing the expired error when lunching through one of those, or only when debugging on the local host?

dazinator commented 1 year ago

@dazinator The container and WSL launch targets should both be updating this certificate as needed. Are you seeing the expired error when lunching through one of those, or only when debugging on the local host?

Only when debugging on the local host. Only some members of our team launch in docker regularly. Yet due to the presence of the secret this cert is picked up by kestrel even when launching locally, as already outlined.

DamianEdwards commented 1 year ago

@adityamandaleeka have we considered @NCarlsonMSFT's proposed Kestrel change to skip invalid/expired dev certs in Kestrel?

adityamandaleeka commented 1 year ago

There was a brief discussion over email about this which didn't lead to a decision. However, @amcasey is going to be working on some related things in the cert loading space in the near future so we should get a better idea for what to do then.

krukowskid commented 9 months ago

The presence of the secret/config Kestrel:Certificates:Development:Password triggers attempting to use %APPDATA%\ASP.NET\https\projectname.pfx

@NCarlsonMSFT I'm having trouble using an autogenerated certificate and secret in my local Docker + VS development environment. The certificate is located at %APPDATA%\ASP.NET\https\projectname.pfx and the secret Kestrel:Certificates:Development:Password is stored in secrets.json. This setup only works when the ASPNETCORE_ENVIRONMENT variable is set to Development. When I change the environment name to local, I need to manually set the environment variable ASPNETCORE_Kestrel__Certificates__Development__Password for it to function. Is it possible to use this certificate without copy-pasting password to docker variables?

NCarlsonMSFT commented 9 months ago

@krukowskid this is an ASP.NET behavior. There are a number of local development helpers that are keyed off of that setting. In this case you're missing the User Secrets provider. You can add it for your environment by updating your Main method:

// Add services to the container.
builder.Services.AddControllersWithViews();
if (builder.Environment.EnvironmentName == "local")
{
    builder.Configuration.AddUserSecrets(Assembly.GetExecutingAssembly());
}

var app = builder.Build();
dazinator commented 9 months ago

There is a DOTNET_RUNNING_IN_CONTAINER env var set when running in docker. It seems like it should be possible for kestrel to use this to infer whether to load the pfx or not - instead of the presence of the user secret alone.

https://learn.microsoft.com/en-us/dotnet/core/tools/dotnet-environment-variables#dotnet_running_in_container-and-dotnet_running_in_containers

MarkStega commented 9 months ago

I just ran into this issue again https://github.com/dotnet/aspnetcore/issues/44685 and fortunately I remembered that I had an issue in aspnetcore. I certainly didn't remember the solution. It is very annoying that the tooling remains so unfriendly.

rjgotten commented 6 months ago

@NCarlsonMSFT: The container and WSL launch targets should both be updating this certificate as needed. Are you seeing the expired error when lunching through one of those, or only when debugging on the local host?

Should - but actually doesn't always. Something (don't know what) changed in the past months whereby the tooling now no longer creates/updates certificates when you debug off of a docker compose file.

Ran across this just today with a composition of many (15+ different services). The ( super annoying ) workaround to arrive at a working composed solution, was to first start each individual project separately with a basic single-project 'Docker run' target, as this luckily does still create/update certs.

NCarlsonMSFT commented 6 months ago

@rjgotten sorry for the delay in replying. Which version of VS are you using? I just tested with an old cert in 17.9.6 and both the single compose tools regenerated the certificates as expected.

rjgotten commented 6 months ago

@NCarlsonMSFT This was with v17.9.3 or v17.9.4 of VS 2022. Not entirely sure, and can't check as I updated to the latest in the mean time.

I'm going to try to set aside a moment somewhere to delete one of the certs for the solution in question where it went wrong, and see if it regenerates again for me now.

rjgotten commented 5 months ago

@NCarlsonMSFT Tried this again with VS 2022 v17.10.1 Still fails in the same way:

  1. Running the overarching Docker compose project for the solution failed to deploy the dev certificate for a microservice newly added to the solution.
  2. Running the newly added project under Docker directly, generates the certificate.
  3. Running the overarching Docker compose project works, until the certificate expires.
EklipZgit commented 6 days ago

There is a DOTNET_RUNNING_IN_CONTAINER env var set when running in docker. It seems like it should be possible for kestrel to use this to infer whether to load the pfx or not - instead of the presence of the user secret alone.

https://learn.microsoft.com/en-us/dotnet/core/tools/dotnet-environment-variables#dotnet_running_in_container-and-dotnet_running_in_containers

This seems like a good idea. It seems pretty wild that someone can click 'run in docker' just out of curiosity in visual studio a year ago, and then run in standard kestrel mode for the next year and just start getting inexplicable cert errors for certs that don't exist in the registry, with no obvious runtime logs about loading it, all based on a magic secrets file we don't know exists. Horrible experience, wasted like 2 hours of 2 devs time today.

My thoughts:

I assume DockerTools isn't the right spot for 90% of the suggestions, but every other github issue in those other areas around this that I found have been closed with a link to this issue as where it should be "solved", so seems pertinent to point out here so maybe people will see that just blaming DockerTools for 2+ years around this and failing to add any clear warnings to the dotnet tooling / Kestrel runtime were a poor choice.