I am also curious about what was going on with the thread pool starvation in my case. Am I right in saying that all requests on a fresh/unvalidated certificate ended up querying the cert status from the LetsEncrypt servers and starved the thread pool?
Or were the certificate checks made on a per-request basis?
> Am I right in saying that all requests on a fresh/unvalidated certificate ended up querying the cert status from the LetsEncrypt servers and starved the thread pool?
Even though the full chain is being requested, we still validate it, but this is what happens: the result is an X509Certificate2, a type which only grabs the first HTTPS cert, not the full chain (we don't have a way to represent the validated chain). All badness.
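To make the failure mode concrete, here is a minimal sketch of a workaround (my own illustration with a hypothetical WarmUpCertificateChain helper, not the code from the linked PRs): build and validate the chain once at startup so any intermediate/OCSP/CRL downloads happen there, rather than lazily during TLS handshakes on thread pool threads.

```csharp
// Sketch only: warm up the chain once at startup so network fetches for
// intermediates and revocation info do not happen per handshake under load.
using System;
using System.Security.Cryptography.X509Certificates;

static void WarmUpCertificateChain(X509Certificate2 serverCert)
{
    using var chain = new X509Chain();
    chain.ChainPolicy.RevocationMode = X509RevocationMode.Online;

    // Build() may hit the network (AIA/OCSP/CRL); doing it here, once,
    // keeps that cost out of the per-connection handshake path.
    if (!chain.Build(serverCert))
    {
        foreach (X509ChainStatus status in chain.ChainStatus)
        {
            Console.WriteLine($"Chain status: {status.Status} - {status.StatusInformation}");
        }
    }
}
```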
Thanks for the review, guys. Both PRs (https://github.com/natemcmaster/LetsEncrypt/pull/81 and https://github.com/ffMathy/FluffySpoon.AspNet.LetsEncrypt/pull/71) are merged and will be released soon. I am not aware of any other community projects aimed at ACME cert providers.
This whole story showed that SSL can be tricky, and assuming that SslStream internals will change in the future, this could lead to other similar issues.
Idk, I'd be happy to see support from the dotnet team on the LetsEncrypt side. Having SSL benchmarks with popular community libs is a great starting point, as @davidfowl suggested. Dotnet Foundation membership is better. Built-in support is best imo.
This issue could be closed, I believe.
Really interested to see if there will be any further steps and if I can help somehow.
This was a wild ride.
> This whole story showed that SSL can be tricky, and assuming that SslStream internals will change in the future, this could lead to other similar issues.
Yes, this great issue investigation spawned a set of work that I have tracked here: https://github.com/dotnet/aspnetcore/issues/21512. 5.0 should have this situation dramatically improved.
@bartonjs is our certificate crypto expert, and we're looking at ways to represent a "pre-validated" certificate chain for these scenarios to avoid this in the future.
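As a point of reference for what that representation became, here is a minimal sketch of the .NET 5 SslStreamCertificateContext API, using SslStream directly rather than Kestrel; the file name, password, and connection handling are placeholders of my own.

```csharp
// Sketch of the .NET 5 approach: resolve the certificate chain once, up front,
// and reuse it for every handshake instead of rebuilding it per connection.
using System.Net.Security;
using System.Security.Cryptography.X509Certificates;

var serverCert = new X509Certificate2("server.pfx", "placeholder-password");

// Chain building (including any network fetches it needs) happens here, once.
SslStreamCertificateContext certContext =
    SslStreamCertificateContext.Create(serverCert, additionalCertificates: null);

var serverOptions = new SslServerAuthenticationOptions
{
    // Handshakes reuse the pre-built chain.
    ServerCertificateContext = certContext
};

// For each accepted connection (networkStream and cancellationToken are placeholders):
// await using var ssl = new SslStream(networkStream);
// await ssl.AuthenticateAsServerAsync(serverOptions, cancellationToken);
```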
@davidfowl Do we believe this issue can be closed now for 5.0?
Sorta, but I need to merge this https://github.com/dotnet/aspnetcore/pull/24935 for it to be done done
Closing this, as the major bug has been addressed here. Will follow up with this change https://github.com/dotnet/aspnetcore/pull/24935 later.
Describe the bug
I am running Kestrel as an edge server on Digital Ocean (Docker 5:19.03.1~3 on Ubuntu 18.04) via docker-compose. The container is built with mcr.microsoft.com/dotnet/core/sdk:3.1 and mcr.microsoft.com/dotnet/core/aspnet:3.1.
I am using Compression and ResponseCaching middlewares in the request pipeline.
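For context, this is roughly how that pipeline is wired up; an assumed sketch of a typical Startup, not the actual code of the affected app.

```csharp
// Assumed shape of the relevant pipeline wiring (not the actual app code).
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        services.AddResponseCompression();
        services.AddResponseCaching();
    }

    public void Configure(IApplicationBuilder app)
    {
        app.UseResponseCompression();
        app.UseResponseCaching();
        // ...routing/endpoints and the rest of the pipeline
    }
}
```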
The issue did not appear before we started receiving an increased volume of traffic (e.g. 1 rps before vs. 8 rps after).
The deployment process loads the latest commit from the repo, builds the container on the host, and launches a new instance:
docker-compose -f prod.yml up -d --build
This process restarts the running Kestrel container, and after the restart the newly started instance does not handle any requests.
CPU is low during this period (normal avg 10%, broken avg 10%).
After a series of reboots, the server starts to handle requests again.
To Reproduce
I am able to consistently reproduce the issue with synthetic traffic on our staging env:
While the fake load is running, I shut the stack down and bring it up again. The repro rate is around 90%.
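The synthetic traffic is nothing fancy; roughly something like the following, left running while the stack is restarted. The URL, worker count, and delay are placeholders, not the exact load script.

```csharp
// Rough sketch of the fake load: a handful of workers hitting the staging
// endpoint concurrently, running until the process is killed.
using System;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

var client = new HttpClient();

var workers = Enumerable.Range(0, 8).Select(async _ =>
{
    while (true)
    {
        try
        {
            var response = await client.GetAsync("https://staging.example.com/");
            Console.WriteLine((int)response.StatusCode);
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.GetType().Name);
        }
        await Task.Delay(1000); // ~1 request/second per worker, ~8 rps total
    }
});

await Task.WhenAll(workers);
```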
Further technical details
dotnet --info:
.NET Core SDKs installed: No SDKs were found.
.NET Core runtimes installed:
  Microsoft.AspNetCore.App 3.1.1 [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 3.1.1 [/usr/share/dotnet/shared/Microsoft.NETCore.App]