Azure / azure-cosmos-dotnet-v3

.NET SDK for Azure Cosmos DB for the core SQL API
MIT License
744 stars 496 forks source link

Add support for ignoring SSL certificate for emulator connections #4222

Closed mitchdenny closed 10 months ago

mitchdenny commented 11 months ago

Is your feature request related to a problem? Please describe. Currently when using the Cosmos DB emulator it is necessary to either download the certificate from the emulator using a well known endpoint and install it into the certificate store, or override the HttpClient to ignore the certificate. I am proposing that the format of the Cosmos DB client connection string be updated to support ignoring the certificate. This is a similar pattern to what SQL Server follows.

Describe the solution you'd like From a users perspective, ignoring the Cosmos DB emulator certificate should be as simple as appending:

AccountEndpoint=https://localhost:8081/;AccountKey=C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw==;IgnoreEndpointCertificate=true;

The default value would be false. Note that the implementation should not restrict ignoring the certificate to the default emulator endpoint because some tools such as .NET Aspire may randomly assign a proxy port in front of the emulator.

Describe alternatives you've considered The current recommendations (installing the certificate) or having developers write code to ignore the SSL check is cumbersome and easy to mess up.

Additional context This request is coming from the .NET Aspire team. We've been working with @Pilchie on adding Cosmos support to Aspire and this was one of the stumbling blocks that we came up against.

josephaw1022 commented 11 months ago

πŸ™ŒπŸ™Œ

bartelink commented 11 months ago

This would be extremely useful. Some extensions/notes:

i.e. the ideal implementation should allow me to:


I implemented something in this direction in https://github.com/jet/equinox/pull/431 (but I did not use it as it hangs during CreateDatabaseIfNotExistAsync in my current implementation in Direct mode, whereas that works OK if I trust the cert - its very important to me that I don't use Gateway mode in all my testing and then switch to Direct when I actually want to run things for me, so I opted to stay with actually trusting the cert via the MacOS Keychain; would love a hint as to what I did wrong, but ideally I'd just use this work when it ships, if it can tick the above boxes...)

fakhrulhilal commented 10 months ago

This will be extremely useful. I can tell you why. I know I can import the self signed cert into my local development machine. But it will be a problem for other environment, let's say a docker container. Yes, I can ignore the certificate warning in the code. But I will do that for Debug mode only. For building docker container, I use Release mode, which I don't prefer for ignoring certificate warning. By supporting the configuration, then I can separate the connection string between my local machine, docker container, and production (which I have been doing at the moment). Currently, I have to use dotnet run for my development env, regardless the other services use docker container. The only reason is avoiding self-sign certificate problem. Because there is no option to override the Azure Cosmos certificate. I tried before but I just gave up.

bartelink commented 10 months ago

@mitchdenny @sourabh1007 would you see any value in also implementing a AccountKey=EmulatorKey expansion to make the string human readable too?

Also can you confirm the aim is to support Direct and Gateway modes equally?

sourabh1007 commented 10 months ago

@bartelink What do you mean by human readable string here? Connection String would be same with one more extra information (which is human readable). Yes, it will support Direct and Gateway mode both, until unless you are using own HttpClientFactory instance because if custom httpclient is used, then we won't have control over it.

bartelink commented 10 months ago

@sourabh1007 As in the linked comment from me I'm saying it would be nice to do one extra thing to make the connection string more legible:

When Parsing the AccountKey, replace:

AccountKey=EmulatorKey

with:

AccountKey=C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw==

This would mean that the string

AccountEndpoint=https://localhost:8081;AccountKey=EmulatorKey;IgnoreEndpointCertificate=true

should work correctly for a default configured emulator

While I get that the AccountKey is not "not human readable", having a shorter thing that's easy to verify as being correct, and easy to type in a script is helpful compared to having to constantly look up and copy the well known key.

sourabh1007 commented 10 months ago

We don't want to tie up SDK with emulator with such conditions. For SDK, emulator is just another cosmos db account which is hosted on local (or docker container).

What if tomorrow emulator key is changed? all the old versions of SDK will stop supporting emulator. SDK is not supposed to maintain the key (even the emulator).

@kirankumarkolli @ealsur

bartelink commented 10 months ago

Yeah, I can definitely see the negatives in taking a dependency on that from the perspective of the SDK team - and I appreciate that there's been effort to make sure there's no emulator-specific hacks in the SDK.

But... the planet does have a lot of copy and pastes of the existing ones from the docs. I'm perpetually having to chase about to find search criteria to hit the page that lists it.

Also, most secret scanners have exclusions for the current key being committed to source control - minting a new one is definitely not something that the Emulator team would ever do lightly.

The bottom line is that the string you've added support for will inevitably by accompanied by the other one, and if it can be less noisy, that's a in for everyone but you...

kirankumarkolli commented 10 months ago

@bartelink it's a good suggestion. From scope perspective let's please track this as a separate issue.

A generic use-case with emulator is connectionstring is copied from emulator data-explorer which is fully self-contained. In-cases of containers that might not be true and this will help in such. @bartelink is our understanding of scenario reasonable?

Some possibilities are

EmulatorAccountKey property seems like a low handling simpler solution. Thoughts?

bartelink commented 10 months ago

So the ask is for (just as with the new flag) some extension to how the connection string parsing works on a global basis such that the key literal does not have to be included. Examples of things that would work:

While it would not be useless to have a const with the value C2y..., it would just mean that you'd use it in some parsing function external to the SDK that everyone everywhere needs to remember to call (and, for some libraries, they may embed that logic beyond reach)

bartelink commented 10 months ago

so @kirankumarkolli @sourabh1007 @ealsur any chance of the aforementioned happening as a fast follow? It's now or never ;)

kirankumarkolli commented 10 months ago

Builder from my earlier statement is a little glorified view of below construct.

   public class EnumatorEndpoint {
        public static readonly Default = "..."; // Default don't need DisableServerCertificateValidation 

        public Endpoint { get; set; }
        public DisableServerCertificateValidation {get; set;} = false;
   }

  public class CosmosClient {
      public CosmosClient(EnumatorEndpoint, CosmosClientOptions);
  }

@bartelink thoughts?

bartelink commented 10 months ago

I'm a bit confused...

My aim is to have one set of wiring in my codebase that takes a connection string; this seems to be going in the opposite direction - more APIs, more paths in my code.

In other words,iIn the same way that #4251 means I can stuff IgnoreEndpointCertificate=true into it in my keyvault and

  1. Change zero code
  2. Introduce zero risk of my prod code attempting to connect to the wrong thing
  3. remove https://github.com/jet/equinox/pull/443/files#diff-fdcf4bd1978f5948b3d65f335996a832e87dee220916633755c4ba62230e6741L1205

    I want to be able to take out the AccountKey=C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw== and replace it with AccountKey=EmulatorKey so that I:

    1. change zero code or have two paths
    2. don't have to look up the magic string in some docs page, or search for that page so I can figure out that the magic url is https://localhost:8081/_explorer/index.html
    3. don't trigger securty scanners

This will mean that:

If you did put a const in the SDK, I probably would write some code to do that. BUT I'd have to be sure that every place I pass a connection string that I pass it through there

In other words, a programmatic API is icing on a cake, and I'm definitel not looking for a third ctor and/or way of making a CosmosClient

bartelink commented 10 months ago

Putting it slightly differently, I'd like to achieve the complete absence of anything like:

And be able to say:


We use the new connection string features in Microsoft.Azure.Cosmos 3.38 so using the emulator instead is trivial and failsafe:

  1. to use a normal cloud instance, `export EQUINOX_COSMOS_CONNECTION =
  2. if you want to run against a local emulator
    • docker run or docker compose up the emulator
    • export EQUINOX_COSMOS_CONNECTION="AccountEndpoint=https://localhost:8081;AccountKey=EmulatorKey;IgnoreEndpointCertificate=true"
    • no code changes
    • if you have a different port, change that
    • If you have cert trust established some other way, remove the IgnoreEndpointCertificate bit

As opposed to more code outside the SDK that needs to be traversed, maintained and debugged

kirankumarkolli commented 10 months ago

@bartelink how is the emulator endpoint discovered?

Almost all places where its documented/exposed (docs, emulator data-explorer) already cover full connection string. So enriching those section with new connection-string with DisableServerCertificateValidation=true will do?

bartelink commented 10 months ago

Yes, enrich with that.

But also put a small piece of logic that effectively does a var connectionString = connectionString.Replace("AccountKey=EmulatorKey","AccountKey=C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw==") such that all usages get substituted and one would never have any code in your app or your scripts specific to emulator scenarios

And you never need to scroll or check because AccountEndpoint=https://localhost:8081;AccountKey=EmulatorKey;IgnoreEndpointCertificate=true

as opposed to searching up

https://stackoverflow.com/questions/47814786/what-is-the-username-and-password-for-the-cosmos-db-emulator or trawling through https://learn.microsoft.com/en-us/azure/cosmos-db/how-to-develop-emulator

Yes, the key might change. But if it does, thousands of CI and local test scripts around the planet need the key changed atm

And the security scanners will flag it for months/years

ealsur commented 10 months ago

And the security scanners will flag it for months/years

Can you clarify which security scanners are flagging the Emulator Key? Are these GitHub scanners?

bartelink commented 10 months ago

Nothing is flagging it because it has been a stable value since at least 2017

But I have seen both commercial and non-commercial tools flagging it (they typically use regexes)

Because having to put that key into code is bad and wrong and avoidable, but thankfully that's about to be removed

In the same way that putting in code to bypass cert trust depending on circumstances is bad and wrong and avoidable, and will hopefully meet the same fate

It should never have been normalized that there are scripts and blog posts with that key and that one thinks you put it into a .cs file, or YAML

Yes, despite the fact that the default key is logically owned/defined by the emulator, and can theoretically be changed at any time

ealsur commented 10 months ago

Because having to put that key into code is bad and wrong and avoidable, but thankfully that's about to be removed

Might be a dumb question but on which case does the Emulator Key need to be baked into code? It shouldn't be used for production and for test pipelines, it can be an environment setting because you probably don't want it in your test code either. If the key is replaced with the new special value, that would be the case too?

bartelink commented 10 months ago

For local test scripting - being able to pull down a repo and run tests offline

Obviously emulators are not the single answer to the world's problems, but people use them

When you run stuff under test, you want to arbitrarily be able to point it at the associated store for a relevant environment:

But when people do want to do a baseline run with the emulator, they do not want to (I've seen people doing all of these):

You want to have your local test script be able to just run after a git clone and spinning up the environment.

If it can be a 92 char string that lives in a script file, it's just more usable, and more people will have a simpler onramp experience, without any risk of special case code anywhere.

Having to have a long string with a well known key in it is another piece which would be nicer if it was not there

Yes, I'm over explaining why I want a 169 char string with a secret to become a 92 char string without one.

And yes, I'd like it to be even shorter, but those 77 chars and a non-secret-secret are needless friction IME.

I guess another way to handle this is for the emulator to take a connection string without an account key will the SDK let that through?

Ultimately the effect on developer experience is similar to https://github.com/jet/equinox/pull/426 / https://github.com/microsoft/mssql-docker/issues/2#issuecomment-1059819719

  1. other emulators and/or images have smooth devex
  2. MS ones have hated workarounds and a lack of joined up thinking. And people passing the buck between teams.

Yes I am not sure that a hard coded string in this SDK is good news

But the blog posts and making excuses documentation with avoidable steps is in my world and in my repo atm.

TL;DR I'm looking for the SEP field to shift inside MSFT, despite the fact that there's a plausible argument for the world to continue to bear it, in the same way that the world has yet to end because this hack and equivalents exist for SQL Server https://github.com/jet/equinox/pull/426/files#diff-e45e45baeda1c1e73482975a664062aa56f20c03dd9d64a827aba57775bed0d3R27-R33

bartelink commented 10 months ago

To counter the "well a hardcoded string in our code is just bad news" that I'd respond with if I were in your shoes:

The emulator team could also have provided a non HTTPS mode to solve the problem in the OP Why should every SDK need to have magic logic to enable a suitably hacked connection string bypass security checks that exist for good reasons? And have that baked in and requiring changing and updating versions as HTTPClient and/or the emulator changes?

I don't know the answers, but I am happy with the tradeoffs and am looking forward to deleting code from my build and test scripts.

And if the emulator team eventually affords us a better answer, I'm happy to avoid using the magic make it insecure flag too

(But my top ask from them is for me to be able to have an M2 Mac instead of the nice warm Intel one on my lap at this instant)

ealsur commented 10 months ago

@bartelink Thanks a lot for the huge insightful answer and point of view!

BTW, loved the SEP field link.

bartelink commented 10 months ago

I should go without saying that while there's scope for smoothing the path here, I accept that it's not hard to argue it's misfeature and am not pretending that this is a no brainer

Ultimately he maintenance cost is borne by you, so I can definitely see why one would argue against it given it's (arguably) Just Fine As It Is.

But the effect from this side of the fence, as a .NET Cosmos customer is the same, regardless of whether the change is in he SDK or the emulator - I (and others) can remove junk DNA from scripts and docs if something changes somewhere. on your side...