openservicebrokerapi / servicebroker

Open Service Broker API Specification
https://openservicebrokerapi.org/
Apache License 2.0

Differentiate secret vs. non-secret fields in binding credentials #117

Closed bmelville closed 5 years ago

bmelville commented 7 years ago

We have come across this use-case several times in various places. Given the output of a Bind() operation, we would like to be able to understand which fields within the credential are secret and must be treated so, and which are non-secret and may be treated less securely.

For example, if the output of a binding is:

credential:
  username: admin
  password: mypassword
  hostname: localhost
  port: 8080

Here, username and password may be treated as secrets in a UI or in credential handling by the controller, while hostname and port may be treated as general configuration data.

Proposal is introduced as part of the proposal for #116: https://docs.google.com/document/d/1JbsJgqgNtqthcfYwK_KbS6C8sjElrZNgoLhu40dUPAs/edit?usp=sharing

avade commented 7 years ago

We are looking at how to secure the credential object as well. I hadn't thought of the UI implications.

I think our plan is to share the design with the CF community before the holidays. I will make sure to post it here as well.

Is this issue more of a use case for #116? Do you see another section in the binding response? As a rough example:

credential:
  username: admin
  password: mypassword
configuration:
  hostname: localhost
  port: 8080

bmelville commented 7 years ago

I was thinking of it as a feature of #116. I think we will also want to be able to hint at secret input parameters (e.g., what if an input to binding is a password and the UI should know it needs to "dot" it out while I'm typing it), so this could easily be an extension to all of our schema support.

e.g.,

plans:
  schemas:
    service_bindings:
      create:
        credential:
          properties:
            password:
              type: string
              x-servicebroker-secret: true
            hostname:
              type: string

Does this seem like a sane approach? I could also see benefits to separating as fields in the API, where each field has more explicit security semantics, e.g., one is encrypted or secured somehow, the other not.
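
To make the schema idea concrete, here is a minimal sketch of how a platform might use such an annotation to split a binding's credentials. It assumes the `x-servicebroker-secret` keyword proposed above sits directly on each property. `split_credentials` is an illustrative name, not part of any spec, and treating fields with no schema entry as secret is an assumption made here to fail safe.

```python
# Sketch only: "x-servicebroker-secret" is the keyword proposed in this
# thread, not part of the published spec; split_credentials is hypothetical.

SECRET_KEYWORD = "x-servicebroker-secret"


def split_credentials(schema, credentials):
    """Partition credentials into (secret, non_secret) dicts.

    A field is non-secret only when the schema lists it and does not
    mark it with SECRET_KEYWORD; fields missing from the schema are
    treated as secret (an assumption, chosen to fail safe).
    """
    properties = schema.get("properties", {})
    secret, non_secret = {}, {}
    for name, value in credentials.items():
        prop = properties.get(name)
        if prop is None or prop.get(SECRET_KEYWORD, False):
            secret[name] = value
        else:
            non_secret[name] = value
    return secret, non_secret


schema = {
    "properties": {
        "password": {"type": "string", SECRET_KEYWORD: True},
        "hostname": {"type": "string"},
    }
}
credentials = {"password": "mypassword", "hostname": "localhost", "token": "abc"}

secret, non_secret = split_credentials(schema, credentials)
print(secret)
print(non_secret)
```

With this shape the platform never has to guess from field names. The broker author declares intent once in the catalog, and the same annotation could drive both storage and UI masking.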

mayrstefan commented 7 years ago

How about parameters which combine secret and non-secret information? Think of credentials.uri, as in the postgres example in the CF docs. Masking only parts of the URI seems quite difficult. As a result, the whole value should be masked from a security point of view. Since a URI doesn't have to include secrets in general, this is rather confusing for users.

rcernich commented 7 years ago

Is the URI particular to the consuming application? Would it be better to provide the raw property values (e.g. host, port, user, pwd, db)? Also, the set of properties is probably consistent across an entire class of technologies (e.g. the set above is probably applicable to all DBs), which would allow the creation of one schema for an entire class of services.

mayrstefan commented 7 years ago

I would prefer secrets not being part of URIs or URLs. But I know the URI is required for some magic of the java-buildpack in Cloud Foundry.

I assume the java-buildpack/Spring Boot stuff is quite popular. Fiddling around with this field might cause some pain on either side. That's why I am trying to mention this as early as possible.

youngm commented 7 years ago

For all of our "Existing" service brokers we've found it easier to break up the different components of the connection string and supply them as parameters, and then re-assemble it for the users in the credentials supplied at bind time. In the case of an Oracle service we provide a "JDBCURL" for convenience and leave the individual components for others.

rcernich commented 7 years ago

What about other technologies, e.g. node, python, etc.? Presumably, these would use different forms for configuring a datasource.

mayrstefan commented 7 years ago

A quick look into RFC 3986:

Use of the format "user:password" in the userinfo field is deprecated. Applications should not render as clear text any data after the first colon (":") character found within a userinfo subcomponent unless the data after the colon is the empty string (indicating no password).

Maybe the Spring Cloud Config team can be encouraged to recognize username and password fields by default instead of relying on this information in the URI.
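
For URIs that do carry a password in userinfo, the RFC rule quoted above can be sketched as follows. This is only an illustration using Python's standard `urllib.parse`. `mask_userinfo` and the sample URI are hypothetical, and it only handles URIs with a `//authority` part, so Oracle-style JDBC URLs pass through unchanged.

```python
from urllib.parse import urlsplit, urlunsplit


def mask_userinfo(uri, mask="*****"):
    """Return uri with a non-empty userinfo password replaced by mask.

    Follows the RFC 3986 advice: nothing after the first ":" in the
    userinfo is rendered, unless it is the empty string.
    """
    parts = urlsplit(uri)
    userinfo, at, hostport = parts.netloc.rpartition("@")
    if not at:
        return uri  # no userinfo component at all
    user, colon, password = userinfo.partition(":")
    if not colon or password == "":
        return uri  # no password, or an explicitly empty one
    netloc = f"{user}:{mask}@{hostport}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(mask_userinfo("postgres://admin:mypassword@localhost:5432/mydb"))
```

This only helps with display. The stored value still contains the password, so the whole field would still need to be handled as a secret.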

bmelville commented 7 years ago

This issue is discussed as part of the proposal for #116: https://docs.google.com/document/d/1JbsJgqgNtqthcfYwK_KbS6C8sjElrZNgoLhu40dUPAs/edit?usp=sharing

duglin commented 7 years ago

Just for background on this... what's the issue with treating all of it as "secret"? When would you want to expose the non-secret parts to things other than the app it's bound to, and who are these other actors?

Or is this more a matter of the "secret space" where this is stored might have limited capacity? And if so, what is the limit we have to keep under?

The Google doc for #116 talks about being able to differentiate between the various types of data in the creds, but that's not quite the same thing, though it is related.

bmelville commented 7 years ago

Doug, we've encountered a concrete use-case in our implementation of the kubernetes controller, which needs to differentiate what to store as secrets vs. configmaps in the cluster.

I think what you touch on about minimizing usage of secret space is a key component, but I agree that a simplified case could easily just treat everything as secret.

I don't have anything concrete about secret limitations or real use-cases for non-secret configs other than that customers do differentiate these when storing application configs. @pmorie might be able to provide something.

duglin commented 7 years ago

Doug, we've encountered a concrete use-case in our implementation of the kubernetes controller, which needs to differentiate what to store as secrets vs. configmaps in the cluster.

Can you share that use-case?

bmelville commented 7 years ago

Sorry, perhaps I was unclear: in kubernetes the service catalog differentiates configuration data into secrets vs. configmaps. As I said, I don't have a real-world use-case that demonstrates the need to distinguish between secret and non-secret data in the consumption environment, but I'm sure they are out there, so I can try to get some from others.

pmorie commented 7 years ago

Or is this more a matter of the "secret space" where this is stored might have limited capacity? And if so, what is the limit we have to keep under?

This is definitely one aspect of it. For example in Kubernetes we plan to add HSM (hardware security module) as a possible backing store for secrets. HSMs are typically both quite small and quite expensive and so it is preferable to avoid storing non-secret things in them.

Another is the resource commitment attendant to providing secret data. For example, in Kubernetes, secrets can be consumed in environment variables or in volumes. The volume mode of consumption uses a tmpfs so that secret data does not come to rest on a node. As a result, secret volumes can become a non-trivial consumer of RAM on nodes. Since configuration data (i.e., Kubernetes ConfigMap) does not have a requirement to avoid data-at-rest, we simply use the node-local disk to back those volumes. So, from a cluster operator standpoint, it is preferable to differentiate between secret and non-secret data.

Another facet is user experience. In user interfaces it will be desirable to display data to users that isn't secret and to avoid displaying secret data, except possibly after some confirmation that it is really okay (and possibly not even then). If we treat everything as secret for convenience we limit ourselves from being able to build user interfaces that display the desired amount of information.
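
As a rough illustration of the storage split described above, the sketch below turns already-separated binding data into a Kubernetes Secret and a ConfigMap manifest. It is not how the service catalog is implemented. `to_manifests` and the resource name `my-binding` are made up for the example, and it assumes the secret/non-secret split has already been decided by some other means.

```python
import base64


def to_manifests(name, secret, non_secret):
    """Build (Secret, ConfigMap) manifests as plain dicts.

    Secret "data" values must be base64-encoded strings, while
    ConfigMap "data" values are stored as plain strings.
    """
    secret_manifest = {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": {
            key: base64.b64encode(str(value).encode("utf-8")).decode("ascii")
            for key, value in secret.items()
        },
    }
    config_map = {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": {key: str(value) for key, value in non_secret.items()},
    }
    return secret_manifest, config_map


secret_manifest, config_map = to_manifests(
    "my-binding",  # hypothetical resource name
    {"username": "admin", "password": "mypassword"},
    {"hostname": "localhost", "port": 8080},
)
print(secret_manifest["kind"], sorted(secret_manifest["data"]))
print(config_map["kind"], config_map["data"])
```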

pmorie commented 7 years ago

@rcernich

What about other technologies, e.g. node, python, etc.? Presumably, these would use different forms for configuring a datasource.

I am not certain exactly what this comment is in response to -- can you clarify or elaborate?

rcernich commented 7 years ago

Hey @pmorie, this comment was specific to how the information is presented to the user. If it is presented as a jdbc connection url, that might work fine for Java based consumers, but not for folks accessing the resource from other platforms/technologies/languages/etc. To me, it seemed better to provide the raw details, which could then be assembled into the appropriate configuration by the consumer.
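
A small sketch of what "assembled by the consumer" could look like: the same raw fields feed two different connection-string forms. The field names (in particular `database`) and both helper functions are assumptions for illustration, not something the spec defines.

```python
from urllib.parse import quote


def postgres_uri(creds):
    """libpq-style URI; user and password are percent-encoded."""
    user = quote(creds["username"], safe="")
    password = quote(creds["password"], safe="")
    return (
        f"postgres://{user}:{password}"
        f"@{creds['hostname']}:{creds['port']}/{creds['database']}"
    )


def jdbc_postgres_url(creds):
    """JDBC URL without credentials; a Java client would pass the
    username and password to the driver separately."""
    return f"jdbc:postgresql://{creds['hostname']}:{creds['port']}/{creds['database']}"


# Hypothetical raw binding credentials.
creds = {
    "username": "admin",
    "password": "p@ss",
    "hostname": "localhost",
    "port": 5432,
    "database": "mydb",
}

print(postgres_uri(creds))
print(jdbc_postgres_url(creds))
```

Note that the JDBC form here contains no secret at all, so only the raw `password` field would need secret handling.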

mayrstefan commented 7 years ago

Yes, the Java world is doing really fancy things with JDBC URLs. Some JDBC drivers even allow passing special options in the connection parameters. A crazy example is Oracle's JDBC Thin driver.

Simple and expected:

jdbc:oracle:thin:@host:1521:sid

Syntax reminiscent of the native driver (a tnsnames.ora entry on one line):

jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=Host_name)(PORT=1521)))(CONNECT_DATA=(SID=service_name)(SERVER=DEDICATED)))

We often need the latter syntax to support multiple hosts of Oracle RAC databases or to pass special network parameters to the driver without using the native OCI driver.

Oracle can even look up the DB connection details via LDAP. An example from their documentation:

jdbc:oracle:thin:@ldap://ldap.acme.com:7777/sales,cn=salesdept,cn=OracleContext,dc=com

No database host is required in this URL for the JDBC driver to connect to a database.

This is all very specific to the Java world. But because some magic (Spring Boot?) depends on JDBC URLs, everything has to be packed into them. These variations can hardly be constructed from host:port alone.

avade commented 6 years ago

relates to https://github.com/openservicebrokerapi/servicebroker/issues/116

duglin commented 6 years ago

On the 8/29 call we decided to defer this for now.

mattmcneeney commented 5 years ago

Once #116 is merged, we will be able to use JSON schema to indicate what part of a binding response is private and shouldn't be shown to end users. Closing this here.