michael2m opened this issue 3 years ago
Hi @michael2m,
Thank you for filing this enhancement request for Secretless Broker. I'd like to understand your suggestion better.
Can you tell me which credentials provider you're using with the Secretless Broker? If it's the HashiCorp Vault credentials provider, that provider should support multiple fields within a secret. For example, you should be able to include a section in the secretless.yml configuration file that looks similar to the following:

```yaml
username:
  from: vault
  get: postgres/creds#username
password:
  from: vault
  get: postgres/creds#password
```
Would this be sufficient for supporting multi-value dynamic credentials, or is there more required to support what you're looking for?
@diverdane this is not sufficient. In Vault when a dynamic secret, e.g. for Postgres, is obtained you get both username/role and password. Every call generates a new username and password combination. The above would effectively create the dynamic credentials for Postgres twice, once in the username's get and once in the password's get. Yet it would take only the username from the first dynamic credentials and the password from the second. They would not form a matching pair.
In other words, every `get` would trigger a newly created set of credentials. My request / problem is that I would like to update multiple values (e.g. both username and password) through a single `get`.
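The mismatch can be sketched with a toy dynamic provider (hypothetical names, not Secretless code); each call mints a fresh credential pair, the way Vault's database secrets engine issues a new lease per read:

```go
package main

import "fmt"

// counter stands in for the secrets engine's lease generation
var counter int

// getDynamicCreds mimics a dynamic provider: every call returns a
// brand-new, matching username/password pair
func getDynamicCreds() (username, password string) {
	counter++
	return fmt.Sprintf("v-user-%d", counter), fmt.Sprintf("v-pass-%d", counter)
}

func main() {
	// resolving username and password via two independent gets:
	u, _ := getDynamicCreds() // first lease:  (v-user-1, v-pass-1)
	_, p := getDynamicCreds() // second lease: (v-user-2, v-pass-2)

	// the broker ends up with v-user-1 and v-pass-2: not a valid pair
	fmt.Println(u, p)
}
```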
@michael2m thanks for the explanation, I understand your point! This may require a change in the secrets.yml syntax... using the example above, maybe something like this:
```yaml
creds:
  from: vault
  get: postgres/creds
  fields:
    - secret: username
      field: username
    - secret: password
      field: password
```
Or a little more explicitly:

```yaml
creds:
  from: vault
  get: postgres/creds
  fields:
    - secret: username
      field: creds#username
    - secret: password
      field: creds#password
```
An alternative would be leaving the syntax the same, and locally (and implicitly) caching the value of the complex secret (e.g. `postgres/creds`) when the secrets.yml is parsed and secrets are retrieved. We may want to do this anyway (even if we're changing the syntax as described above) in order to maintain backwards compatibility.
Any suggestions/improvements on the syntax?
this is an interesting use case, and I think we may want to reconsider how we resolve secret values in Secretless. there can be value in sending each provider in a service definition the set of variables it will be expected to resolve for the connection, rather than sending each credential key one by one.
that is, I am proposing we consider revising the resolver definition: https://github.com/cyberark/secretless-broker/blob/b3c42e3c534c15f9110e08c47ee7785da9597c4d/internal/plugin/resolver.go#L93-L102
so that instead of getting each value one by one, we implement some logic to group the credentials for each service by provider and send the provider the set of keys it will need to resolve for the connection.
for the `vault` provider, this would mean getting the reference to the username/password in the same request, so that they can be dynamically retrieved and returned as a pair. for the `conjur` provider, this would mean being able to use batch retrieval to retrieve a set of secret values at once, and limit the number of required requests (and potentially eventually retrieving a `safe` or other comparable policy object that groups together connected variables)
that probably makes sense in a separate issue from this one, and this issue would depend on that issue's resolution. I'd love some feedback from @cyberark/community-and-integrations-team and @jonahx before moving forward with this idea, including on its feasibility - there may be some aspects of the code that I'm missing that make this more difficult than I think it is.
Completely agree with @izgeri on this. We just need to change the provider's method definition from:
```go
// GetValue takes in an id of a variable and returns its resolved value
GetValue(id string) ([]byte, error)
```
To:
```go
// GetValues takes in variable ids and returns their resolved values
GetValues(ids ...string) ([][]byte, error)
```
Everything from before stays more or less the same while giving providers flexibility to implement batch retrieval mechanisms.
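To illustrate why "everything stays more or less the same": a legacy single-value provider can satisfy the proposed batch signature with a trivial loop, while batch-capable providers override it with a real batch call. The `singleValueProvider` interface, `getValues` helper, and `envLike` provider below are illustrative sketches, not actual Secretless code:

```go
package main

import "fmt"

// singleValueProvider is the existing one-id-at-a-time shape
type singleValueProvider interface {
	GetValue(id string) ([]byte, error)
}

// getValues adapts any single-value provider to the proposed batch
// signature by resolving ids one by one, preserving input order
func getValues(p singleValueProvider, ids ...string) ([][]byte, error) {
	out := make([][]byte, 0, len(ids))
	for _, id := range ids {
		v, err := p.GetValue(id)
		if err != nil {
			return nil, err
		}
		out = append(out, v)
	}
	return out, nil
}

// envLike is a toy provider backed by an in-memory map
type envLike map[string]string

func (e envLike) GetValue(id string) ([]byte, error) {
	v, ok := e[id]
	if !ok {
		return nil, fmt.Errorf("no such id: %s", id)
	}
	return []byte(v), nil
}

func main() {
	p := envLike{"user": "alice", "pass": "s3cret"}
	vals, err := getValues(p, "user", "pass")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s %s\n", vals[0], vals[1]) // alice s3cret
}
```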
I thought this was straightforward enough that I created a draft PR that implements/captures the idea and is green ✅; see https://github.com/cyberark/secretless-broker/pull/1344/.
@doodlesbykumbi have you considered:
```go
// GetValues takes in variable ids and returns their resolved values
GetValues(ids ...string) (map[string][]byte, error)
```

using a map rather than a slice of values?
@michael2m That's a great point and, interestingly enough, we discussed this yesterday internally and I think we were leaning towards a richer return value. Ideally the response, instead of being a simple map, would be a rich type able to surface individual retrieval problems, since those get lost in both the original and your proposal.
So maybe something like:
```go
type ProviderResponse struct {
	ID    string
	Value []byte
	Error error
}

// ...

GetValues(ids ...string) ([]ProviderResponse, error)
```
I don't think we've reached a clear direction though so all input is welcome!
Totally agree, a richer type is definitely the way to go forward. Just wondering about the following. Suppose I take the Vault database secrets engine e.g. for Postgres as an example (see Vault - PostgreSQL database). To get the dynamic username and password, I have to get the secret stored in Vault at e.g. `database/creds/my-role`, and in response Vault returns an object:

```json
{
  "username": "some-random-username",
  "password": "some-random-password",
  ...
}
```
In terms of Secretless, I have 1 ID (namely the path to the secret in Vault) and 2 values (namely username and password). The above mentioned suggested solutions focus on batching IDs. However I am concerned about the case where a single ID gets multiple related values. I can't quite see how to resolve ... I think the suggested syntax/config of @diverdane more closely matches my point.
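One way to picture resolving two values from the one ID: decode the single Vault-style JSON response once and pull both fields from it, so the pair always comes from the same lease. The `fieldsFromSecret` helper below is hypothetical, not actual provider code:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fieldsFromSecret decodes one Vault-style secret payload into its
// related fields; all values come from the same read, so they match
func fieldsFromSecret(raw []byte) (map[string]string, error) {
	var secret map[string]string
	if err := json.Unmarshal(raw, &secret); err != nil {
		return nil, err
	}
	return secret, nil
}

func main() {
	// a single read of e.g. database/creds/my-role returns both values
	raw := []byte(`{"username": "some-random-username", "password": "some-random-password"}`)
	secret, err := fieldsFromSecret(raw)
	if err != nil {
		panic(err)
	}
	// username and password are from the same lease: a valid pair
	fmt.Println(secret["username"], secret["password"])
}
```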
I'll update https://github.com/cyberark/secretless-broker/pull/1344 to use the rich type.
> However I am concerned about the case where a single ID gets multiple related values.
Good point. I like the fields syntax that @diverdane suggested.
For added flexibility I wonder if we might entertain separating secret and credentials definition.
```yaml
secrets:
  dbEndpoint:
    from: vault
    get: /path/to/endpoint
  dbUserCreds:
    from: vault
    get: /path/to/specific/user/creds

services:
  ecomm-db:
    connector: mongodb
    listenOn: tcp://0.0.0.0:6175
    credentials:
      connString: mongodb+srv://{{ .dbUserCreds.username }}:{{ .dbUserCreds.password }}@{{ .dbEndpoint }}
      authURL: http://{{ .dbEndpoint }}/xyz
```
Bit of a contrived example but I think this approach might solve a few issues.
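For illustration, the `{{ ... }}` interpolation in a config like that could be rendered with Go's standard text/template package. The `renderCredential` helper and the field names are assumptions that mirror the example, not an existing Secretless API:

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// renderCredential fills a credential template from resolved secret data;
// map keys are reachable as {{ .key }} / {{ .key.subkey }} in the template
func renderCredential(tmplText string, data interface{}) (string, error) {
	tmpl, err := template.New("cred").Parse(tmplText)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	data := map[string]interface{}{
		"dbEndpoint": "db.example.com:6175",
		"dbUserCreds": map[string]string{
			"username": "v-user-1",
			"password": "v-pass-1",
		},
	}
	s, err := renderCredential(
		"mongodb+srv://{{ .dbUserCreds.username }}:{{ .dbUserCreds.password }}@{{ .dbEndpoint }}",
		data)
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // mongodb+srv://v-user-1:v-pass-1@db.example.com:6175
}
```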
@doodlesbykumbi although complexity increases (implementation), this is clearly an improvement and better design. It is a bit more challenging to get TTL right (item 2). Currently a secret is fetched and the provider can immediately read fresh ones from the source (e.g. in Vault this could mean getting the new dynamic credentials; OTOH for env/files it doesn't matter). I think TTL handling must be delegated to the provider in that case (in Vault example the TTL is obtained along with the credentials); it would act kind of like a cache.
> ```go
> type ProviderResponse struct {
> 	ID    string
> 	Value []byte
> 	Error error
> }
>
> // ...
>
> GetValues(ids ...string) ([]ProviderResponse, error)
> ```

It looks like the rich type might have some issues. How about `map[string]ProviderResponse`, where the key is the ID and we get rid of ID on ProviderResponse?
@michael2m Agreed. There can be global and provider-specific smarts for secret fetching. With TTL, I thought there could be a global smart that dictates how often we fetch new secrets. Perhaps, this might not have been the best example of smarts. Caching seems tricky because it might break something fundamental in Secretless... not keeping secrets/creds in memory longer than the auth handshake.
What should the semantics of this situation be? Fetch each instance separately, or should we group responses on id...
This is behavior that should be mostly provider-specific.
When mapping back from id to credential name, without additional information via the input args or relying on order we are unable to disambiguate a single id that maps to multiple secret names.
From the interface point of view, we don't (and shouldn't) care how the passed-in variables are turned into secrets, as long as the output is returned with the expected entries. If we're digging into how it should internally work, deduplication is fine since we can expect the callee to just break on the first matching entry. This does, however, make @michael2m's suggestion of returning a map as the main type (which you also mention in #2) a bit more appropriate (I'd still prefer a rich type as the value though), since it would provide an unambiguous treatment of results.
GetValues can return errors on each ProviderResponse and also this other error. I'm not sure what the other error is for. @sgnn7, what did you have in mind?
The `ProviderResponse` error should record errors in retrieval of an individual value.

> Caching seems tricky because it might break something fundamental in Secretless... not keeping secrets/creds in memory longer than the auth handshake.

Agreed - this is fundamentally against how we want the product to work at this time. Secretless should not keep anything in memory more than strictly required for a connection to complete. At some future point we might have capabilities to secure/encrypt the broker's memory space well enough for it, but we don't have adequate security guards around that right now to be safe when caching.
Putting the complexities of TTL aside: can the suggested configuration of @doodlesbykumbi somehow be supported? Or, minimally, the proposed configuration of @diverdane? That would be a substantial improvement for Secretless and would solve the issue of getting multiple related values together (rather than merely batching).
@sgnn7
> This is behavior that should be mostly provider-specific.
Giving that flexibility depends on the return collection type, especially if we're keeping the input as `ids []string`. If we return a map of ids to responses, then we are effectively grouping the responses for any given id; two cred names with the same secret id always get the same value. If we use a slice and maintain the order, then each provider can decide on the behavior (grouping or unique), which gives us the flexibility we want and ensures that responses for ids can be mapped to credential names without additional coordination.
With the above in mind, we'd want to go with the slice. Alternatives become available if we include both ids and credential names in the input. We could then return a map of credential names with rich responses as values. Maybe that's the best way forward!
> If we return a map of ids to responses then it means we are effectively grouping the responses for any given id
Correct and I think this is exactly what we want to both remove ambiguity about the results and because the use case where a connector needs two different dynamic secrets from the same ID seems implausible. If a provider gets two of the same variable IDs in the same batch request, it's coming from a single connection attempt and I think the intention should be that we meld them into a single provider request and a single response value from the function, making the map much more favorable of a return type to me. We also get the code benefit that the caller can be much more lean on logic as it no longer needs to explicitly track the order of its inputs.
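The melding of duplicate ids into a single provider request and a single map entry might be sketched like this (illustrative only; `getValues` here is a stand-in, not the actual resolver):

```go
package main

import "fmt"

// ProviderResponse carries the resolved value (or error) for one id
type ProviderResponse struct {
	Value []byte
	Error error
}

// getValues resolves a batch of ids into a map keyed by id; duplicate
// ids collapse into one provider call and one shared entry
func getValues(resolve func(string) ([]byte, error), ids ...string) map[string]ProviderResponse {
	out := map[string]ProviderResponse{}
	for _, id := range ids {
		if _, done := out[id]; done {
			continue // already resolved in this batch
		}
		v, err := resolve(id)
		out[id] = ProviderResponse{Value: v, Error: err}
	}
	return out
}

func main() {
	calls := 0
	resolve := func(id string) ([]byte, error) {
		calls++
		return []byte("value-of-" + id), nil
	}
	res := getValues(resolve, "a", "b", "a") // "a" requested twice
	fmt.Println(len(res), calls)             // 2 entries, 2 provider calls
}
```

The caller no longer needs to track input order: it looks each credential's id up in the map directly.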
@michael2m
> Putting complexities of TTL aside. Can somehow the suggested configuration of @doodlesbykumbi be supported? Or minimally the proposed configuration of @diverdane? That would be a substantial improvement for Secretless and solve the issue of getting multiple related values together (rather than merely batching)
I think we want to hold off on any changes to the Secretless-wide config until we have a better idea of our priorities for the project.
I think for the moment we can support multi-value (dynamic) credentials with a suffix, e.g. `path/to/secret/prefix#path/to/value/suffix`. I'm not certain `#` is always the best delimiter; maybe it's something we'd want to make configurable, perhaps through envvars.
```yaml
credentials:
  username:
    from: vault
    get: postgres/creds#username
  password:
    from: vault
    get: postgres/creds#password
```
This would work with the batch retrieval implementation in #1344. The vault provider would just need to group ids with the same `path/to/secret/prefix` prefix and use the singular value of the secret to resolve the values of each `path/to/value/suffix`.
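That grouping step might look roughly like this (a sketch, not the actual `vault` provider code; the `#` delimiter follows the syntax discussed above):

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// groupByPrefix buckets ids like "postgres/creds#username" by the part
// before '#', so the provider can read each secret once and resolve
// every requested field from that single read
func groupByPrefix(ids []string) map[string][]string {
	groups := map[string][]string{}
	for _, id := range ids {
		prefix, field := id, ""
		if i := strings.Index(id, "#"); i >= 0 {
			prefix, field = id[:i], id[i+1:]
		}
		groups[prefix] = append(groups[prefix], field)
	}
	return groups
}

func main() {
	g := groupByPrefix([]string{"postgres/creds#username", "postgres/creds#password"})
	fields := g["postgres/creds"]
	sort.Strings(fields)
	fmt.Println(len(g), fields) // one group with both fields
}
```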
@sgnn7 I agree so I've gone ahead and implemented the map.
Interesting, @doodlesbykumbi. Looking forward to #1344. In #1331 I opted for the `/path/to/secret#navigate.to.property` syntax (required for another reason). I am a little biased by Vault, which returns a JSON object for secrets, hence object property traversal is needed. But the part from `#` onwards is provider-specific, I presume. So is grouping by prefix for batching.
@michael2m Just a heads up that https://github.com/cyberark/secretless-broker/pull/1344 has landed as https://github.com/cyberark/secretless-broker/pull/1356.
Published in CyberArk Aha! idea portal
Problem
Perceiving secrets as just single values, like passwords or API keys, is too limited. Often credentials come in pairs, or more generally in multiple related values. This holds e.g. for AWS (access key and secret key), Azure (client id and client secret), and Postgres roles (username/role and password). Those related values may be generated by a provider, which makes the credentials fully dynamic: they may change upon every new request, e.g. every new connection to Postgres may use a fresh set of credentials (with a certain time to live). Unfortunately this causes trouble when a provider is used to first get e.g. a username and then a password, because that may generate two sets of credentials, of which the username comes from the first and the password from the second, which together form no valid pair.
Solution
Suggestions?
Alternatives
This is not an issue for simple providers like literal, environment, and file. Nor will it be an issue if only one among the related values is dynamic (e.g. only the password). No alternatives seem to be supported or easily implemented.
Additional context
Dynamic credentials e.g. in Vault appear in: