Open mattsb42 opened 6 years ago
I've been thinking about the credential manager interface and separation of passive sources, and I think I have a better model. Working on sketching it out in my fork now.
user input -> ENV -> pypirc -> keyring -> request user input
user input -> ENV -> pypirc -> [credential manager]
My goal is to remove the special case of the environment variables and config file credential sourcing, resulting in:
user input -> [credential manager]
Conceptually, this pattern would result in:
DeferredCredentials[Credentials, CredentialManager[CredentialManager[...CredentialManager]]]
With the default credential manager defining its own internal credential manager that would look like:
EnvironmentVariableCredentialManager(
credential_manager=ConfigFileCredentialManager(
config_file='~/.pypirc',
credential_manager=KeyringCredentialManager(
credential_manager=UserInputCredentialManager()
)
)
)
See also #361
@sigmavirus24 Yup, I saw that. It was actually what started me thinking down the path of a composable hierarchy of credential managers. Defining a chain like this will be complex from the CLI, but simple if using programmatically via an API.
@mattsb42 I like your proposal. I'm thinking that if we're going to provide an interface for plugin developers then we probably need a simpler structure for dynamically aggregating these managers. As such, I've come up with an alternative proposal.
I'm wondering what you think of using a Pipeline
object that composes the Manager
objects. In your example, this might look like:
username_pipeline = Pipeline([
EnvironmentVariableManager('TWINE_USERNAME'),
ConfigFileManager(config=parsed_config, ...),
KeyringManager('username'),
UserInputManager(config=parsed_config, 'username'),
])
username_pipeline.append(SomeOtherManager(...))
username = username_pipeline.retrieve()
I think we could use these Manager
s for most of the configuration options too. We should start with credentials and maybe extend outwards based on initial feedback from these.
Would you be willing to write this up and send it as a design doc? (I think twine will start needing to aggregate separate design docs like #361 and this one but I'm not sure how we want to do that yet. :smile: )
I think that is a sensible approach, and one I had considered. It's largely just a matter of where we put the selection logic. If we're looking at a more general pattern for value managers, I agree, placing the selection logic in a separate component probably makes more sense.
Yeah, I can write this up as a design doc. I'll take a closer look through #361 and model the doc structure on that. Might not be until next week though: PyCon sprints ended for me today, so back to other stuff fighting for priority.
Yeah, I can write this up as a design doc. I'll take a closer look through #361 and model the doc structure on that. Might not be until next week though: PyCon sprints ended for me today, so back to other stuff fighting for priority.
Thank you! It doesn't need to be exactly like #361 I was just trying to make something to describe what I was thinking of doing. Also, there's absolutely no rush. I appreciate you being willing to write this up at all. Your thoughtful consideration of this whole topic is sincerely appreciated. It's a lot more thought than some of us have put into this topic.
FWIW it kinda looks like https://getconf.readthedocs.io/en/latest/reference.html#getconf.BaseConfigGetter where you can define a list of "finders" to load configuration/secrets (from env, ini files, python dictionnary, etc).
(disclaimer: I'm one of the employee maintaining this software)
From a skim, that looks like what I authored almost 2 years ago for Flake8. Nothing is original. But I would appreciate an example of how your company's project would work in the same way @mattsb42 described
Yes I'm sure dozens of projects have implemented the same kind of logic :)
Using getconf
, the code would look like:
import getconf
import getconf.finders
CONFIG = getconf.BaseConfigGetter(
#getconf.finders.SysArgFinder(), # This one does not exist
getconf.finders.NamespacedEnvFinder('twine'),
getconf.finders.MultiINIFilesParserFinder(['~/twine.ini', '/etc/twine/twine.ini']),
getconf.finders.SectionDictFinder({'DEFAULT': {'pypi_url': 'https://pypi.org'}}),
)
print(CONFIG.getstr('username'))
print(CONFIG.getint('verbose', default=0))
CONFIG.getstr('username', default='nobody')
will then look in order:
sys.argv
for an --username
option (again the SysArgFinder
currently does not exist)os.env
for a TWINE_USERNAME
env var~/twine.ini
then /etc/twine/twine.ini
files for a DEFAULT.username
value'nobody'
otherwise.Initially, the project was developed to only deal with .ini files and env vars hence the tendency to expect keys to follow a (non-mandatory)section.key
pattern.
A nice thing is that it is quite simple and it is easy for a plugin to query a new key twine.CONFIG.getlist('some.new_option')
but one major weakness is that it requires rigor to keep the "discoverability" of options (like would be required by a possible SysArgFinder
).
The pip project has a similar feature request: https://github.com/pypa/pip/issues/4475
Problem
Currently, twine supports obtaining credentials (username, password, and cert locations) from four sources: environment variables, the
.pypirc
config file, values passed in from the command line, and (specifically for the password) keyring.These sources provide reasonable options if you are using twine interactively as a human from a trusted computer, but more and more package owners want to push to PyPI (or repository of their choice) automatically as part of their CI pipeline.
I think that a good way to securely enable this usage pattern is to define a pluggable framework for credential managers, similar to how commands are currently managed. Not only will this make things simpler for integration into CI systems, it will also enable the creation of plugins to source credentials from secure credential storage systems such as LastPass, OnePassword, Signet, Amazon Secrets Manager, Hashicorp Vault, HSMs, etc.
PoC Contents
I have put together a proof of concept for what this could look like here[1]. This includes:
Specific areas that need discussion
What should the credential manager interface look like?
The current interface is based on the inputs to the current credential loaders. I think we can probably simplify this.
Passive credential sources.
Similar to how
KeyringCredentialManager
will specifically source the password only from keyring, ignoring all other sources, I would like to move the other existing passive sources (environment variables and.pypirc
configuration) into credential managers that are opportunistically used byDefaultCredentialManager
but that can be explicitly avoided if so desired.Certificates
Currently, certificates are passed through to requests by identifying a local file that contains the certificates in question. I would like to change this to have the credential manager provide the actual certificate data directly. This will enable the use of secure certificate storage systems and avoid having to place sensitive information on the local filesystem.
[1] https://github.com/pypa/twine/compare/master...mattsb42:pluggable-creds [2] https://github.com/mattsb42/twine/blob/pluggable-creds/twine/credential_managers/__init__.py#L52 [3] https://github.com/mattsb42/twine/blob/pluggable-creds/twine/credential_managers/user.py [4] https://github.com/mattsb42/twine/blob/pluggable-creds/twine/credential_managers/keyring.py [5] https://github.com/mattsb42/twine/blob/pluggable-creds/twine/credential_managers/default.py [6] https://github.com/mattsb42/twine/blob/pluggable-creds/twine/credential_managers/__init__.py#L29-L33 [7] https://github.com/mattsb42/twine/blob/pluggable-creds/twine/commands/upload.py#L230-L242 [8] https://github.com/mattsb42/twine/blob/pluggable-creds/twine/commands/upload.py#L119-L145