saltstack / salt

Software to automate the management and configuration of any infrastructure or application at scale. Install Salt from the Salt package repositories here:
https://docs.saltproject.io/salt/install-guide/en/latest/
Apache License 2.0

Formula dependency management #12179

Closed chrismoos closed 7 years ago

chrismoos commented 10 years ago

I think there should be a standard way to manage formula dependencies. It is very natural for a state to be composed of other states, but when using third-party formulas there is no easy way to manage the dependencies, versions, etc.

A popular tool for Chef is Berkshelf. You put a file in your cookbook root (or state tree root in Salt's case) like this:

source "https://api.berkshelf.com"

metadata

cookbook "mysql"
cookbook "nginx", "~> 2.6"

There is a CLI command so you can fetch and install all of the dependencies, for example.

I propose we do the same thing for Salt formulas:

Saltfile - This will contain the dependencies, sources, etc.

Saltfile.lock - This will contain all of the active/installed dependencies and their versions.

There will be a tool that will fetch and install the dependencies, just like the berkshelf command.

I think that having this feature is really important, especially as your state tree gets more advanced and you start pulling in third-party formulas.

Example Saltfile:

- sources:
    - https://formula.mycompany.com
    - https://formula.saltstack.org
- dependencies:
    nginx:
    redis:
        version: '>= 1.0.5'
    ntp:
        git: 'https://github.com/saltstack-formulas/ntp-formula.git'
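
To make the Saltfile / Saltfile.lock idea above more concrete, here is a minimal sketch of how a fetch tool might resolve the example into locked versions. It assumes the Saltfile has already been parsed into Python data (for instance by a YAML loader) and that some index of published versions is available. The function names, the `available` index contents, and the lock format are all made up for illustration; none of this is an existing Salt API.

```python
# Hypothetical sketch: nothing here is an existing Salt API.
import operator
import re

# Comparison operators allowed in a version constraint such as '>= 1.0.5'.
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text):
    """Turn '1.0.5' into (1, 0, 5) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, constraint):
    """Check one version string against a constraint like '>= 1.0.5'."""
    match = re.fullmatch(r"\s*(>=|<=|==|>|<)\s*([\d.]+)\s*", constraint)
    if match is None:
        raise ValueError("unsupported constraint: %r" % constraint)
    op, wanted = match.groups()
    return _OPS[op](parse_version(version), parse_version(wanted))


def lock(dependencies, available):
    """Pick the newest available version of each dependency that fits."""
    locked = {}
    for name, spec in dependencies.items():
        spec = spec or {}  # 'nginx:' with no body parses as None
        if "git" in spec:
            # Git-sourced dependencies are recorded by URL, not by version.
            locked[name] = {"git": spec["git"]}
            continue
        constraint = spec.get("version")
        candidates = [
            v for v in available.get(name, [])
            if constraint is None or satisfies(v, constraint)
        ]
        if not candidates:
            raise LookupError("no version of %s satisfies %r" % (name, constraint))
        locked[name] = {"version": max(candidates, key=parse_version)}
    return locked


# The 'dependencies' mapping from the example Saltfile, already parsed.
dependencies = {
    "nginx": None,
    "redis": {"version": ">= 1.0.5"},
    "ntp": {"git": "https://github.com/saltstack-formulas/ntp-formula.git"},
}
# Made-up index contents, for illustration only.
available = {"nginx": ["2.6.0", "2.10.1"], "redis": ["1.0.4", "1.0.5", "1.2.0"]}
saltfile_lock = lock(dependencies, available)
```

Comparing versions as integer tuples rather than strings matters here: as strings, "2.6.0" would sort after "2.10.1".
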
pidah commented 10 years ago

+1

westurner commented 10 years ago

Questions:

Python packaging tools handle dependency graphs:

Conda packages solve for this with many languages:

ahambrick commented 10 years ago

:+1: Would be a great help in implementing a Continuous Delivery Process.

avimar commented 10 years ago

Just to mention: nodejs has a rather awesome npm system for uploading, managing, and using dependencies (from any source!). It even allows each dependency to have its own sub-dependencies at its own specified versions, so they won't conflict with other things using a different version of the same dependency.

elmariofredo commented 10 years ago

1+ for simple sub dependency chain, also something like npmjs.org registry for formulas would be nice.

westurner commented 10 years ago

Formula Dependencies

In lieu of a standard way to manage this (e.g. setup.py + pip with $VIRTUAL_ENV/src[/salt-formulas] on sys.path and/or GitFS and/or salt file_roots), an informal README.rst heading for "Formula Dependencies" may be helpful.

e.g. https://github.com/bechtoldt/iscdhcp-formula/blob/master/README.rst#formula-dependencies :

Formula Dependencies
====================

None

Namespacing

It may be easier to prefix/postfix things with <github-username>. e.g.:

https://github.com/salt-formula/salt-formula
salt-formula-salt-formula

https://github.com/westurner/salt-formula
westurner-salt-formula
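
The naming convention above can be derived mechanically from the repository URL. A small sketch, assuming GitHub-style `owner/repo` URLs; the helper name `namespaced_name` is made up for illustration:

```python
from urllib.parse import urlparse


def namespaced_name(repo_url):
    """Build '<github-username>-<repo>' from a GitHub repository URL."""
    path = urlparse(repo_url).path.strip("/")
    if path.endswith(".git"):
        path = path[: -len(".git")]
    # The first two path segments are the owner and the repository name.
    owner, repo = path.split("/")[:2]
    return "%s-%s" % (owner, repo)


names = [
    namespaced_name("https://github.com/salt-formula/salt-formula"),
    namespaced_name("https://github.com/westurner/salt-formula"),
]
```
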
westurner commented 10 years ago

Python Packages

Packaging salt formulas as Python packages with setup_requires/requirements.txt dependencies:

Tools

Caveats

[EDIT]

westurner commented 10 years ago

Branching

Strategies

edword commented 10 years ago

+1 for some sort of berkshelf or npm like dep management

arnisoph commented 10 years ago

+1

skylerberg commented 10 years ago

I like @westurner's plan of using Python packages. However, we also need to be able to include the installed formulas in salt easily.

I think these could be solved with a pypifs, which would be like gitfs, but for Python packages. So instead of specifying git repos, you would have a list with entries like

  - westurner-salt-formula
  - salt-formula-apache-formula

This would handle finding the packages on your system, and would also find and include all of the dependencies based on requirements.txt.

Thus by editing your salt master's config and restarting the salt master, you could have all of your formulas and not have to worry about dependencies at all.

Finally, you should be able to specify a version just like you would when using pip manually

  - salt-formula-apache-formula==1.0.4
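
As a rough sketch of what such a pypifs backend would have to do first, the snippet below splits each configured entry into a name and an optional pinned version, then checks what is installed. Note that pypifs does not exist; the entry format is taken from the examples above, and the helper names are hypothetical. Only `importlib.metadata` (Python 3.8+) is a real API here.

```python
from importlib import metadata


def parse_entry(entry):
    """Split 'name==1.0.4' into (name, version); version is None if unpinned."""
    name, sep, version = entry.partition("==")
    return name.strip(), (version.strip() if sep else None)


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


# Entries as they might appear under a hypothetical 'pypifs' master option.
entries = [
    "westurner-salt-formula",
    "salt-formula-apache-formula==1.0.4",
]
pins = dict(parse_entry(entry) for entry in entries)

# A backend could then compare each pin against what is actually installed.
status = {name: installed_version(name) for name in pins}
```
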
westurner commented 10 years ago
westurner commented 10 years ago
iggy commented 10 years ago

What about something simple like using git submodules? Maybe Salt could even have some magic that adds the top-level subdirs of the submodule to the top-level path structure.

i.e.

graphite-formula---+----graphite----init.sls
                   |
                   +--- nginx-formula (submodule) --- nginx --- init.sls
                   |

and the graphite and nginx dirs get added to the top level salt dir (somehow, haven't really thought too much about that yet).

skylerberg commented 10 years ago

I think git submodules have several drawbacks compared to packaging.

Git commits do not hold the same semantic meaning that package releases do. For example, if you update a package with a bugfix, you would then have to go into every package that depends on it and update its submodule reference. With versioned packages you do not need to pin such a specific revision; only the major version must match (unless you need features introduced in a later minor version).

Shared dependencies would be duplicated.

I think packages could be handled more elegantly: no need to include .gitmodules, no need to initialize submodules every time you clone, etc.
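
The "same major version" rule described above can be written down in a few lines. This is only a sketch of that rule, assuming plain dotted numeric versions; the function names are made up and real packaging tools handle many more cases:

```python
def parse_version(text):
    """Turn '1.0.5' into (1, 0, 5) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def compatible(installed, required):
    """True if 'installed' shares the major version of 'required' and is not older.

    A bugfix or minor release of a dependency is accepted without touching
    the packages that depend on it; a new major version is not.
    """
    have, need = parse_version(installed), parse_version(required)
    return have[0] == need[0] and have >= need


checks = {
    "bugfix release": compatible("1.0.5", "1.0.4"),
    "minor release": compatible("1.2.0", "1.1.0"),
    "major release": compatible("2.0.0", "1.1.0"),
    "too old": compatible("1.0.3", "1.0.4"),
}
```

With a submodule pinned to a commit, every one of these cases would require editing the dependent repository.
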

iggy commented 10 years ago

.gitmodules is worse than a dependencies.txt/SaltFile/whatever.yaml/etc somehow?

And there's nothing that says you can't have a script/shell alias/whatever that does the checkout -> submodule init (in place of pip/npm/etc).

And as far as having to change .gitmodules when you commit fixes, git supports branches for submodules. So maybe each formula has a branch for each upstream release (or just master if it's a fairly generic formula).


I honestly think this is a problem that doesn't need to be solved right now.

The -formulas have enough other problems that the landscape could be completely different by the time we get around to needing real dependency management.

I'm not saying having this discussion is pointless, but I don't think implementing something right now is prudent. And I think too much discussion on the topic takes something away from the real problems that the formulas have.

There is a real problem of developer bandwidth right now. Trying to shoehorn formula dependencies in right now when nobody really knows what formulas will eventually look like is a Bad Idea™

skylerberg commented 10 years ago

I agree that inside the formula, .gitmodules is equivalent to requirements.txt. However, I would like to see a solution where formula users do not need to have a .gitmodules and have to configure gitfs to point to the submodules. Just change the salt config, not change the salt config and have other files hanging around.

Of course, having the packages and a gitfs-like way to include them is a rather large change and, as you said, there are more important problems in formulas at the moment.

When we do get to solving this problem, I just want to make sure that we do it right (whatever right ends up being).

westurner commented 10 years ago

Is it possible to pull a specific version with .gitmodules, or just a branch?

How do I avoid push -f'ing over a whole tree?

westurner commented 10 years ago
jeffrey4l commented 10 years ago

I'd like to use Python packages to manage the formulas, just like python-xstatic[1] does.

E.g. there would be a package named nginx-salt-formula which can be installed through pip or easy_install.

There are several benefits to this.

  1. Version management and dependency handling are easy. Just change the setup.py/requirements.txt file in the formula; pip can then resolve the dependencies.
  2. A formula may depend on some Python package in some cases (for example, nginx-salt-formula may have its own _states or _modules, which require some Python packages). This can be solved by pip.
  3. Installation is easy. Just adding the package's name to the salt master configuration should be enough.
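
A sketch of what the setup.py side of this could look like. The keyword names (`name`, `version`, `install_requires`, `include_package_data`) are real setuptools arguments; the helper function, the version number, and the dependency `openssl-salt-formula` are hypothetical and only there for illustration.

```python
def formula_setup_kwargs(name, version, requires):
    """Collect the arguments a formula's setup.py could pass to setuptools.setup()."""
    return {
        "name": name,
        "version": version,
        # Other formulas and plain Python packages are declared the same way,
        # so pip resolves both kinds of dependency.
        "install_requires": list(requires),
        # Ship the non-Python files (.sls, templates) listed in MANIFEST.in.
        "include_package_data": True,
    }


kwargs = formula_setup_kwargs(
    "nginx-salt-formula",
    "1.0.0",
    ["openssl-salt-formula>=1.0"],  # hypothetical dependency
)
# In a real setup.py this would end with: setuptools.setup(**kwargs)
```
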

[1] https://pypi.python.org/pypi/XStatic

UtahDave commented 10 years ago

+1 for using python packages.

iggy commented 10 years ago

Currently we have a hard enough time getting people to contribute their changes back. It's also difficult getting things merged for formulas that the couple people that can commit don't understand.

I'm worried that something along the lines of full pypi packages would make that even worse.

Not to mention the fact that formulas aren't even python code...

If you require strict formula ownership, I see the number of formulas plummeting.

Again, this is as things stand now. I think things will likely be different at some future time.

whiteinge commented 10 years ago

Very interesting discussion so far. Quick note about one remark:

the couple people that can commit

Everyone on the Contributors team should have full commit access on all formulas repos. I know a few of those have slipped through the cracks. If you notice one let me know and I'll add it under the team. On a related note, I have plans to toss a web interface up (soon as work-load permits) that will allow people on the Contributors team to create repos and fork repos into the org.

iggy commented 10 years ago

I more meant that people are reticent to commit changes to formulas they don't use (unless they seem like obvious changes). Making it more difficult for people to contribute at this point in time doesn't seem prudent.

FWIW, I've personally had very good response with getting my PRs committed.

chrismoos commented 10 years ago

I really believe that having the formulas reside in a Git repository somewhere is going to be the best. I agree with @iggy that doing pypi packages just raises the barrier to contribute higher. There is plenty of evidence that the model of forking a git repo to contribute has been very successful. It encourages people to make changes and to push them back upstream.

What's really needed is just a way to manage locating and fetching the formulas that you depend on. I don't think we have to say that We must use Git!, but instead be flexible with where formula dependencies can reside. Look at projects like CocoaPods, Bundler, and Berkshelf. They have some things in common like:

In addition, all of the aforementioned tools have been wildly successful at what they do and have really provided an easy way for people to collaborate and contribute.

CocoaPods has a central repo, kind of like Homebrew, which lists out the canonical list of all packages and metadata for each version. This gives you the ability to just specify a dependency with a simple name (and an optional version specifier). The central repository is a good one but obviously requires maintenance and people to manage pull requests of people wanting to add their packages to the official list.

Bundler and Berkshelf also have central listings of packages, albeit a bit different than CocoaPods.

I propose the following high level idea:

Obviously there is a lot to spec out, but my 2 cents is that the above is the way to go, not pypi.

TheCatPlusPlus commented 10 years ago

setuptools along with pip/peep already allows for everything listed above, Salt really doesn't have to reinvent the wheel (heh). And you don't necessarily have to make people upload anything to PyPI: just make a custom index that generates package entries from the currently existing GitHub organisation.
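
As a rough illustration of the custom-index idea, the sketch below renders the root page of a PEP 503 "simple" index from a list of repository names. The name normalization is the one PEP 503 specifies; the repository names are example inputs, the function names are made up, and how the list would be fetched from the GitHub organisation is left out. A working index would also need one page per project linking to its release files.

```python
import html
import re


def normalize(name):
    """Normalize a project name as PEP 503 specifies."""
    return re.sub(r"[-_.]+", "-", name).lower()


def root_index(repo_names):
    """Render the root page of a 'simple' index: one link per project."""
    links = "\n".join(
        '    <a href="/simple/%s/">%s</a>' % (normalize(name), html.escape(name))
        for name in sorted(repo_names)
    )
    return "<!DOCTYPE html>\n<html>\n  <body>\n%s\n  </body>\n</html>\n" % links


# Example repository names, standing in for the organisation's repo list.
page = root_index(["nginx-formula", "Apache_Formula"])
```
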

iggy commented 10 years ago

An instance came up just yesterday with dependencies in a formula that I help maintain. To my knowledge it's the first formula to specifically list any dependencies. This particular instance is the aptly formula and it says it depends on the nginx formula.

The thing is, it doesn't depend on the nginx formula. It depends on a salt state module called nginx (we have our own nginx state module rather than using the formula).

So if this had been codified already and the aptly formula had a hard dependency on the nginx-formula, we wouldn't have used it (or more likely I would have cloned the code, gutted the nginx dependency, fixed all the issues it had, and never bothered to contribute my fixes back upstream).

Just a use-case to keep in mind.

P.S. I'm still not sold on pypi being open to a bunch of packages that are going to contain virtually no python code. All over the pypi site it specifically says it's for python packages, modules, and apps. Not a bunch of yaml code with some jinja sprinkled in.

westurner commented 10 years ago

Python Packages

I put these notes together about python packages, in general: https://westurner.github.io/dotfiles/tools.html#python-packages

Examples of including package_data in Python packages:

You can generate MANIFEST.in from the repository manifest:

git ls-files | sed 's/\(.*\)/include \1/g' > MANIFEST.in

You can add commands to setup.py:

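# setup.py
import subprocess

import setuptools
# distutils was removed from the standard library in Python 3.12;
# recent setuptools versions provide a compatible copy.
from distutils.command.build import build as DistutilsBuildCommand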

def generate_manifest_in_from_hg():
    """Generate MANIFEST.in from 'hg manifest'"""
    print("generating MANIFEST.in from 'hg manifest'")
    cmd = r'''hg manifest | sed 's/\(.*\)/include \1/g' > MANIFEST.in'''
    return subprocess.call(cmd, shell=True)

def generate_manifest_in_from_git():
    """Generate MANIFEST.in from 'git ls-files'"""
    cmd = r'''git ls-files | sed 's/\(.*\)/include \1/g' > MANIFEST.in'''
    return subprocess.call(cmd, shell=True)

class RunCommand(setuptools.Command):
    user_options = []
    description = "<TODO>"

    def initialize_options(self):
        pass

    def finalize_options(self):
        pass

    def run(self):
        print(self.__class__.__name__)

class GitManifestCommand(RunCommand):
    """Generate MANIFEST.in from $(git ls-files)"""
    description = __doc__

    def run(self):
        generate_manifest_in_from_git()

# ...
class DotfilesBuildCommand(DistutilsBuildCommand):
    """re-generate MANIFEST.in and build"""
    description = (
        "update MANIFEST.in AND " + DistutilsBuildCommand.description)

    def run(self):
        generate_manifest_in_from_git()
        DistutilsBuildCommand.run(self)

setuptools.setup(
    # ... name, version, packages, etc. ...
    cmdclass={
        'git_manifest': GitManifestCommand,
        'build': DotfilesBuildCommand,
    },
)

And then

python setup.py git_manifest
python setup.py build   # calls generate_manifest_in_from_git() before building

Hosting salt formula python packages

How is git insufficient?

Version Strings

Testing

Should there be a requirement that each formula can be tested with a standard interface and/or convention?

Something like ./tests/__init__.py in each formula?

Or should that just be functionality provided in salt core?

Documentation in the README?

westurner commented 10 years ago

For merging each salt-formula into one major (git) repository (as I think @chrismoos is describing):

I don't know how to do this with hg; though I'm sure there's a way. The immutability of hg has always been a selling point for me.

jeffrey4l commented 9 years ago

There is another big issue in the salt-formula repositories: there is very little version management in the current formulas. That makes them useless and dangerous for a production environment, because a formula may change and cause problems if there is only one master branch.

I think this is a big issue which will block the re-use of formulas.

samos123 commented 9 years ago

+1 for using Python packages

arnisoph commented 9 years ago

I'm thinking about extending https://github.com/bechtoldt/vcs-gather with dependency resolution support for SaltStack formulas and Puppet modules. The metadata.json file from Puppet (https://docs.puppetlabs.com/puppet/latest/reference/modules_publishing.html#write-a-metadatajson-file) could be acceptable for it.

westurner commented 9 years ago

@bechtoldt https://github.com/westurner/pyrpo (pyrpo -s . -r sh) and/or pypi:vcs and/or https://github.com/conda/conda/tree/master/conda (http://conda.pydata.org/docs/#requirements (pycosat) may be useful).

Conda packages have a meta.yaml file. https://github.com/conda/conda-recipes/blob/master/requests/meta.yaml

Python packages have a pydist.json (PEP 426)

DanyC97 commented 9 years ago

Very useful info. Is there any traction on this for the next Salt release? I'm asking because I'm at the point where I want to move from states (where I have a parent-child/inheritance relationship) to a formula-based setup, but seeing this topic I'm worried I'll bump into a bigger problem.

westurner commented 9 years ago

Salt Formulas work great without automated dependency resolution (formula dependency management).

Here's one way to do Salt Formulas in separate repos + GitFS:

* https://github.com/saltstack-formulas/salt-formula/blob/master/salt/formulas.sls

arnisoph commented 9 years ago

I'm going to implement https://github.com/bechtoldt/GatherGit/issues/3 in a few weeks which will address the ideas of this issue. If you have any further comments, let me know.

westurner commented 9 years ago

That would be cool.

For test cases, you might have a look at some of the:

And a start at a test framework for salt formulas:

arnisoph commented 9 years ago

@westurner salt formula testing is a completely different topic, I'll cover that in https://github.com/bechtoldt/formula-docs/issues/4 :)

westurner commented 9 years ago

@bechtoldt Some tests are probably appropriate? (e.g. 'compiles' w/o syntax error, [...])

Should this/these metadata/test skeletons be standard functionality of e.g. salt.formulas or copied into every formula?

An example metadata file in https://github.com/westurner/cookiecutter-saltformula could be helpful.

arnisoph commented 9 years ago

ouh, you mean testing the metadata itself? of course, this will be important.

westurner commented 9 years ago

where/how do I call e.g. check_formula_metadata('./path'), check_formula_'importable'('name')?

arnisoph commented 9 years ago

:+1:

arnisoph commented 9 years ago

The Salt Package Manager might be a solution for this issue in the future. I think it's still in a very early state. I'll file some feature requests.. :)

#24896 (PR)

#25210

#25211

https://docs.saltstack.com/en/develop/topics/spm/

rallytime commented 9 years ago

Good call @bechtoldt. Any additional thoughts about this ^^ @techhat?

j1n6 commented 8 years ago

Here's my thought. The reason we need dependency management is to be able to:

  1. Easily reproduce the formulas used in a code base
  2. Explicitly understand their original reference location
  3. Compare or track down changes to formulas
  4. Collaborate on both formula module development and large, complex formula deployments

There are many ways to implement this, the simplest way is to use git to begin with - like Golang community's Godep. There's a great advantage of this:

j1n6 commented 8 years ago

btw, spm looks great.

themalkolm commented 8 years ago

Are there any plans to have dependencies in spm?

aphor commented 7 years ago

This ticket has gone quiet, but it's still open so I'm unclear whether there's lack of consensus, or whether there's consensus that SPM has made the problem go away.

Can anyone comment on successes/failures using SPM to manage formula interdependencies?

techhat commented 7 years ago

SPM does have dependency management. An example formula might look like:

name: apache
os: RedHat, Debian, Ubuntu, Suse, FreeBSD
os_family: RedHat, Debian, Suse, FreeBSD
version: 201704
release: 1
summary: Formula for installing Apache
description: Formula for installing Apache web server
dependencies: zlib,pcre
optional: mod_perl
recommended: mod_ssl

If used, the dependencies field contains a list of SPM packages that must be installed before this one is, and SPM will attempt to install them at the same time. optional and recommended are currently informational only; no enforcement currently exists.
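
To illustrate how the comma-separated `dependencies` field might be consumed, here is a small sketch that reads a FORMULA-style file and extracts the packages that must be installed first. It uses a simplified line parser so that it stays self-contained; a real tool would more likely use a YAML loader. The function names are made up and this is not SPM's actual implementation.

```python
def parse_formula(text):
    """Parse flat 'key: value' lines, as in the FORMULA example above."""
    fields = {}
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def split_list(value):
    """Split a comma-separated field such as 'zlib,pcre' into a list."""
    return [item.strip() for item in value.split(",") if item.strip()]


formula_text = """\
name: apache
version: 201704
release: 1
dependencies: zlib,pcre
optional: mod_perl
recommended: mod_ssl
"""

fields = parse_formula(formula_text)
# Only 'dependencies' is enforced; the other two are informational.
must_install = split_list(fields.get("dependencies", ""))
informational = split_list(fields.get("optional", "")) + split_list(
    fields.get("recommended", "")
)
```
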

SPM has had this since last year, so I'm going to go ahead and close this. Of course, comments are still welcome, but if there are issues with dependencies, I would rather they go in a fresh ticket.

OrangeDog commented 4 years ago

Surely the formula should simply include states to manage its own dependencies?