Closed chrismoos closed 7 years ago
+1
Questions:
Python packaging tools handle dependency graphs:
Conda packages solve for this with many languages:
:+1: Would be a great help in implementing a Continuous Delivery Process.
Just to mention: nodejs has a rather awesome npm system for uploading, managing, and using dependencies (from any source!). It even allows them to have their own sub-dependencies at their own specified versions, so they won't conflict with other things using a different version of the same dependency.
1+ for simple sub dependency chain, also something like npmjs.org registry for formulas would be nice.
Formula Dependencies
In lieu of a standard way to manage this (e.g. setup.py + pip with $VIRTUAL_ENV/src[/salt-formulas] on sys.path and/or GitFS and/or salt file_roots), an informal README.rst heading for "Formula Dependencies" may be helpful.
e.g. https://github.com/bechtoldt/iscdhcp-formula/blob/master/README.rst#formula-dependencies :
Formula Dependencies
====================
None
Namespacing
It may be easier to prefix/postfix things with <github-username>. e.g.:
https://github.com/salt-formula/salt-formula
salt-formula-salt-formula
https://github.com/westurner/salt-formula
westurner-salt-formula
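A minimal sketch of the naming rule above, assuming the namespaced name is simply `<github-username>-<repo>` taken from the repository URL. The function name `namespaced_name` is hypothetical and only for illustration:

```python
from urllib.parse import urlparse


def namespaced_name(repo_url):
    """Return '<github-username>-<repo>' for a GitHub repository URL."""
    # Keep only the non-empty path segments: ['<owner>', '<repo>', ...]
    parts = [p for p in urlparse(repo_url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"expected /<owner>/<repo> in {repo_url!r}")
    owner, repo = parts[0], parts[1]
    # Clone URLs often end in '.git'; drop the suffix if present.
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return f"{owner}-{repo}"


print(namespaced_name("https://github.com/salt-formula/salt-formula"))
print(namespaced_name("https://github.com/westurner/salt-formula"))
```

This prints `salt-formula-salt-formula` and `westurner-salt-formula`, matching the two examples listed above.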
Python Packages
Packaging salt formulas as Python packages with setup_requires/requirements.txt dependencies:
Tools
Caveats
[EDIT]
Branching
Strategies
trunk / master / default
master, develop, feature-*, release-* (tag), hotfix-
+1 for some sort of berkshelf or npm like dep management
+1
I like @westurner's plan of using Python packages. However, we also need to be able to include the installed formulas in salt easily.
I think these could be solved with a pypifs, which would be like gitfs, but for Python packages. So instead of specifying git repos, you would have a list with entries like
- westurner-salt-formula
- salt-formula-apache-formula
This would handle finding the packages on your system, and would also find and include all of the dependencies based on requirements.txt.
Thus by editing your salt master's config and restarting the salt master, you could have all of your formulas and not have to worry about dependencies at all.
Finally, you should be able to specify a version just like you would when using pip manually
- salt-formula-apache-formula==1.0.4
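As a rough illustration of how such entries could be read, here is a sketch that splits each entry into a package name and an optional pinned version. pypifs does not exist; `parse_entry` is a hypothetical helper, and only the `name` and `name==version` forms shown above are handled (not the full pip requirement syntax):

```python
import re

# name, optionally followed by '==<version>' (only the forms used above)
_ENTRY = re.compile(
    r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(?:==\s*([A-Za-z0-9.]+))?\s*$"
)


def parse_entry(entry):
    """Return (name, version) for an entry; version is None when unpinned."""
    match = _ENTRY.match(entry)
    if match is None:
        raise ValueError(f"unsupported entry: {entry!r}")
    return match.group(1), match.group(2)


entries = [
    "westurner-salt-formula",
    "salt-formula-apache-formula==1.0.4",
]
for entry in entries:
    print(parse_entry(entry))
```

A real implementation would more likely hand these strings to pip's own requirement handling rather than a hand-written pattern.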
What about something simple like using git submodules. Maybe Salt could even have some magic added that added the top level subdirs of the submodule to the top level path structure.
i.e.
graphite-formula---+----graphite----init.sls
                   |
                   +--- nginx-formula (submodule) --- nginx --- init.sls
                   |
and the graphite and nginx dirs get added to the top level salt dir (somehow, haven't really thought too much about that yet).
I think git submodules have several drawbacks compared to packaging.
Git commits do not hold the same semantic meaning that package releases do. For example, if you update a package with a bugfix, then you would have to go into all of the packages that depend on it and change their submodules. With versions you do not require such a specific version, just the same major version must match (unless you need features introduced in a minor version).
Shared dependencies would be duplicated.
I think packages could be handled more elegantly: No having to include .gitmodules, no having to initialize every time you clone, etc.
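The version rule described above (same major version, and at least the minor/patch level you need) can be sketched as follows. This assumes plain `major.minor.patch` version strings; `is_compatible` is a hypothetical name, not an existing Salt or pip function:

```python
def parse_version(text):
    """Split 'major.minor.patch' into a tuple of three integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def is_compatible(installed, required):
    """True if installed has the same major version as required
    and is at least as new in its minor/patch components."""
    inst = parse_version(installed)
    req = parse_version(required)
    return inst[0] == req[0] and inst[1:] >= req[1:]


print(is_compatible("1.0.5", "1.0.4"))  # bugfix release of the same major
print(is_compatible("2.0.0", "1.0.4"))  # different major version
```

With this kind of check, a bugfix release of a dependency is picked up without touching every formula that depends on it, which is the contrast with pinned submodule commits being drawn here.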
.gitmodules is worse than a dependencies.txt/SaltFile/whatever.yaml/etc somehow?
And there's nothing that says you can't have a script/shell alias/whatever that does the checkout -> submodule init (in place of pip/npm/etc).
And as far as having to change .gitmodules when you commit fixes, git supports branches for submodules. So maybe each formula has a branch for each upstream release (or just master if it's a fairly generic formula).
I honestly think this is a problem that doesn't need to be solved right now.
The -formulas have enough other problems that the landscape could be completely different by the time we get around to needing real dependency management.
I'm not saying having this discussion is pointless, but I don't think implementing something right now is prudent. And I think too much discussion on the topic takes something away from the real problems that the formulas have.
There is a real problem of developer bandwidth right now. Trying to shoehorn formula dependencies in right now when nobody really knows what formulas will eventually look like is a Bad Idea™
I agree that inside the formula, .gitmodules is equivalent to .requirements.txt. However, I would like to see a solution where formula users do not need to have a .gitmodules and have to configure gitfs to point to the submodules. Just change the salt config, not change the salt config and have other files hanging around.
Of course, having the packages and a gitfs like way to include them is a rather large change and as you said, there are more important problems in formulas at the moment.
When we do get to solving this problem, I just want to make sure that we do it right (whatever right ends up being).
Is it possible to pull a specific version with .gitmodules, or just a branch?
How do I avoid push -f'ing over a whole tree?
I'd like to use Python packages to manage the formulas, just like what python-xstatic[1] does.
E.g. there would be a package named nginx-salt-formula which can be installed through pip or easy_install.
There are several benefits to this.
+1 for using python packages.
Currently we have a hard enough time getting people to contribute their changes back. It's also difficult getting things merged for formulas that the couple people that can commit don't understand.
I'm worried that something along the lines of full pypi packages would make that even worse.
Not to mention the fact that formulas aren't even python code...
If you require strict formula ownership, I see the number of formulas plummeting.
Again, this is as things stand now. I think things will likely be different at some future time.
Very interesting discussion so far. Quick note about one remark:
the couple people that can commit
Everyone on the Contributors team should have full commit access on all formulas repos. I know a few of those have slipped through the cracks. If you notice one let me know and I'll add it under the team. On a related note, I have plans to toss a web interface up (soon as work-load permits) that will allow people on the Contributors team to create repos and fork repos into the org.
I more meant that people are reticent to commit changes to formulas they don't use (unless they seem like obvious changes). Making it more difficult for people to contribute at this point in time doesn't seem prudent.
FWIW, I've personally had very good response with getting my PRs committed.
I really believe that having the formulas reside in a Git repository somewhere is going to be the best. I agree with @iggy that doing pypi packages just raises the barrier to contribute higher. There is plenty of evidence that the model of forking a git repo to contribute has been very successful. It encourages people to make changes and to push them back upstream.
What's really needed is just a way to manage locating and fetching the formulas that you depend on. I don't think we have to say that We must use Git!, but instead be flexible with where formula dependencies can reside. Look at projects like CocoaPods, Bundler, and Berkshelf. They have some things in common like:
In addition, all of the aforementioned tools have been wildly successful at what they do and have really provided an easy way for people to collaborate and contribute.
CocoaPods has a central repo, kind of like Homebrew, which lists out the canonical list of all packages and metadata for each version. This gives you the ability to just specify a dependency with a simple name (and an optional version specifier). The central repository is a good one but obviously requires maintenance and people to manage pull requests from people wanting to add their packages to the official list.
Bundler and Berkshelf also have central listings of packages, albeit a bit different than CocoaPods.
I propose the following high level idea:
Obviously there is a lot to spec out, but my 2 cents is that the above is the way to go, not pypi.
setuptools along with pip/peep already allows for everything listed above, Salt really doesn't have to reinvent the wheel (heh). And you don't necessarily have to make people upload anything to PyPI: just make a custom index that generates package entries from the currently existing GitHub organisation.
An instance came up just yesterday with dependencies in a formula that I help maintain. To my knowledge it's the first formula to specifically list any dependencies. This particular instance is the aptly formula and it says it depends on the nginx formula.
The thing is, it doesn't depend on the nginx formula. It depends on a salt state module called nginx (we have our own nginx state module rather than using the formula).
So if this had been codified already and the aptly formula had a hard dependency on the nginx-formula, we wouldn't have used it (or more likely I would have cloned the code, gutted the nginx dependency, fixed all the issues it had, and never bothered to contribute my fixes back upstream).
Just a use-case to keep in mind.
P.S. I'm still not sold on pypi being open to a bunch of packages that are going to contain virtually no python code. All over the pypi site it specifically says it's for python packages, modules, and apps. Not a bunch of yaml code with some jinja sprinkled in.
Python Packages
I put these notes together about python packages, in general: https://westurner.github.io/dotfiles/tools.html#python-packages
Examples of including package_data in Python packages:
python setup.py sdist (setuptools) includes files listed in MANIFEST.in
You can generate MANIFEST.in from the repository manifest:
git ls-files | sed 's/\(.*\)/include \1/g' > MANIFEST.in
You can add commands to setup.py:
# setup.py
import subprocess

# Import setuptools first: the stdlib distutils was removed in Python 3.12,
# and setuptools supplies a replacement.
import setuptools
from distutils.command.build import build as DistutilsBuildCommand


def generate_manifest_in_from_hg():
    """Generate MANIFEST.in from 'hg manifest'"""
    print("generating MANIFEST.in from 'hg manifest'")
    cmd = r'''hg manifest | sed 's/\(.*\)/include \1/g' > MANIFEST.in'''
    return subprocess.call(cmd, shell=True)


def generate_manifest_in_from_git():
    """Generate MANIFEST.in from 'git ls-files'"""
    cmd = r'''git ls-files | sed 's/\(.*\)/include \1/g' > MANIFEST.in'''
    return subprocess.call(cmd, shell=True)


class RunCommand(setuptools.Command):
    """Base class for simple commands that take no options."""
    user_options = []
    description = "<TODO>"

    def initialize_options(self):
        pass

    def finalize_options(self):
        pass

    def run(self):
        print(self.__class__.__name__)


class GitManifestCommand(RunCommand):
    """Generate MANIFEST.in from $(git ls-files)"""
    description = __doc__

    def run(self):
        generate_manifest_in_from_git()

# ...


class DotfilesBuildCommand(DistutilsBuildCommand):
    """re-generate MANIFEST.in and build"""
    description = (
        "update MANIFEST.in AND " + DistutilsBuildCommand.description)

    def run(self):
        generate_manifest_in_from_git()
        DistutilsBuildCommand.run(self)


# Pass this mapping to setup(cmdclass=...):
# setup(
cmdclass={
    'git_manifest': GitManifestCommand,
    'build': DotfilesBuildCommand,
}
# )
And then
python setup.py git_manifest
python setup.py build # calls generate_manifest_in_from_git() before building
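For what it's worth, the `git ls-files | sed ...` pipeline can also be written without a shell. A small sketch, assuming `git` is on PATH; the helper names (`list_git_files`, `manifest_in_text`, `write_manifest_in`) are hypothetical and not part of setuptools:

```python
import subprocess


def list_git_files(repo_dir="."):
    """Return the paths reported by `git ls-files` in repo_dir."""
    result = subprocess.run(
        ["git", "ls-files"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


def manifest_in_text(paths):
    """Build MANIFEST.in content: one 'include <path>' line per path."""
    return "".join(f"include {path}\n" for path in paths)


def write_manifest_in(paths, target="MANIFEST.in"):
    """Write the generated include lines to the target file."""
    with open(target, "w", encoding="utf-8") as handle:
        handle.write(manifest_in_text(paths))


# Example with a fixed list of paths (no git call needed):
print(manifest_in_text(["README.rst", "nginx/init.sls"]), end="")
```

Inside a repository, `write_manifest_in(list_git_files())` would do the same job as the shell one-liner, and avoids `shell=True`.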
Hosting salt formula python packages
How is git insufficient?
Version Strings
major.minor.patch-<gitcommitid>, in order to make diff-ing easy.
Testing
Should there be a requirement that each formula can be tested with a standard interface and/or convention?
Something like ./tests/__init__.py in each formula?
Or should that just be functionality provided in salt core?
Documentation in the README?
For merging each salt-formula into one major (git) repository (as I think @chrismoos is describing):
I don't know how to do this with hg; though I'm sure there's a way. The immutability of hg has always been a selling point for me.
There is another big issue in the salt-formula repositories: there is little version management in the current formulas. With only a master branch, a formula may change at any time and cause problems, which makes the formulas dangerous to use in a production environment.
I think this is a big issue which will block the re-use of formulas.
+1 for using Python packages
I'm thinking about extending https://github.com/bechtoldt/vcs-gather with dependency resolution support for SaltStack formulas and Puppet modules. The metadata.json file from Puppet (https://docs.puppetlabs.com/puppet/latest/reference/modules_publishing.html#write-a-metadatajson-file) could be acceptable for it.
@bechtoldt https://github.com/westurner/pyrpo (pyrpo -s . -r sh) and/or pypi:vcs and/or https://github.com/conda/conda/tree/master/conda (http://conda.pydata.org/docs/#requirements (pycosat) may be useful).
Conda packages have a meta.yaml file. https://github.com/conda/conda-recipes/blob/master/requests/meta.yaml
Python packages have a pydist.json (PEP 426)
very useful info, is any traction being put on this for next Salt release? Asking as i'm at the point where i want to move from states (where i have parent-child/ inheritance relationship) to formula based but then seeing this topic i'm worried i'll bump into a bigger problem.
Salt Formulas work great without automated dependency resolution (formula dependency management).
Here's one way to do Salt Formulas in separate repos + GitFS:
* https://github.com/saltstack-formulas/salt-formula/blob/master/salt/formulas.sls
I'm going to implement https://github.com/bechtoldt/GatherGit/issues/3 in a few weeks which will address the ideas of this issue. If you have any further comments, let me know.
That would be cool.
For test cases, you might have a look at some of the:
And a start at a test framework for salt formulas:
@westurner salt formula testing is a completely different topic, I'll cover that in https://github.com/bechtoldt/formula-docs/issues/4 :)
@bechtoldt Some tests are probably appropriate? (e.g. 'compiles' w/o syntax error, [...])
Should this/these metadata/test skeletons be standard functionality of e.g. salt.formulas or copied into every formula?
An example metadata file in https://github.com/westurner/cookiecutter-saltformula could be helpful.
ouh, you mean testing the metadata itself? of course, this will be important.
where/how do I call e.g. check_formula_metadata('./path'), check_formula_'importable'('name')?
:+1:
The Salt Package Manager might be a solution for this issue in the future. I think it's still in a very early state. I'll file some feature requests.. :)
Good call @bechtoldt. Any additional thoughts about this ^^ @techhat?
Here are my thoughts. We need dependency management in order to: 1) easily reproduce the formulas used in a code base; 2) explicitly know each formula's original reference location; 3) compare or track down changes to formulas; 4) collaborate on both formula module development and large, complex formula deployments.
There are many ways to implement this, the simplest way is to use git to begin with - like Golang community's Godep. There's a great advantage of this:
btw, spm looks great.
Are there any plans to have dependencies in spm?
This ticket has gone quiet, but it's still open so I'm unclear whether there's lack of consensus, or whether there's consensus that SPM has made the problem go away.
Can anyone comment on successes/failures using SPM to manage formula interdependencies?
SPM does have dependency management. An example formula might look like:
name: apache
os: RedHat, Debian, Ubuntu, Suse, FreeBSD
os_family: RedHat, Debian, Suse, FreeBSD
version: 201704
release: 1
summary: Formula for installing Apache
description: Formula for installing Apache web server
dependencies: zlib,pcre
optional: mod_perl
recommended: mod_ssl
If used, the dependencies field contains a list of SPM packages that must be installed before this one is, and SPM will attempt to install them at the same time. optional and recommended are currently informational only; no enforcement currently exists.
SPM has had this since last year, so I'm going to go ahead and close this. Of course, comments are still welcome, but if there are issues with dependencies, I would rather they go in a fresh ticket.
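To illustrate the "dependencies are installed before this one" behaviour described above, here is a sketch that reads the `dependencies` field and computes an install order with a topological sort. This is not SPM's implementation: `parse_formula`, `dependency_names`, and `install_order` are hypothetical helpers, the parser only handles flat `key: value` lines rather than full YAML, and it needs Python 3.9+ for `graphlib`:

```python
from graphlib import TopologicalSorter

FORMULA_TEXT = """\
name: apache
version: 201704
release: 1
dependencies: zlib,pcre
"""


def parse_formula(text):
    """Parse flat 'key: value' lines into a dict (not a full YAML parser)."""
    meta = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta


def dependency_names(meta):
    """Split the comma-separated 'dependencies' field into package names."""
    return [n.strip() for n in meta.get("dependencies", "").split(",") if n.strip()]


def install_order(formulas):
    """Return package names ordered so that dependencies come first."""
    graph = {name: dependency_names(meta) for name, meta in formulas.items()}
    return list(TopologicalSorter(graph).static_order())


formulas = {"apache": parse_formula(FORMULA_TEXT)}
print(install_order(formulas))
```

In the printed order, `zlib` and `pcre` both come before `apache`. `TopologicalSorter` raises `CycleError` if two packages depend on each other, which is one of the cases a fresh ticket about dependency issues might cover.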
Surely the formula should simply include states to manage its own dependencies?
I think there should be a standard way to manage formula dependencies. It is very natural for a state to be composed of other states, but when using third party formulas there is no easy way to manage the dependencies, versions, etc.
A popular tool for Chef is Berkshelf. You put a file in your cookbook root (or state tree root in Salt's case) like this:
There is a CLI command so you can fetch and install all of the dependencies, for example.
I propose we do the same thing for Salt's formula:
Saltfile - This will contain the dependencies, sources, etc.
Saltfile.lock - This will contain all of the active/installed dependencies and their versions.
There will be a tool that will fetch and install the dependencies, just like the berkshelf command.
I think that having this feature is really important especially as your state tree gets more advanced and you start pulling in third party formula.
Example Saltfile: