Open Mekk opened 8 years ago
On performance; We were facing performance issues on one of our weaker nodes with a fair amount of Python packages which essentially was caused solely because of pip freeze being called repeatedly to check for present installations and so on for each package to install.
In order to resolve this I wrote a small wrapper around the stankevich-python and a custom fact.
Important to note is that this was for a masterless Puppet installation (pretty much like this I guess https://www.digitalocean.com/community/tutorials/how-to-set-up-a-masterless-puppet-environment-on-ubuntu-14-04), I have no clue if this has any impact on whether or not the solution is worthwhile for other installations.
Fact:
# Extracts python modules and puts them into a hash
# containing the name and version
modules = Hash.new
if Facter::Util::Resolution.which('pip')
pip_output = `pip freeze`
python_modules = pip_output.split("\n")
python_modules.each { |package|
parts = package.split("==")
modules[parts[0].downcase] = parts[1]
}
Facter.add('python_modules') do
setcode do
modules
end
end
end
Note that this doesn't execute if it can't find pip.
Here's a small sample from the resulting manifest, showing how the fact was used:
define install_python_module ($ensure) {
if $python_modules {
case $ensure {
'present': {
unless $title in $python_modules {
python::pip { $title: ensure => 'present' }
}
}
'latest': {
python::pip { $title: ensure => 'latest' }
}
'absent': {
if $title in $python_modules {
python::pip { $title: ensure => 'absent' }
}
}
default: {
# Deal with versions here
#
# I think the following snippet should work but I've not tested it
#
# if $title in $python_modules and $ensure != $python_modules[$title] {
# python::pip { $title: ensure => $ensure }
# }
notice("Version handling is not implemented yet")
}
}
}
else {
python::pip { $title: ensure => $ensure }
}
}
$present_packages = [ 'requests', 'pyyaml','validate-email', 'pycrypto', 'mock','iso8601']
install_python_module { $present_packages:
ensure => "present"
}
Note the check for whether or not the python_modules variable is not undef due to the above mentioned dependency on pip to set up the fact, not the most glorious solution I've come up with but it works.
Maybe this is something the module could continue building upon. Perhaps it works the current way because of cross-compatibility issues or something similar. In my specific use case on Ubuntu it did however lower the execution time of repeated puppet apply down to a few seconds down from one or two minutes. Obviously first time setups are still slow but that doesn't affect our production environment.
+1 I'm running puppet masterless on low-performance embedded systems and these pip freezes take up well over 50% of a run where nothing needs to change.
To follow up on @Deus-Vult workaround above, here's a python external structured fact that you can put in a module (eg. \<module>/facts.d/pip.py):
#!/usr/bin/env python
import subprocess
modules = {}
pipoutput = subprocess.Popen(["pip", "freeze"], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
pips = pipoutput.communicate()[0].split()
for pip in pips:
[package,version] = pip.split("==")
modules[package)] = version
print "python_modules="+str(modules)
Note older versions of puppet need to set this in puppet.conf [main]:
stringify_facts = false
+1
Summary
I faced permission problems which in the end turned out to be caused by
python::pip
callingpip freeze
callinghg showconfig
on modules unrelated to the installation in charge. While problems as such are specific to my somewhat specific way of using puppet, they can be easily resolved, and fixes make things working much faster in the general case.Appetizer
Permission problem
I use some Puppet manifest to support managing my development environment. It's light case, without any puppet master, with Puppet serving as a helper just to ensure various packages are installed. So I do things like
I agree that this is not perfectly safe way of doing things, but that's also not a production…
With
python::pip
installed packages it used to work ……… until I performed some manualpip install --user --edit
(on completely unrelated modules). Since then, Puppet manifests started to ignorepython::pip
modules, and debugging output showed up info like:(here I am
python::pip
-ing cram which is in no way related to locally installedextension_utils
)In the end it turns out that things works so:
python::pip
usespip freeze | grep Package
(or| grep Package==X.Y.Z
) to check whether package is installedpip freeze
scans all locally installed packages (and that's my fault that it scans packages installed on my local account, assudo
does not clear environment well enough, and I wouldn't open this ticket unless there were other reasons to avoid calling it)pip freeze
intends to emit exact specification of how to reinstall any and all of those packages, for every one of them it looks for best way of how to specify the dependency-e hg+ssh://master/mercurial/extension_utils@f90a7710d6b2cba2cce9dbe61e5af9f7a33e52c3#egg=mercurial_extension_utils-default
hg
(orgit
if it were git repo) to ask it for various details (aforementionedhg showconfig paths.default
prints default repository remote, which in typical case is the url repo was cloned from), and this commands fails due to my sudocram
) is installed and in which version.Suggested fix
Simpler but less effective version: swapping
pip freeze
topip list
and patching$grep_regexp
to accomodate different syntax (pip list
emits things likecram (0.6)
instead ofcram==0.6
emitted bypip freeze
resolves my permission issue but does not offer noticeable performance gains (btw, for locally installed dirs the output is much more useful, for examplemercurial-extension-utils (1.0.0, /home/marcink/DEV_hg/mercurial/extension_utils)
). To get that fix I simply replacedfreeze
withlist
and edited the regexp this waypip show PKGNAME
instead, see appetizer above for explanation why.unless
is needed at all. Blindly callingpip install PkgName==Version
may be faster than checking whether this call is needed. For example, when packagecram
is alredy installed at version 0.6, I have:so, as we can see the actual cost of checking whether pip install is needed is exactly equal to just calling it.