Closed mhakala closed 8 years ago
Maybe something like https://github.com/beefsack/git-mirror could be used to:
Specific suggestions to reduce load on github are most welcome.
Why not, can be tested. If it works then why not.
If we would use git mirrors then in the requirements.yml files we can change the URL to http://{{ install_node }}/gitmirror/account/ansible-role-nhc or some such
Started work on this at https://github.com/jabl/ansible-role-gitmirror (just the bare skeleton yet).
Well, https://github.com/jabl/ansible-role-gitmirror should now work based on some limited testing. It downloads the git-mirror release tarball, unpacks it and copies the binary to a suitable location, creates a systemd unit file, creates a user&group for running the daemon, a git-mirror config file containing repos to mirror and then finally starts the whole shebang.
The remaining issue is how to do the deployment/bootstrapping in a sensible way. Initially, we need the current requirements.yml in order to pull in the required repos (including ansible-role-gitmirror!). But then after the install node is installed, we want to switch to using the mirror (assuming we want to run the mirror on the install node). So maybe we need to create some requirements_mirror.yml or something like that? In the worst case we'd have to keep the repos we use up to date in 3 places: requirements.yml, requirements_mirror.yml, and the group_vars that ansible-role-gitmirror uses when generating the git-mirror config file. Ideally we should somehow keep the list of repos we want in a single location, but how to do that in a good way? Any suggestions?
One solution would be to just keep the repos in a single location for compute nodes, namely group_vars. 1) ansible-playbook install.yml could get the repos from there and set up the mirror on the install node based on this information. 2) ansible-playbook install.yml could also write /var/www/html/requirements.yml (to be used with ansible-pull) based on the repos in group_vars. 3) Finally, ansible-pull-script.sh could be modified to use the requirements.yml on the install node.
The other nodes (admin and install) will need a separate file to bootstrap everything.
Mikko Hakala mikko.h.hakala@gmail.com 045 - 678 9757
Sounds good. I was toying with the idea of creating the vars for the mirroring by parsing the requirements.yml but it seems difficult.
So two lists: requirements.yml and group_vars?
Is it possible to set a version (tag/commit) on each repo? ansible-galaxy takes care of that.
Pushed a suggestion to a branch in the fgci-install role: https://github.com/CSC-IT-Center-for-Science/ansible-role-fgci-install/commit/e263f9c3826872852139588fd5fc7c2b6a287f27
Thoughts?
It copies requirements.yml to the install node and then replaces all instances of https://github.com with http://pull_install_ip. Top of the requirements.yml looks like below. I guess we could change defaults so that they look in http://10.1.1.2/gitmirror/ ?
/edit: Just now noticed that git-mirror serves the mirrors too - it runs its own web server. I guess we could change httpd.conf on install to proxy the traffic or just point clients to http://10.1.1.2:8080/
Now to make the mirrors, it would be really nice if one could convert requirements.yml into a git-mirror config file. Any ideas for how to do that?
---
- src: http://10.1.1.2/resmo/ansible-role-ntp
  path: roles
  version: 0.4.0
- src: http://10.1.1.2/CSC-IT-Center-for-Science/ansible-role-fgci-install
  path: roles
# called -2 because it replaces another role called ansible-role-yum-cron
- src: http://10.1.1.2/jeffwidman/ansible-yum-cron
  path: roles
  name: ansible-role-yum-cron-2
  version: 9d587da913eaa82349e86b4fb9d691818538963b
Yeah, git-mirror runs its own web server (on port 8080 by default). Another thing with the urls is that it encodes the hostname in the path. So e.g. http://github.com/foo/bar.git becomes http://{{ install_ip }}:8080/github.com/foo/bar.git, although this can be changed in the config file (see the "name" directive).
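To make the URL mapping concrete, here is a minimal Python sketch of it. The function name mirror_url and its parameters are made up for illustration, and it assumes the default behaviour described above (hostname encoded in the path, port 8080, no "name" override in the config).

```python
from urllib.parse import urlsplit


def mirror_url(origin, install_ip, port=8080):
    """Map an upstream repo URL to the URL the mirror would serve it under.

    Assumes the default layout: http://<install_ip>:<port>/<upstream host>/<path>.
    """
    parts = urlsplit(origin)
    # parts.netloc is the upstream hostname, parts.path starts with "/"
    return f"http://{install_ip}:{port}/{parts.netloc}{parts.path}"


print(mirror_url("http://github.com/foo/bar.git", "10.1.1.2"))
```

If a repo is given a custom "name" in the git-mirror config, the path part would differ, so this only covers the default case.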
And, it should be possible to parse the requirements.yml and then generate a git-mirror config.toml as well as the requirements_mirror.yml from that. If nothing else, there's the python yaml parser which I guess ansible itself uses. But I have no idea how baroque the yaml parsing api is..
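A rough sketch of what that generation step could look like, not the code that was eventually merged. It works on the list of dicts that a YAML parser (e.g. PyYAML's yaml.safe_load) would return for requirements.yml, so the parsing and dumping steps are left out. The function name build_mirror_files is hypothetical, and the TOML layout ([[repo]] tables with an Origin key) is my reading of the git-mirror README, so double-check it against the version you deploy.

```python
from urllib.parse import urlsplit


def build_mirror_files(requirements, install_ip, port=8080):
    """Return (git-mirror config text, requirements entries pointing at the mirror).

    `requirements` is the parsed requirements.yml: a list of dicts with at
    least a "src" key. The TOML layout is an assumption, see the note above.
    """
    toml_lines = []
    mirrored = []
    for entry in requirements:
        origin = entry["src"]
        parts = urlsplit(origin)
        # One [[repo]] table per upstream repository
        toml_lines += ["[[repo]]", f'Origin = "{origin}"', ""]
        # Copy the entry so path/name/version are kept, only src changes
        new_entry = dict(entry)
        new_entry["src"] = f"http://{install_ip}:{port}/{parts.netloc}{parts.path}"
        mirrored.append(new_entry)
    return "\n".join(toml_lines), mirrored


example = [
    {"src": "https://github.com/resmo/ansible-role-ntp", "path": "roles", "version": "0.4.0"},
]
config_text, mirror_reqs = build_mirror_files(example, "10.1.1.2")
print(config_text)
print(mirror_reqs)
```

The second return value could then be dumped back to YAML to produce requirements_mirror.yml, which keeps requirements.yml as the only hand-edited list.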
Another idea would be to have another role (be it ansible-role-fgci-install or e.g. ansible-role-gitmirror-fgciconfig or such) that would define the repos in defaults/main.yml, and then from that it should be easy to generate the requirements.yml that uses the mirror with a jinja2 template. A question though: can one role pick up defaults from another role? That is, if we have the mirrors defined in ansible-role-gitmirror-fgciconfig, will ansible-role-gitmirror pick them up and generate a config file with all the repos, or will it use its own example config from its own defaults/main.yml? Using group_vars for this isn't that good, because then we'd have to rely on every fgci site making the changes themselves, or else stuff will mysteriously start breaking..
Edit: OK, so basic usage of PyYAML is pretty simple. Let's see if I can cook something up..
Roles in the same playbook can use each other's default variables.
Ok, these 2 pulls should do it:
https://github.com/CSC-IT-Center-for-Science/ansible-role-fgci-install/pull/5 https://github.com/CSC-IT-Center-for-Science/fgci-ansible/pull/101
Everything is sort of tested individually, but not together so there might be some more-or-less trivial bugs left.
Things are merged. Is the ansible pull traffic now only internal?
After this change it looks like the ansible-pull script is calling 10.1.1.2 and grabbing roles from there, so that looks good. I had to run "ansible-playbook install.yml -t gitmirror,fgci-install" to get all the updates applied, and after an ansible-pull run or two things look quite nice.
Big thanks to everybody involved!
All the git cloning/pulling etc., yes. Though there is still some connecting to the external world, check e.g. with
strace -econnect -f /usr/local/bin/ansible-pull-script.sh 2>&1 |grep connect|grep -v AF_LOCAL|grep -v 10.10.254.20
(replace 10.10.254.20 with your pull_install_ip).
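If the grep pipeline gets unwieldy, the same filtering can be sketched in Python. This is only an illustration: external_hosts is a made-up helper, the sample lines are invented to resemble strace output rather than captured from a real run, and only IPv4 connect() calls are handled.

```python
import re

# Matches IPv4 connect() calls as strace prints them, capturing the address
ADDR_RE = re.compile(r'connect\(.*sin_addr=inet_addr\("([0-9.]+)"\)')


def external_hosts(strace_output, internal_ip):
    """Return the sorted set of IPv4 addresses connected to, minus internal_ip."""
    hosts = set()
    for line in strace_output.splitlines():
        match = ADDR_RE.search(line)
        # AF_LOCAL lines have no sin_addr and simply do not match
        if match and match.group(1) != internal_ip:
            hosts.add(match.group(1))
    return sorted(hosts)


# Invented sample lines, for illustration only
sample = '''\
[pid 4242] connect(3, {sa_family=AF_INET, sin_port=htons(8080), sin_addr=inet_addr("10.10.254.20")}, 16) = 0
[pid 4243] connect(4, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
[pid 4244] connect(5, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("192.0.2.10")}, 16) = 0
'''
print(external_hosts(sample, "10.10.254.20"))
```

As with the grep version, replace 10.10.254.20 with your pull_install_ip.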
That being said, ansible-pull-script now runs a lot faster than before, since cloning everything from GitHub was really slow, so I'm not sure it's worth spending a lot of time chasing what's left over.
Edit: Seems the culprit is the line
/usr/bin/ansible-pull -s $time -U http://10.10.254.20:8080/github.com/CSC-IT-Center-for-Science/fgci-ansible.git -C production -i /root/hosts
in ansible-pull-script.
Currently, executing ansible-pull-script.sh fetches the whole repository of roles with "git clone". This has the drawback of being slow and flooding GitHub (especially if there is no cache). Can this behaviour be replaced? E.g. generate the latest repository as a tar.gz, or provide some cached location for the entire FGCI consortium.