Closed dstokes closed 8 years ago
@dstokes are you able to share your deploy.rollout?
{%- set env = pillar.get('environment', '') %}
{%- set revision = pillar.get('revision', False) %}
{%- set target = "G@ec2_apps:" + pillar.tgt + " and ( G@ec2_roles:webserver or G@ec2_roles:*worker* )" %}
build_application:
  salt.state:
    - tgt: {{ target }}
    - tgt_type: compound
    - sls: deploy
    {%- if revision %}
    - pillar:
        orchestrated: True
        deploy_revision: {{ revision }}
    {%- endif %}
reload_latest:
  salt.state:
    - tgt: {{ target }}
    - tgt_type: compound
    - sls: deploy.reload_latest
    {%- if revision %}
    - pillar:
        deploy_revision: {{ revision }}
    {%- endif %}
    - require:
      - salt: build_application
create_slack_deployment_notification:
  cmd.run:
    - name: <redacted>
    - require:
      - salt: reload_latest
Thanks for reporting this @dstokes, we'll look into it.
Is that {%- set env = pillar.get('environment', '') %} line the culprit here? I'm trying to narrow this down to something that we can start troubleshooting effectively.
Removing everything but:
build_application:
  salt.state:
    - tgt: test
    - sls: deploy
Doesn't fix the problem. Same error as above. There's not a new requirement on orchestration file location right?
Not that I'm aware of. :]
This should be enough to try and reproduce this, though. We'll start digging in.
We've been digging. It seems like we're seeing issue #5449, which is causing orchestration failures along with a slew of other state-related bugs on latest stable. Attempting to show_sls either fails for valid state files or loads the similarly named pillar file instead of the state file. Our roots config is as follows:
file_roots:
  base:
    - /srv/salt-states/states
pillar_roots:
  base:
    - /srv/salt-states/pillar
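To make the symptom concrete, here is a minimal sketch of how a dotted SLS name could be resolved against a set of roots. It is a toy model, not Salt's fileserver code: find_sls, sls_to_relpaths, and demo are hypothetical names, and the temporary directory layout only imitates the states/pillar split from the config above. It shows that if the lookup is handed pillar_roots in place of file_roots, a valid state either is not found or resolves to the same-named pillar file.

```python
import os
import tempfile


def sls_to_relpaths(name):
    """Candidate relative paths for a dotted SLS name (simplified)."""
    base = name.replace(".", os.sep)
    return [base + ".sls", os.path.join(base, "init.sls")]


def find_sls(roots, saltenv, name):
    """Return the first matching file under the roots for saltenv, or None."""
    for root in roots.get(saltenv, []):
        for rel in sls_to_relpaths(name):
            candidate = os.path.join(root, rel)
            if os.path.isfile(candidate):
                return candidate
    return None


def demo():
    """Build a throwaway states/pillar tree and resolve names against each."""
    with tempfile.TemporaryDirectory() as top:
        states = os.path.join(top, "states")
        pillar = os.path.join(top, "pillar")
        for root in (states, pillar):
            os.makedirs(os.path.join(root, "deploy"))
        # deploy/init.sls exists in both trees; deploy/rollout.sls only in states.
        for path in (
            os.path.join(states, "deploy", "rollout.sls"),
            os.path.join(states, "deploy", "init.sls"),
            os.path.join(pillar, "deploy", "init.sls"),
        ):
            open(path, "w").close()

        file_roots = {"base": [states]}
        pillar_roots = {"base": [pillar]}
        return {
            # Correct roots: the state file is found.
            "ok": find_sls(file_roots, "base", "deploy.rollout"),
            # Wrong roots: a valid state appears to be missing.
            "missing": find_sls(pillar_roots, "base", "deploy.rollout"),
            # Wrong roots: the same-named pillar file is returned instead.
            "wrong": find_sls(pillar_roots, "base", "deploy"),
        }


if __name__ == "__main__":
    for label, path in demo().items():
        print(label, "->", path)
```

Both failure modes described above (a valid state not being found, and a pillar file being loaded instead) fall out of the same lookup once the roots are swapped.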
I think we might be seeing something similar. In our dev environment (without gitfs) it works. In QA/Prod with gitfs it fails. When I throw the file in the filesystem like normal, it starts working in QA/Prod.
@dstokes are you using gitfs?
@iggy i am not
I am seeing something similar and it also looks like #5449, because I'm seeing salt reference the corresponding pillar file instead of a state file when running state.show_sls. But the original issue was with orchestration as well, just like this one.
We are using multiple environments, and possibly multiple file_roots and pillar_roots for each env. There are 3 entries for each env's file_roots, but these are the same 3 entries. The pillar_roots have a single entry for base, and the other envs have it as a fallback.
We're also using gitfs for several formulas, but it hasn't been a problem in the past. Only thing new is the multiple envs and *_roots.
@basepi any chance we can have a fix or a workaround anytime soon?
also, #16990 might be a duplicate
I ended up opening #19802 since I tracked down a workaround. I was using gitfs for everything and once I put copies of my orchestrate files on the actual filesystem (so, /srv/salt/backups/db/orchestrate.sls etc.) everything started working fine.
I'm trying to embed the formulas as well, but haven't managed to get it to work yet.
I am experiencing inconsistencies though - I did before as well - restarting master and minion seems to help some, but the effect doesn't stick in the long run.
Update:
The machine is its own master/minion and the minion config has the 'environment' set to something other than base.
restarting master helps in a way that may be useful for tracking it down:
I finally tracked it down, and it didn't have (much) to do with orchestration, environments or multi-roots.
My specific problem was that I was doing a jinja import in the pillar top.sls file - essentially factoring out some variables related to environments and orchestration.
Bottom line, that jinja import eventually puts the master into a buggy state where it serves the pillar_roots instead of the file_roots. Once I remove the import in pillar/top.sls and restart the master, all is good again.
Is orchestration working with gitfs?
@dstokes I tried again to replicate this and could not. What is the path to rollout.sls on your system?
@cachedout states/deploy/rollout.sls
I've hit the same problem (also on 2014.7.0). Calling, e.g.,
sudo -u salt salt-run state.orchestrate setup
currently fails with:
Data failed to compile:
----------
No matching sls found for 'setup' in env 'base'
Orchestration calls were actually working fine before I started putting files in /srv/pillar. By adding more logging messages to the code, I could confirm that the cause is file_roots in __opts__ being set to pillar_roots, so this is indeed related to #5449.
After some digging, I could fix the issue as follows:
diff -c orig/salt/pillar/__init__.py /usr/lib/python2.7/dist-packages/salt/pillar/__init__.py
*** 128,134 ****
          self.functions = functions
          self.matcher = salt.minion.Matcher(self.opts, self.functions)
-         self.rend = salt.loader.render(self.opts, self.functions)
          # Fix self.opts['file_roots'] so that ext_pillars know the real
          # location of file_roots. Issue 5951
          ext_pillar_opts = dict(self.opts)
--- 132,138 ----
          # location of file_roots. Issue 5951
          ext_pillar_opts = dict(self.opts)
          ext_pillar_opts['file_roots'] = self.actual_file_roots
+         self.rend = salt.loader.render(ext_pillar_opts, self.functions)
          self.merge_strategy = 'smart'
          if opts.get('pillar_source_merging_strategy'):
              self.merge_strategy = opts['pillar_source_merging_strategy']
Regarding the way the opts are bleeding out, this seems to be happening in class Loader (of salt/loader.py). Changing the line above instead to
self.rend = salt.loader.render(dict(self.opts, __marker=1), self.functions)
and then adding the following at the end of Loader's __init__
if '__marker' in opts:
    self.opts = dict(self.opts)
    del self.opts['file_roots']
also fixes the error.
This logic to override mod.__opts__ with self.opts, found in gen_module and gen_functions, should be the reason.
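For anyone following along, here is a minimal sketch of how that kind of override can leak. It is a toy stand-in and not Salt's Loader: ToyLoader, toy_state_module, and demo are hypothetical, and the only thing borrowed from the thread is the pattern of pushing the loader's opts into a module-level __opts__ dict. The paths are the ones from the roots config posted earlier.

```python
import types


class ToyLoader:
    """Hypothetical stand-in for a module loader; not Salt's Loader class."""

    def __init__(self, opts):
        self.opts = opts

    def gen_module(self, mod):
        # Pattern described in the thread: the loader pushes its own opts
        # into the module-level __opts__ dict.
        if hasattr(mod, "__opts__"):
            mod.__opts__.update(self.opts)
        else:
            mod.__opts__ = self.opts
        return mod


def demo():
    master_opts = {
        "file_roots": {"base": ["/srv/salt-states/states"]},
        "pillar_roots": {"base": ["/srv/salt-states/pillar"]},
    }
    # A module object that is loaded once and then reused.
    mod = types.ModuleType("toy_state_module")

    # First load: the module's __opts__ becomes the master opts dict itself.
    ToyLoader(master_opts).gen_module(mod)
    before = mod.__opts__["file_roots"]

    # Pillar-style opts: a shallow copy whose file_roots point at pillar_roots.
    pillar_opts = dict(master_opts)
    pillar_opts["file_roots"] = master_opts["pillar_roots"]

    # Second load of the same module with the pillar opts.
    ToyLoader(pillar_opts).gen_module(mod)
    after = mod.__opts__["file_roots"]

    # The update() mutated the shared dict, so master_opts changed as well.
    return before, after, master_opts["file_roots"]


if __name__ == "__main__":
    for value in demo():
        print(value)
```

In this toy the copy made for the pillar renderer does not help, because update() writes back into the dict the module already shares with the master opts.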
Finally, I suspect that the correct call may just be using the unmodified opts, i.e.
self.rend = salt.loader.render(opts, self.functions)
since we probably also want to avoid forcing file_client to 'local', etc.
Ideally the logic to override __opts__ could be avoided, though, since it feels like a source of unexpected behavior waiting to bite in another way, but it is beyond my knowledge of the codebase to suggest how.
Actually, changing the definition of self.rend either way breaks the master trying to render pillars for minions, so this does not seem like an option.
Right now I am restricting both of the mod.__opts__.update calls in the Loader class so that they do not override file_roots (in all cases, that is, not using the __marker). This fixes the original error without introducing any apparent problems.
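A rough sketch of what such a restricted update could look like, written as a standalone helper. The names update_opts_except and PROTECTED_KEYS are hypothetical and this is not the actual patch; it only illustrates the idea of copying every option except file_roots when the target already has one.

```python
PROTECTED_KEYS = frozenset({"file_roots"})


def update_opts_except(target, source, protected=PROTECTED_KEYS):
    """Copy source into target, keeping protected keys already set in target."""
    for key, value in source.items():
        if key in protected and key in target:
            # Leave the existing value alone, e.g. the real file_roots.
            continue
        target[key] = value
    return target


if __name__ == "__main__":
    module_opts = {"file_roots": {"base": ["/srv/salt-states/states"]}}
    pillar_opts = {
        "file_roots": {"base": ["/srv/salt-states/pillar"]},
        "file_client": "local",
    }
    print(update_opts_except(module_opts, pillar_opts))
```

With this shape, other options still flow through to the module while its view of file_roots stays fixed. Whether other keys need the same protection is an open question here.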
@dstokes, @somenick, @noegenesis - Is this still an issue for you?
Since there has been no activity for a while, I will close this issue
Rolling back to 2014.7.0rc2 fixes the issue. Still digging for a more descriptive error.