Closed gpkvt closed 9 years ago
@gpkvt Thanks for the report. It seems to me that your first state indicates that no changes were made, while your second state indicates that changes will be made. Are you running the command with a test=True
? What do your states look like? What command are your running to execute the states?
I confirm this -- I see it on any state that uses file.replace
. Running a state with test=True
produces, for example,
ID: deadline-scheduler
Function: file.replace
Name: /etc/default/grub
Result: None
Comment: No changes were made
Started: 18:31:06.400215
Duration: 1.372 ms
Changes:
Note no changes are listed. Running the state "for real" without test=True
, the ID is not listed, and no changes are made (as expected). This is a regression against 2014.1.0
@rallytime Indeed it was run with test=True
. But the cronjob is in place and would not be updated when you run the state without test=True
.
The state looks like this, all other cron.present
-States were affected as well, cron.absent
worked fine:
/usr/local/bin/ilias_cron.sh 2>/dev/null:
cron:
- present
- user: root
- minute: 55
- require:
- file: /usr/local/bin/ilias_cron.sh
It was executed by salt '*' state.highstate test=True
followed by salt '*' state.highstate
which produced the following result:
ID: /usr/local/bin/ilias_cron.sh 2>/dev/null
Function: cron.present
Result: True
Comment: Cron /usr/local/bin/ilias_cron.sh 2>/dev/null already present
Started: 20:25:42.807124
Duration: 126.945 ms
Changes:
Ah ok, I get it now. Thanks for the follow-up. So it looks like both of those states have a bug in them where when the states are run with test=True.
To summarize, the bug is that when both of these states are run with test=True
, they are reporting Result: None
which indicates that changes will be made when the state is run without test=True
. However, if changes are to take place, Comment
should report that changes will be made and Changes
should report what change will occur. If not changes are to be made, the Result
should be True
. This fix should also ensure that the states do not actually execute when test=True is applied.
I can also recreate this issue. I think the severity of this could actually be higher. It's fairly common to restart processes after file.replace
, which happened in my case.
@rallytime Just curious, why is this issue in "Blocked" milestone? What information do you need?
Let me know if you need anything.
+1 for higher severity. Having processes bounce themselves because of detected changes that didn't actually occur is bad news.
Sorry for the late reply here, @JensRantil. There isn't a "Blocked" milestone on this anymore. It was blocked when we were waiting for more info, but I switched it to "Approved" when I added the bug/severity/etc labels.
Bumping the severity now.
Could there even be a hotfix for this one, perhaps?
This may have been introduced with #12294 and fixed by dfa47a98
Could somebody else please verify that this the correct fix for the file.replace
bug and we'll backport it.
@cachedout I can test this. Just making sure, should it be applied on master or minion?
@JensRantil Minion
@JensRantil Minion
Cool. Rolled this out using the _states
directory from master. Made a change to the state comments just to make sure my execution was actually using the updated module.
This may have been introduced with #12294 and fixed by dfa47a9. Could somebody else please verify that this the correct fix for the file.replace bug and we'll backport it.
I can confirm that dfa47a9 file.replace
now correctly reports No changes needed to be made
when in fact it is so. I can also confirm that Changes would have been made
also is reported correctly when there are such. I ran these tests using salt minion state.highstate test=True
. Let me know if you'd like me to test anything else.
Also, it looks like the color of the patch now is in green (see attached screenshot). It used to be yellow along with other changes. Is that a separate issue or related?
Cool, glad the functionality is fixed. I think the colours should be a separate issue if you wouldn't mind filing it as one. If you're satisfied with where we are on this one, I'll close this one. Thanks!
:+1: Filed a separate issue about the coloring here: https://github.com/saltstack/salt/issues/19958
Just noticed this didn't make it into 2014.7.1
. Will it make it into 2014.7.2
? I believe it's an important fix to get in there.
@JensRantil It looks like that fix was made against the develop branch, and not the 2014.7 branch. I'll look at backporting #15701 so it will be available in the 2014.7.2 release.
Awesome, @rallytime! Thanks!
I don't know if it's the right place to report this but feel free to direct me otherwise (private mail to Joe Healy maybe, who maintains the Debian packges ?).
With the new Debian package 2014.7.1+ds-1 update (not the one in the official Debian repos, but the one in the official Salt repos), not only this problem wasn't fixed, but now it also happens on file.managed (in addition to file.replace). Please fix this ASAP since all those false positives highlighted in yellow, whereas no change would actually be made, make test=True pretty useless.
Hello,
regressions are painful.
Is the patch at https://github.com/saltstack/salt/pull/15701/files all that is required? to fix both the file.replace issue and file.managed issue or are they separate?
If there is a simple patch, I'm happy and able to do another set of packages tonight AU time (ie next few hours).
Cheers,
Joe
Sorry, I can't answer that. All I can say is that for weeks this happened to file.replace, and now with the latest package update it happens to file.managed too.
@MoonSweep The fix in question was not included in the 2014.7.1 release, so this isn't actually a packaging issue. This bug wasn't included in the 2014.7 branch until after 2014.7.1 was cut. It will be included in the 2014.7.2 release though with this change #20154.
Now, I haven't tested that this fix addresses file.managed, since the original issue was only for file.replace. Can you open a new issue so we can make sure that this gets addressed on its own?
@joehealy You don't need to worry about starting new packages for this issue. This particular fix will be in 2014.7.2, which should come out MUCH faster than 2014.7.1.
Done: https://github.com/saltstack/salt/issues/20220
As for the Debian packages, an update with the backported fix(es) without waiting for 2014.7.2 to be released would still be much appreciated, because Salt is currently pretty unusable as is.
Thanks for filing that new issue, @MoonSweep. As for whether or not those fixes should be backported and re-packaged, I will leave that up to @joehealy. However, please note that we should have 2014.7.2 released with in the two of weeks or so.
In general, if there is a clear and simple patch that fixes a critical issue, I'm happy to do a new package release including the patch. I get a bit reluctant when the issue is difficult to reproduce or test, the fix is not clear or there is a simple workaround.
In this case, given it is a state that is "faulty", you could put a fixed version into /srv/salt/_states/file.py and sync that out to the minions. Given this is possible, I'm actually pretty reluctant to do a new package release just for this.
Good call @joehealy. I should have suggested that earlier, but didn't think to. Thanks!
Sorry, but I'd qualify this workaround as really dirty because it would "pollute" the filesystem of the master and all minions with a file that doesn't belong to any package, and as I understand it, would supersede not only current file.py, but future versions as well, unless I remember to remove these copies when the packages are upgraded to 2014.7.2. This completely defeats the very purpose of having a packaging system in the first place. For me, it's unacceptable, so I'll just wait until 2014.7.2 is packaged and suspend all sls-writing activity until then.
FWIW, when I started using Salt some month ago, I was very enthusiastic because it's actively developped and the community is very responsive; but after using it all this time, I'm thinking more and more about choosing another configuration management software. With all the time I invested in learning Salt skills it would be a PITA, but if the general QA doesn't improve soon, I won't have any choice.
Don't get me wrong, I'm not telling this to anger you or mock you or something, but if you keep this mindset, Salt will always have this amateurish feel that currently prevents it to reach the big market. For now I use it at home to manage half a dozen hosts and VMs but in its current state I wouldn't take the risk to use it at work to manage hundreds of hosts.
@MoonSweep You might be misunderstanding how syncing custom types works. Any files synced from _states, _modules, etc. are synced within the minion cachedir (/var/cache/salt/minion/....) and used instead of the version installed by the package. They don't replace the files installed by the package, so this would not qualify as "polluting" the filesystem to me. Otherwise, any cached files, not to mention the log files created by Salt, would also have to be considered filesystem pollution.
@terminalmage hit it on the head.
And correct me if I'm wrong, but all we're talking about here is a rogue None
in a test=True
run, right? I'm struggling to see why that would cause a need to "suspend all sls-writing activity" anyway.
In any case, we're sorry that your bugfix didn't make it into 2014.7.1, but were we to never package until all the bugs were fixed, we would never get any releases out. We're hoping to have 2014.7.2 out in the next week or two, and it will include this fix. Sorry for the inconvenience.
@terminalmage No, I didn't misunderstand. If I implement this workaround, I'll have the real state file (from the package) and the one in the cachedir, in two separate places, and the latter will be used instead of the former (hence my use of the word supersede and not replace). My point is still valid: not only will this pollute the filesystem, but if I omit to remove those additional files before upgrading the package, they will be used in place of the package's ones, which is likely to bring a lot of confusion in the future.
@basepi How is one supposed to write/update/debug sls files when testing a highstate returns half of yellow output whereas only one or two files would actually be changed ? With --state-output=terse, the output messages mislead the user with a Result=differs yellow line for each file.replace and file.managed function call; you have to run the highstate with --state-output=full AND ignore/disable color output to tell if a change would actually be made or not. Admit that running an 80+ functions highstate with full output, and having to scrollback to carefully read each result instead or relying on colors, even for a limited (8 in my case) number of minions, is not a comfortable way to test sls files, to say the least. I'd rather wait for 2014.7.2 to be released and packaged, and hope that both issues will be fixed in the upgrade. If not, this will likely be the last straw for me.
Honestly, I'm just baffled at the pushback on this. I didn't think it was that inconvenient to remember to remove a file and resync. In fact, I've spoken with many users who are thrilled with the fact they're able to push modifications to states/modules/etc. like this, since it allows them to immediately apply workarounds and even copy in new functions slated for a future release so they can access them early.
I guess it's all a matter of perspective.
@MoonSweep the reason you can override core modules with the /srv/salt/_states
mechanism is for exactly this kind of thing (among other use-cases as well).
Large software projects have bugs. Salt makes this bug extremely easy to work around until a fix can be released.
If you're interested in contributing a regression test for this module we would be delighted. I'd even be happy to walk you through the process if you're interested and not already familiar with it. That is the best way to improve the quality of Salt moving forward.
@MoonSweep Another option to get your files patched, without the need to remember to remove anything just ship the patched files out via salt-state and check for a specific salt-version:
{% if salt['grains.get']('saltversion') == '2014.7.1' %}
patchsalt_file:
file:
- managed
- source: salt://minions/file.py
- name: /usr/share/pyshared/salt/modules/file.py
{% endif %}
@terminalmage Yes, I guess it's all a matter of perspective.
@whiteinge This is where we disagree. Modules are not meant to be overridden with a mechanism that's internal to Salt, because this impedes on the package manager's job. If a bug is discovered, it should be fixed and an updated version should be released, it's not up to the sysadmin to override modules to work around bugs. If a bug is laying around for too long, it means that there's a problem with the release frequency.
@gpkvt Overwriting files belonging to a package ? This is even worse than every workaround mentioned before. First of all, /usr and everything below it (except /usr/local) is package manager-land, period. It is NOT meant to be fiddled with in any way; only files in /etc and /var are meant to be manually modified by the sysadmin. Secondly, dpkg maintains a list of MD5 sums of all files managed by it. This feature is primarily used to tell if the sysadmin modified a configuration file, so as not to overwrite those changes when the package is upgraded; or to tell if a file has been altered, for example by malware. Messing with this list's integrity by modifying/overwriting something located in /usr is just asking for trouble; this could be done properly by diverting the file beforehand with dpkg-divert, so as to let the package manager know about this local modification, although it would only add to the burden of having to revert those modifications once the package is upgraded (because even with the version check, this would eventually become cruft, and have to be removed from the sls files at some point for clarity, in addition to un-diverting the file). But again, this impedes on the package manager's job.
Guys, I appreciate that you're trying your best to propose various ways of working around this bug, but like @terminalmage said, this is all a matter of perspective. Maybe I'm being overly strict here, but for a production environment, workarounds of this type are just unacceptable; only a new package with a module containing the fix would be an adequate solution.
@MoonSweep it's not a disagreement, it's a design decision. Overriding core modules has been in Salt since the beginning. You, as the sysadmin, have the choice of either working around a bug or waiting for the fix. A workaround is obviously not a replacement for packages and is, by definition, temporary.
You have chosen to wait for the fix. It's your choice and it's a fine choice. As mentioned above this fix is slated for the next point-release.
If you wish to speed a given fix, participation providing information, assistance, and attention on this issue tracker is invaluable. Salt is a large project, composed of 1100 execution functions, several hundred state functions, and many thousands of users spread across many platforms.
If you are interested in getting involved, you can read our Community Contribution Guide. Myself or anyone else on the Salt team would be more than happy to help if you have questions.
After upgrading to 2014.7.0 some of my states are reported as a change, but they didn't change anything:
This leads to false positives in a script monitoring pending changes via Icinga/Nagios.
Previous version works fine.