Open knightsc opened 6 years ago
Hi @knightsc ,
thanks for your detailed problem description. Ansible also counts the random delay, we are using to prevent DDoS behavior of the AD and GIT infrastructure, when it measures the run-time. This might explain your long run-time. We add a random delay with 15min max when running as launch daemon. We do not set the random delay if you trigger ANTS on the CLI.
The reason you only see the output in the log after each task finished is due to the way we write the log. I will discuss this with @aschwanb. There might be a nicer way to do this which will result in an "realtime" log.
Cheers, Jan
I hope you're using tail -f
but joking aside.
From the output of your log file it looks to me as if the run triggered by ants.run-now
is only logging the beginning of the run and ants -w
is logging the rest. This would explain why there is no change in the git checkout.
For context: Loading both LaunchDaemons will result in triggering ants -w
. Touching the file at /Users/Shared/.ch.unibas.its.cs.ants.run-now
will trigger ants
without delay.
There we have a situation where both processes are writing to the same log files. We're using the python logging library and I've to test if there is a potential problem when two processes are using the same file.
Am I correct in assuming that you're loading both LaunchDaemons after the pkg is installed and you're not doing a restart? If so, I'll test that scenario.
I did some basic testing on macOS 10.13.6: Open three terminal windows. One with sudo ants -w
another with sudo ants
and a third with tail -f /var/log/ants/ants.log
. The log shows the output of sudo ants
.
@knightsc Can you run the same test and confirm my results? I think we might need the detailed steps on how to reproduce your issue.
I also ran touch /Users/Shared/.ch.unibas.its.cs.ants.run-now && tail -f /var/log/ants/ants.log
and could see the log in "real time" - with the delay of the python subprocess output, which is still less than the 13 minutes you experienced.
I think there might be a problem during the bootstrapping procedure when ants
is just installed.
I am definitely using tail -f
. I will repeat some of my testing on a real machine rather than a virtual machine to see if that makes any difference
I just did the same test as above in my 10.14 VM and coudn't reproduce the behaviour. Can you check if the issue is with the run-now
LaunchDaemon?
I just tested touch /Users/Shared/.ch.unibas.its.cs.ants.run-now && tail -f /var/log/ants/ants.log
on my physical 10.13.6 machine and see the same behavior. Not as long as a delay as the VM. But the start of the run has the timestamp Jul 26 2018 08:58:57 EDT
and then every single line of the output log has the timestamp Jul 26 2018 08:59:51 EDT
. It just doesn't seem possible for everything to be happening at the exact same time. In your log files are there different timestamps on the out?
I also test my playbook changes locally using a regular playbook command like this:
sudo ansible-playbook -i hosts -c local -l
hostnamesite.yml
And when I do the output scrolls by like I expect on the console.
When you say check if the issue is with the run-now
launch daemon what should I be looking for?
This is the plist for the run-now daemon
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>KeepAlive</key>
<dict>
<key>PathState</key>
<dict>
<key>/Users/Shared/.ch.unibas.its.cs.ants.run-now</key>
<true/>
</dict>
</dict>
<key>Label</key>
<string>ch.unibas.its.cs.ants.run-now</string>
<key>OnDemand</key>
<true/>
<key>ProgramArguments</key>
<array>
<string>bash</string>
<string>-c</string>
<string>/Library/ANTS-Framework/bin/ants; /bin/rm /Users/Shared/.ch.unibas.its.cs.ants.run-now</string>
</array>
<key>StandardErrorPath</key>
<string>/var/log/system.log</string>
<key>StandardOutPath</key>
<string>/var/log/system.log</string>
</dict>
</plist>
Interestingly enough it looks like the output from a run also gets put into /var/log/system.log
(my guess because of the daemon) and even in that file I see all the info show up at once at the end of the running rather than all throughout the run
@knightsc Yes, this is the daemon I was refering to. I could have given the full path, sorry.
When you first opened the issue, I thought it was about a delay of 13 minutes for an ants
run triggered by touching /Users/Shared/.ch.unibas.its.cs.ants.run-now
. This is very different from logs all having the same timestamp but occuring within one minute after the start of the run. Is the original issue still present or was it about the timestampes all along?
Original issue is there. Definitely after a trigger a run, if I tail the ants.log i get no output until the run is completely finished and all entries show at once. I think they all have the same timestamp because they're all getting printed at once at the end rather than being printed as the run progresses and having different timestamp values
@aschwanb If you check the logs of one of your runs is all the ansible output with the same timestamp?
They don't all have the same timestamp. But there are definitely blocks of output. Eg. one block where all lines have timestap x
, then a block with timestamp y
. I think the reason for this is the way python buffers and outputs lines from subprocess.Popen
.
The reason we're using subprocess
instead of importing ansible directly is here. The relevant code starts here:
proc = subprocess.Popen(cmd, bufsize=1, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
and
for line in iter(proc.stdout.readline, ''):
logger.write_log(line, task_line)
You can maybe take a look at the bufsize
options and see if you can improve on the output speed.
After the ANTS Framework is first installed I touch the run now file to kick off an initial run. I then go and
tail /var/log/ants/ants.log
and get the following:And the log just stays like that 13 minutes. My playbook uses the
macos_profile
ANTs role and during this time I can see the profiles installed one by one in system preferences. Finally after the run is finished the ants.log file show the output for the rest of the run.I briefly discussed this in the #ansible channel in MacAdmins slack here: https://macadmins.slack.com/archives/C06G3E39P/p1528319226000036
I didn't get a chance to look into it in more detail then so creating this issue so I can take another look.