Closed veelken closed 7 years ago
Expected the return value to be "somevalue 0", but it has a different format.
My 0.02:
The conclusions you draw from error logs in [*] are not exactly true IMO. Even though the file [1] gives an impression that the hadd command is carried out directly on /hdfs, it's not the case. If you follow the code you can see that it calls sbatchManager::hadd_in_cluster -> ClusterHistogramAggregator::create_output_histogram -> aggregate -> hadd_on_cluster_node -> sbatchManager::submit_job_version2, so it always performs hadd on /scratch and then copies the output file over to /hdfs.
My guess is that there was an error on the SLURM side, which didn't return anything to stdout. I followed the folder structure of your analysis on /hdfs and saw that the shell files which do the heavy lifting of hadd-ing on /scratch are created on the fly and immediately passed to SLURM. In practice the file seems to be empty shortly after writing it to the storage, even though the OS has told us that this is not the case (fflush and fsync are now forced every time we write a shell script). Thus, SLURM complains that it sees an empty file and doesn't submit the job, hence nothing in stdout. The reason why the text files look empty immediately after writing them could be anything, really, but the fix is placing these files on /home instead of /hdfs. This has worked for me so far in addMEM production. It should be fixed by now (see commit https://github.com/HEP-KBFI/tth-htt/commit/5e8e8d2d13825f4b40008f78eca53051b34c9887) but I haven't tested it.
Now that we have a bit more logging about this stuff in the main stdout file, you can see what sbatch actually returns once the problem pops up again.
Regarding the second issue, I think parsing the log files makes it really complex (what to parse? what if you miss something and carry on as if everything was normal? what about the performance penalty of parsing?). In my opinion, a better way is to log the return code inside the shell script that is passed to SLURM. I already see something similar done in ClusterHistogramAggregator; we just have to sprinkle more $? checks inside the shell scripts (hadd might also return a non-zero exit code!). This needs some testing, since I'm not sure what the return codes will be if the program throws. Also, ROOT masks some internal errors and just prints them on screen, but we want to catch them as well; I think we have to create some kind of callback which checks the global error code of ROOT (if there's such a thing) and returns a non-zero status code accordingly. Finding out the scenarios which might fail or which we have to/can cover takes a bit of time, but the solution itself is really simple.
[1] /home/veelken/ttHAnalysis/2016/2017Mar13/scripts/jetToTauFakeRate/sbatch_hadd_jetToTauFakeRate_stage1_DoubleMuon_Run2016E_v1_OS.py
Alright, so I did some digging and learned that in C++, any uncaught exception calls std::terminate(), which in turn calls std::abort() (ref1, ref2). The latter issues a SIGABRT signal with code 6; the exit code is therefore 128 + 6 = 134. You can convince yourself by running the following snippet in your terminal:
echo 'int main() { throw "uncaught"; return 0; }' > uncaught.cpp
g++ -O0 uncaught.cpp -o uncaught
./uncaught &>/dev/null
echo $?  # displays 134
rm uncaught*
As for the ROOT error messages, which we don't want to ignore (I think), there's a simple fix:
#include <TError.h> // gErrorAbortLevel, kError
#include <cstdlib>  // EXIT_SUCCESS

int
main()
{
  gErrorAbortLevel = kError;
  // ...
  return EXIT_SUCCESS;
}
This forces ROOT to abort if an error occurs. The default error level at which the binary crumbles is kSysError, which is two levels higher than kError (c.f. the source). If we do this in every executable that uses ROOT, the program aborts e.g. when an input file is missing (which is desired behavior IMO).
The third piece of the puzzle -- hadd -- sets the exit code to 0 if it succeeded and to 1 if it didn't.
To recap:
- set gErrorAbortLevel in every executable under bin/ which deals w/ ROOT files;
- modify ClusterHistogramAggregator.py such that it checks the return code of hadd;
- modify sbatch-node.submit_job_version2_template.sh such that it checks the return code of the main binary.
By ,,check'' I mean that if an error has occurred, the /scratch subdirectory is properly cleaned up and a non-zero return code is issued at the end of the job. The latter information can be useful when checking the job via sacct, which is another way of checking job completion.
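The ,,check'' logic described above can be sketched as a tiny wrapper. This is only an illustrative sketch, not the actual sbatch-node template: the function name run_with_cleanup is made up, and mktemp stands in for the per-job /scratch subdirectory.

```shell
# Sketch: run a command, clean up the scratch directory regardless of the
# outcome, and propagate the command's exit code so SLURM (and sacct) see it.
run_with_cleanup() {
  local scratch
  scratch=$(mktemp -d)   # stand-in for the per-job /scratch subdirectory
  "$@"                   # e.g. hadd or the analysis binary
  local code=$?          # capture $? immediately, before any other command
  rm -rf "$scratch"      # cleanup happens on failure too
  return $code           # non-zero codes propagate to the job's exit status
}
```

A wrapper like this exits non-zero whenever hadd or the binary does, so the job status visible via sacct reflects the real outcome.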
I can take up this task myself.
And one more thing I completely forgot: the issue of log files under /home. As you probably know, the log files become huge and eat up the /home directory really fast (for instance, a recent addMEM batch generated 26GB worth of log and cfg files). The solution is to archive the project & delete the archived files manually, or by running this script: scripts/archive.py. The usage should be intuitive; example:
export PATH_TO_ARCHIVE=~karl/addMEM/2016/2017Mar13
./scripts/archive.py -p $PATH_TO_ARCHIVE -v
This will gzip+tar everything under the $PATH_TO_ARCHIVE directory and place the archive there. It works only if $PATH_TO_ARCHIVE resides on /home. You can also run this script by copying it to the directory which you want to archive ($PATH_TO_ARCHIVE in our case) and simply running ./archive.py -v.
Extracting is easy: just append the -u flag to the above examples.
I might look into grep-ping archived files, b/c unarchiving is inconvenient and eats up a quarter of /home whenever you want to revisit the log files for whatever reason. Default utilities like zgrep aren't powerful enough, thus it must be scripted.
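Until then, plain tar can already stream an archive through grep without unpacking anything to /home. A minimal sketch (the helper name archive_grep is made up):

```shell
# Sketch: stream the contents of a .tar.gz archive to stdout and grep it,
# without extracting files to disk. tar's -O/--to-stdout flag writes member
# contents to stdout instead of creating files.
archive_grep() {
  local archive=$1 pattern=$2
  tar -xzf "$archive" -O | grep -- "$pattern"
}
```

This doesn't report which member matched; for that one would loop over `tar -tzf` output and extract members one by one, which is roughly what a proper script would do.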
Hi Karl,
thank you very much for looking into this.
I am not sure I understand the comments that you make about return codes. In the case at hand, the analysis jobs finished with an exception. If I understand correctly, we need to pass the return code corresponding to the exception to the sbatchManager and then to the Makefile, right? I agree that this would be a much better approach than parsing the log-files.
I ran 'git pull' yesterday evening and reran the hadd step on /hdfs, and I now see a more detailed error message. So the problem indeed seems to be that the script for the SLURM job is not written to /hdfs "fast enough". I fully agree with your proposal to write the hadd scripts to /home. Should this already have been the case with yesterday's version of the software, or is this still something that needs to be implemented and/or validated? I attach the detailed error message below [*].
Cheers,
Christian
[*]
Traceback (most recent call last):
  File "/home/veelken/ttHAnalysis/2016/2017Mar13/scripts/jetToTauFakeRate/sbatch_hadd_jetToTauFakeRate_stage1_SingleElectron_Run2016C_v1_OS.py", line 9, in
  [...]
    output_histogram=output_histogram
  File "/home/veelken/VHbbNtuples_8_0_x/CMSSW_8_0_19/python/tthAnalysis/HiggsToTauTau/ClusterHistogramAggregator.py", line 209, in hadd_on_cluster_node
    output_dir=output_dir
  File "/home/veelken/VHbbNtuples_8_0_x/CMSSW_8_0_19/python/tthAnalysis/HiggsToTauTau/sbatchManager.py", line 200, in submit_job_version2
    raise IndexError("Caught an error: '%s'" % sbatch_command_result[1])
IndexError: Caught an error: 'sbatch: error: Batch script is empty!
'
Hi,
another analysis job, for which I store the .root files with the histograms on /home, failed yesterday.
This is what I called “racing condition” when we talked yesterday.
It seems to me that the error occurs if there are multiple hadd jobs that start at very much the same time and try to create a directory of the same name, because the directory did not exist before the jobs were started. I think what we need to do is to
Does that make sense to you, Margus ?
Cheers,
Christian
(two concurrent jobs produced the same traceback, interleaved line by line; deinterleaved it reads:)
  File "/home/veelken/VHbbNtuples_8_0_x/CMSSW_8_0_19/python/tthAnalysis/HiggsToTauTau/ClusterHistogramAggregator.py", line 27, in create_output_histogram
    maximum_histograms_in_batch=self.maximum_histograms_in_batch
  File "/home/veelken/VHbbNtuples_8_0_x/CMSSW_8_0_19/python/tthAnalysis/HiggsToTauTau/ClusterHistogramAggregator.py", line 63, in aggregate
    output_histogram=output_histogram
  File "/home/veelken/VHbbNtuples_8_0_x/CMSSW_8_0_19/python/tthAnalysis/HiggsToTauTau/ClusterHistogramAggregator.py", line 209, in hadd_on_cluster_node
    output_dir=output_dir
  File "/home/veelken/VHbbNtuples_8_0_x/CMSSW_8_0_19/python/tthAnalysis/HiggsToTauTau/sbatchManager.py", line 163, in submit_job_version2
    scratch_dir = self.get_scratch_dir()
  File "/home/veelken/VHbbNtuples_8_0_x/CMSSW_8_0_19/python/tthAnalysis/HiggsToTauTau/sbatchManager.py", line 212, in get_scratch_dir
    create_if_not_exists(scratch_dir)
  File "/home/veelken/VHbbNtuples_8_0_x/CMSSW_8_0_19/python/tthAnalysis/HiggsToTauTau/jobTools.py", line 41, in create_if_not_exists
    if not os.path.exists(dir_fullpath): os.makedirs(dir_fullpath)
  File "/cvmfs/cms.cern.ch/slc6_amd64_gcc493/cms/cmssw/CMSSW_8_0_19/external/slc6_amd64_gcc493/bin/../../../../../../external/python/2.7.6-kpegke/lib/python2.7/os.py", line 157, in makedirs
    mkdir(name, mode)
OSError: [Errno 17] File exists: '/scratch/veelken/tthAnalysis_2017-03-17'
make: *** [/home/veelken/ttHAnalysis/2016/2017Mar16/histograms/2lss_1tau/histograms_harvested_stage1_5_2lss_1tau_FakeablewFakeRateWeights
Should this already have been the case with yesterday’s version of the software or is this still something that needs to be implemented and/or validated ?
From the error message it seems that you need to update your local repository once again and re-run the cfg creation. The current version doesn't include newline characters at the end of stdout/stderr messages returned by run_cmd.
About the return codes: yes, I think the way to go is:
- the program throws upon some kind of error (e.g. an input ROOT file doesn't exist); the return code will be a non-zero number, which can be caught in the shell script and returned again;
- this can be further propagated to sbatchManager via sacct, which lets you check the status code of the job;
- sbatchManager also exits with a non-zero return code;
- the Makefile should fail, b/c it expects a zero return code from each command it executes, and fails if it's not so.
TBF, I'm not entirely sure about the last point, though. Needs some testing.
As for the racing condition, I'd vote for this solution (http://stackoverflow.com/a/600612), which is basically the same as the one you proposed.
Hi Karl,
I confirm that the Makefile will stop once sbatchManager returns a non-zero error code.
Margus, could you have a look at implementing this ?
Cheers,
Christian
The race condition during the directory creation is now fixed: https://github.com/HEP-KBFI/tth-htt/commit/b98a58996269d9541eeeb56d18f302fcc42cd9bb (and tested locally, see [*]).
Coming back to the propagation of the return code: if sbatchManager catches a non-zero return code and therefore exits abruptly, should we scancel all running jobs automatically, or let them run (and kill them manually if needed)?
(P.S. It would be nice if we create a separate issue here for every problem we encounter. Currently we have three different problems in this thread and it's a bit cumbersome to discuss all of them at once.)
[*]
import uuid, os, errno, multiprocessing

def mkdir_p(path):
    try:
        os.makedirs(path)
    except OSError as exc:  # Python >2.5
        if exc.errno == errno.EEXIST and os.path.isdir(path):
            pass
        else:
            raise

dirnames = [str(uuid.uuid4()) for i in range(1000)]

def work(dirnames):
    for d in dirnames:
        mkdir_p(d)

if __name__ == '__main__':
    nof_jobs = 50
    p = multiprocessing.Pool(nof_jobs)
    p.map(work, [dirnames for i in range(nof_jobs)])
    # delete the directories with
    # find . -type d | xargs -r rmdir
Hi Karl,
thank you. From my point of view there is no need to scancel all running jobs. The probability is high that many jobs will fail with the same error condition as the first job, but if they fail, they typically fail before or on the first event, so we do not waste much processing time by not scancel'ing them. Besides the implementation effort, scancel'ing all jobs may introduce a new type of problem: a temporary failure of a single job may scancel many hundreds of jobs that are not affected by the temporary failure and would have run fine (remember the problem we had in the past with some machines temporarily not being able to load CMSSW).
I tinkered with the problem, but it still needs some testing ,,in the wild''. The solution is somewhat similar to what I've described in the above posts, but I had to reconsider some points:
- Instead of grep-ping squeue by the job IDs, I found a bit more efficient solution by grep-ping the comment field that is assigned to every job. The output of squeue is trimmed down as much as possible so that it can be post-processed more easily.
- The unique comment is set in the tthAnalyzeRun*/tthAddMEM* files under the test directory. All jobs which are generated on the fly share the same unique comment as the ,,parent'' sbatchManager instance.
- Although sacct works pretty well, it might be the case that under heavy load it (or rather its backend DB that stores information about every job) simply becomes unavailable.
- The fallback is scontrol, but there's no easy way to reformat the huge output it gives in the same shell command, and thus it must be carried out by sbatchManager itself (i.e. it's a bit slow). scontrol is unable to retrieve any information about recently finished jobs, because the information is not stored in its cache anymore (I have no idea for how long or on what grounds the information is stored in the cache). There's no fallback solution to this problem, but I assume that by default the jobs finish successfully, even though we cannot verify it by the means SLURM provides (remember that there already are some safeguards in the shell wrapper and the Makefile itself).
- grep-ping by comment here becomes unviable (especially in the sacct case), because the number of finished jobs may grow to huge numbers, which makes it slow to read from subprocess.PIPE. So I had to implement the same squeue polling trick as in the original sbatchManager (grep the job IDs, but split them into reasonable-sized chunks to avoid the ARG_MAX kernel limit; the limit of 500 job IDs was too small, though).
I have written a couple of test cases for it (in ./test/sbatch), exercising the new sbatchManager. The tests pass successfully. The only aspect I haven't tested with this approach yet is hadd-ing in sbatchManager. But I guess if it's going to work in Ntuple/addMEM production, we can skip that.
The next step is to actually run the solution in addMEM production. If the jobs finish without any problems regarding the new sbatchManager, then I think we can replace the original sbatchManager class with the new one.
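The chunked polling mentioned above (splitting job IDs so a single command line stays below ARG_MAX) can be sketched with a generic helper. This is a hedged sketch assuming bash; chunked is a made-up name, and squeue itself is not invoked here:

```shell
# Sketch: invoke a command on at most N arguments at a time, e.g.
#   chunked poll_squeue 500 "${job_ids[@]}"
# so that no single invocation exceeds the ARG_MAX kernel limit.
chunked() {
  local cmd=$1 n=$2
  shift 2
  while [ $# -gt 0 ]; do
    "$cmd" "${@:1:n}"             # pass the next chunk of at most n args
    shift $(( $# < n ? $# : n ))  # drop exactly the args we just passed
  done
}
```

For external commands, `xargs -n 500` achieves much the same; a shell helper like this is handy when the command is a shell function rather than a binary.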
The next step is to actually run the solution in addMEM production. If the jobs finish without any problems regarding new sbatchManager
then I think we can replace the original sbatchManager
class with the new one.
edit: a quick test shows that scontrol holds information about a job for up to 15 minutes after it has finished. We poll every 30 seconds for recently finished jobs, so we never need a third fallback solution (unless there's a limit on the number of jobs in the cache). Quick googling about the config settings in SLURM yielded nothing.
edit2: OK, so I tested again and it looks like the jobs are indeed kept 15 mins (plus some change) in the cache. The SLURM variable that controls it is perhaps MinJobAge (currently set to 900, see /etc/slurm/slurm.conf). The user guide (pdf) seems to confirm my suspicions about sacct and scontrol:
SLURM provides a built-in caching capability as a redundancy mechanism in the event that communication with the database is lost. When communication with either the slurmdbd or the mysqld is lost, the SLURM controller caches accounting information. One of the key components in this approach is the usage of the directory designated by the SLURM.conf control argument 'StateSaveLocation'. If the SlurmCTLD is unable to communicate with the SlurmDBD, it utilizes an internal cache until SlurmDBD is returned to service or until the cache is filled. Observations show that there is space in this cache for approximately 5000 simple jobs.
TBF, the figure 5000 is kind of arbitrary in my opinion. But hopefully it's a non-issue, because the number of jobs that could successfully finish in less than a minute is probably less than O(5k) anyway. Or, if it becomes a problem, we have to partition our jobs such that we won't submit that many in one go.
Switched over to the new version of sbatchManager. Tests passed and tried it out on addMEM & analyze_2lss_1tau jobs.
Hi Margus,
the hadd step does not seem to work if the .root files produced by running the analysis code are stored on /hdfs. The error message is attached below [*] (from /hdfs/local/ttH_2tau/veelken/ttHAnalysis/2016/2017Mar13/stderr_jetToTauFakeRate.log).
A separate problem (not a bug, but an important feature that is missing) is that problems that occur when running the analysis code [**] are not logged in the main stderr file /hdfs/local/ttH_2tau/veelken/ttHAnalysis/2016/2017Mar13/stderr_jetToTauFakeRate.log and do not cause an immediate abort of the workflow. The way I discovered the problem [**] was that some histograms were missing in the final (large) hadd file. I then needed to look at the intermediate (smaller) hadd files, and finally at the .root files that are produced when running the analysis code and the analysis log-file [**], to find the problem. The error [**] is fatal in the sense that there is no chance that the workflow can finish successfully once a single analysis job aborts with an exception. It would be very useful if you could implement a logic such that once an analysis job finishes on the batch system, the log-file produced by the job is checked and the workflow terminated immediately in case an exception occurred (or the .root file produced by the analysis job is corrupt).
Cheers,
Christian
[*]
Traceback (most recent call last):
  File "/home/veelken/ttHAnalysis/2016/2017Mar13/scripts/jetToTauFakeRate/sbatch_hadd_jetToTauFakeRate_stage1_DoubleMuon_Run2016E_v1_OS.py", line 8, in
waitForJobs=True,
File "/home/veelken/VHbbNtuples_8_0_x/CMSSW_8_0_19/python/tthAnalysis/HiggsToTauTau/sbatchManager.py", line 73, in hadd_in_cluster
cluster_histogram_aggregator.create_output_histogram()
File "/home/veelken/VHbbNtuples_8_0_x/CMSSW_8_0_19/python/tthAnalysis/HiggsToTauTau/ClusterHistogramAggregator.py", line 25, in create_output_histogram
maximum_histograms_in_batch=self.maximum_histograms_in_batch
File "/home/veelken/VHbbNtuples_8_0_x/CMSSW_8_0_19/python/tthAnalysis/HiggsToTauTau/ClusterHistogramAggregator.py", line 61, in aggregate
output_histogram=output_histogram
File "/home/veelken/VHbbNtuples_8_0_x/CMSSW_8_0_19/python/tthAnalysis/HiggsToTauTau/ClusterHistogramAggregator.py", line 207, in hadd_on_cluster_node
output_dir=output_dir
File "/home/veelken/VHbbNtuples_8_0_x/CMSSW_8_0_19/python/tthAnalysis/HiggsToTauTau/sbatchManager.py", line 194, in submit_job_version2
job_id = sbatch_command_result.split()[-1]
IndexError: list index out of range
make: [/hdfs/local/ttH_2tau/veelken/ttHAnalysis/2016/2017Mar13/histograms/jetToTauFakeRate/histograms_harvested_stage1_jetToTauFakeRate_DoubleMuon_Run2016E_v1_OS.root] Error 1
make: Waiting for unfinished jobs....
Traceback (most recent call last):
File "/home/veelken/ttHAnalysis/2016/2017Mar13/scripts/jetToTauFakeRate/sbatch_hadd_jetToTauFakeRate_stage1_MuonEG_Run2016C_v1_OS.py", line 8, in
waitForJobs=True,
File "/home/veelken/VHbbNtuples_8_0_x/CMSSW_8_0_19/python/tthAnalysis/HiggsToTauTau/sbatchManager.py", line 73, in hadd_in_cluster
cluster_histogram_aggregator.create_output_histogram()
File "/home/veelken/VHbbNtuples_8_0_x/CMSSW_8_0_19/python/tthAnalysis/HiggsToTauTau/ClusterHistogramAggregator.py", line 25, in create_output_histogram
maximum_histograms_in_batch=self.maximum_histograms_in_batch
File "/home/veelken/VHbbNtuples_8_0_x/CMSSW_8_0_19/python/tthAnalysis/HiggsToTauTau/ClusterHistogramAggregator.py", line 87, in aggregate
output_histogram=final_output_histogram
File "/home/veelken/VHbbNtuples_8_0_x/CMSSW_8_0_19/python/tthAnalysis/HiggsToTauTau/ClusterHistogramAggregator.py", line 207, in hadd_on_cluster_node
output_dir=output_dir
File "/home/veelken/VHbbNtuples_8_0_x/CMSSW_8_0_19/python/tthAnalysis/HiggsToTauTau/sbatchManager.py", line 194, in submit_job_version2
job_id = sbatch_command_result.split()[-1]
IndexError: list index out of range
make: *** [/hdfs/local/ttH_2tau/veelken/ttHAnalysis/2016/2017Mar13/histograms/jetToTauFakeRate/histograms_harvested_stage1_jetToTauFakeRate_MuonEG_Run2016C_v1_OS.root] Error 1
[**]
/home/veelken/ttHAnalysis/2016/2017Mar13/logs/jetToTauFakeRate/OS/TTZToLL_M10_ext1/analyze_jetToTauFakeRate_TTZToLL_M10_ext1_OS_central_2_executable.log