Open CiderMan opened 1 year ago
To add what I said in the referenced email thread: can reproduce this, and believe, though don't have definitive proof, that the presence of two variantdirs is also a necessary part of the triggering - however, I've not added the variantdir label onto this issue (yet).
OK, I've monkey patched os.unlink
and can see where the file is getting deleted:
Retrieved `build2\main.obj' from cache
Unlinking build2\main.obj
File "C:\Python38\lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Python38\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\tmp\repro\.venv\Scripts\scons.exe\__main__.py", line 7, in <module>
sys.exit(main())
File "c:\tmp\repro\.venv\lib\site-packages\SCons\Script\Main.py", line 1412, in main
_exec_main(parser, values)
File "c:\tmp\repro\.venv\lib\site-packages\SCons\Script\Main.py", line 1374, in _exec_main
_main(parser)
File "c:\tmp\repro\.venv\lib\site-packages\SCons\Script\Main.py", line 1154, in _main
nodes = _build_targets(fs, options, targets, target_top)
File "c:\tmp\repro\.venv\lib\site-packages\SCons\Script\Main.py", line 1349, in _build_targets
jobs.run(postfunc = jobs_postfunc)
File "c:\tmp\repro\.venv\lib\site-packages\SCons\Taskmaster\Job.py", line 123, in run
self.job.start()
File "c:\tmp\repro\.venv\lib\site-packages\SCons\Taskmaster\Job.py", line 221, in start
task.execute()
File "c:\tmp\repro\.venv\lib\site-packages\SCons\Script\Main.py", line 195, in execute
SCons.Taskmaster.OutOfDateTask.execute(self)
File "c:\tmp\repro\.venv\lib\site-packages\SCons\Taskmaster\__init__.py", line 242, in execute
t.fs.unlink(t.get_internal_path())
File "c:\tmp\repro\.venv\lib\site-packages\SCons\Node\FS.py", line 1184, in unlink
return os.unlink(path)
File "C:\tmp\repro\SConstruct", line 8, in _unlink
traceback.print_stack()
In other words, it is the t.fs.unlink
in the following piece of code:
try:
cached_targets = []
for t in self.targets:
if not t.retrieve_from_cache():
break
cached_targets.append(t)
if len(cached_targets) < len(self.targets):
# Remove targets before building. It's possible that we
# partially retrieved targets from the cache, leaving
# them in read-only mode. That might cause the command
# to fail.
#
for t in cached_targets:
try:
t.fs.unlink(t.get_internal_path())
except (IOError, OSError) as e:
SCons.Warnings.warn(SCons.Warnings.CacheCleanupErrorWarning,
"Failed copying all target files from cache, Error while attempting to remove file %s retrieved from cache: %s" % (t.get_internal_path(), e))
self.targets[0].build()
else:
for t in cached_targets:
t.cached = 1
except SystemExit:
Dumping out the values of cached_targets
and self.targets
at the time of the erroneous deletion:
cached_targets = ['build2\\main.obj']
self.targets = ['build2\\main.obj', 'build2\\foo.obj']
I'm guessing that this intends to rebuild main.obj
but, because of batching and because main.obj
is not part of the batch due to it being up-to-date, it doesn't get rebuilt?
I hope that this is of some help...
Hi @bdbaddog,
Doing a clean build of one of our build products as an example:
We could certainly disable MSVC_BATCH when building with a cache but we would need to analyse the results to be sure that the cache provides more speed up than the removal of batching causes slow down.
Would it be possible to implement the following logic for each batch:
While it is sub-optimal, I suspect that it would give a lot of the performance benefit of each mechanism while simplifying the work that needs to be done...
P.S. For reference, I used the non-MSVC_BATCH build to --cache-populate
and then performed a clean build with the cache enabled (so everything was in the cache) and it took 3m18.218s. Note that all the builds are parallel builds on a 8 (logical) core i7-6700.
Describe the bug We have recently switched on CacheDir in our CI machine and, while this has largely been a positive experience, we have seen the odd link failure that we could not explain. I was eventually able to reproduce the issue and produce a minimal SConstruct that illustrates the problem. It is an interaction between the MSVS batching and the cache.
Required information
To reproduce the problem, I have created two C files and a SConstruct:
src/main.c
src/foo.c
SConstruct
The steps to reproduce the issue are:
scons
) to populate the cachebuild1/main.obj
Running
scons
, I would expectbuild1/main.obj
to be retrieved from the cache,foo.c
to be builts to producebuild1/foo.obj
andbuild2/foo.obj
before the two Programs are linked. What I actually get is a build failure:At this point
build2/main.obj
has been removed (hence the link failure) but running scons again succeeds: