Closed benjaminfuchs closed 5 years ago
The atomicwrites
package is a great find, I didn't even know that exists!
The patch seems plausible to me, merging this. Thanks!
Hi, just checked this patch with this simple test:
import concurrent.futures
import tempfile
import os
#os.environ['TRACKER_ENABLED']="1"
from clcache.__main__ import (
Statistics,
)
STATS_FILE = 'C:\\Users\\im\\clcache\\stats.txt' #problem
#STATS_FILE = tempfile.NamedTemporaryFile(dir='C:\\Users\\im\\clcache', suffix='.txt').name #problem
#STATS_FILE = tempfile.NamedTemporaryFile(suffix='.txt').name #no problem
#STATS_FILE = tempfile.NamedTemporaryFile(dir='C:\\Users\\im\\clcache').name #problem without txt extension too
#STATS_FILE = 'C:\\clcache\\stats.txt' #no problem
def change_stats():
stats = Statistics(STATS_FILE)
with stats.lock, stats as s:
s.registerCallWithInvalidArgument()
if __name__ == '__main__':
print(STATS_FILE)
while True:
with concurrent.futures.ThreadPoolExecutor(4) as executor:
futures = [executor.submit(change_stats) for _ in range(1000)]
for future in concurrent.futures.as_completed(futures):
future.result()
Without this fix:
C:\Users\im\clcache\stats.txt
Traceback (most recent call last):
File "..\awtest.py", line 29, in <module>
future.result()
File "C:\Python34\lib\concurrent\futures\_base.py", line 395, in result
return self.__get_result()
File "C:\Python34\lib\concurrent\futures\_base.py", line 354, in __get_result
raise self._exception
File "C:\Python34\lib\concurrent\futures\thread.py", line 54, in run
result = self.fn(*self.args, **self.kwargs)
File "..\awtest.py", line 20, in change_stats
s.registerCallWithInvalidArgument()
File "C:\Python34\lib\site-packages\clcache\__main__.py", line 795, in __exit__
self._stats.save()
File "C:\Python34\lib\site-packages\clcache\__main__.py", line 703, in save
json.dump(self._dict, f, sort_keys=True, indent=4)
File "C:\Python34\lib\contextlib.py", line 66, in __exit__
next(self.gen)
File "C:\Python34\lib\site-packages\clcache\__main__.py", line 162, in atomicWrite
os.replace(tempFileName, fileName)
PermissionError: [WinError 5] Отказано в доступе: 'C:\\Users\\im\\clcache\\stats.txt.new' -> 'C:\\Users\\im\\clcache\\stats.txt'
With this fix:
C:\repository\git\clcache>\Python34\python.exe ..\awtest.py
C:\Users\im\clcache\stats.txt
Traceback (most recent call last):
File "..\awtest.py", line 29, in <module>
future.result()
File "C:\Python34\lib\concurrent\futures\_base.py", line 395, in result
return self.__get_result()
File "C:\Python34\lib\concurrent\futures\_base.py", line 354, in __get_result
raise self._exception
File "C:\Python34\lib\concurrent\futures\thread.py", line 54, in run
result = self.fn(*self.args, **self.kwargs)
File "..\awtest.py", line 20, in change_stats
s.registerCallWithInvalidArgument()
File "C:\Python34\lib\site-packages\clcache\__main__.py", line 749, in __exit__
self._stats.save()
File "C:\Python34\lib\site-packages\clcache\__main__.py", line 659, in save
json.dump(self._dict, f, sort_keys=True, indent=4)
File "C:\Python34\lib\contextlib.py", line 66, in __exit__
next(self.gen)
File "C:\Python34\lib\site-packages\atomicwrites\__init__.py", line 152, in _open
self.commit(f)
File "C:\Python34\lib\site-packages\atomicwrites\__init__.py", line 177, in commit
replace_atomic(f.name, self._path)
File "C:\Python34\lib\site-packages\atomicwrites\__init__.py", line 89, in replace_atomic
return _replace_atomic(src, dst)
File "C:\Python34\lib\site-packages\atomicwrites\__init__.py", line 71, in _replace_atomic
_windows_default_flags | _MOVEFILE_REPLACE_EXISTING
File "C:\Python34\lib\site-packages\atomicwrites\__init__.py", line 66, in _handle_errors
raise WinError()
PermissionError: [WinError 5] Отказано в доступе.
@izmmisha Thanks for the test. I did some test runs with it. Both implementations with 200.000 calls to change_stats()
:
old implementation: 10 fails
new implementation: 2 fails
Did you also see such kind of frequency of fails?
I think the race condition that tempFileName
was always stats.txt.new
for every thread is solved.
But there is still a race condition if 2 threads try to write the new content to stats.txt
.
I think this can only be solved by some kind of mutex guarding (what is aleardy done in some other parts of clcache).
It need more investigation, the problem exists only in user home directory (except temp directory), the simplest solution right now is move clcache folder out of user home directory.
@izmmisha Looking at this post https://github.com/nodejs/node-gyp/issues/1124 maybe the default CLCACHE_DIR
location should be in %LOCALAPPDATA%
on windows? I ran the test with this folder and didn't get any errors. No idea what magic happens in %USERPROFILE%
.
When can this be released?
Issue
We are sometimes seeing following exception when we run our build with many cores:
Searching a bit makes it look like there is a race condition:
Looking with that in mind at the
atomicWrite
function:Solution
Replacing the
atomicWrite
function withatomic_write
from the atomicwrites package seems to solve the issue (https://pypi.org/project/atomicwrites/). The package takes care of such race conditions so we no longer have to think about it.