robertwb / issues-import-test


CyCache open to filesystem races #990

Open robertwb opened 10 years ago

robertwb commented 10 years ago

I occasionally get the following error from the Sage buildbot slaves on the UW cluster:

Found compiled sage/plot/plot3d/shapes.pyx in cache
Traceback (most recent call last):
  File "/scratch/buildbot/sage/redhawk-1/sage_git/build/local/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 962, in cythonize_one_helper
    return cythonize_one(*m[1:])
  File "/scratch/buildbot/sage/redhawk-1/sage_git/build/local/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 913, in cythonize_one
    shutil.copyfileobj(g, f)
  File "/scratch/buildbot/sage/redhawk-1/sage_git/build/local/lib/python/shutil.py", line 49, in copyfileobj
    buf = fsrc.read(length)
  File "/scratch/buildbot/sage/redhawk-1/sage_git/build/local/lib/python/gzip.py", line 261, in read
    self._read(readsize)
  File "/scratch/buildbot/sage/redhawk-1/sage_git/build/local/lib/python/gzip.py", line 296, in _read
    self._read_gzip_header()
  File "/scratch/buildbot/sage/redhawk-1/sage_git/build/local/lib/python/gzip.py", line 190, in _read_gzip_header
    raise IOError, 'Not a gzipped file'
IOError: Not a gzipped file
Cythonizing sage/plot/plot3d/transform.pyx
Traceback (most recent call last):
  File "setup.py", line 565, in <module>
    run_cythonize()
  File "setup.py", line 557, in run_cythonize
    'profile': profile,
  File "/scratch/buildbot/sage/redhawk-1/sage_git/build/local/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 816, in cythonize
    result.get(99999)  # seconds
  File "/scratch/buildbot/sage/redhawk-1/sage_git/build/local/lib/python/multiprocessing/pool.py", line 558, in get
    raise self._value
IOError: Not a gzipped file

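The buildslaves use the same account on a shared NFS file system, and in particular share the ~/.cycache directory. The offending code is clearly unsafe, because it writes the compressed output directly to the final cache path, where another process can open it before the write has finished: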

            g = gzip_open(fingerprint_file, 'wb')
            try:
                shutil.copyfileobj(f, g)
            finally:
                g.close()

IMHO cached files must always be changed atomically. That is, write to a temporary file on the same filesystem (usually in the same directory) and then rename it over the target. NFS probably exacerbates the issue, but it is a race condition even without it.
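A minimal sketch of the write-then-rename approach, assuming Python 3.3+ for `os.replace` (on the Python 2.7 shown in the traceback, `os.rename` plays the same role on POSIX). The helper name `atomic_gzip_store` and its `src_path` parameter are hypothetical and are not part of Cython; this only illustrates the idea and is not the actual fix.

```python
import gzip
import os
import shutil
import tempfile


def atomic_gzip_store(src_path, fingerprint_file):
    """Compress src_path into fingerprint_file without exposing a partial file.

    Hypothetical helper for illustration only.
    """
    cache_dir = os.path.dirname(os.path.abspath(fingerprint_file))
    # Create the temporary file in the cache directory itself so that the
    # final rename stays within a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as raw, open(src_path, "rb") as f:
            with gzip.GzipFile(fileobj=raw, mode="wb") as g:
                shutil.copyfileobj(f, g)
            # Push the data to disk before the entry becomes visible.
            raw.flush()
            os.fsync(raw.fileno())
        # Readers see either the old entry or the complete new one.
        os.replace(tmp_path, fingerprint_file)
    except BaseException:
        # BaseException also covers KeyboardInterrupt, so an interrupted
        # write does not leave a partial file behind.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```

With this shape, a concurrent reader never opens a half-written `fingerprint_file`; at worst, two writers each build a complete copy and the later rename wins.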

Migrated from http://trac.cython.org/ticket/837

robertwb commented 9 years ago

@jdemeyer: Also, interrupting at the wrong time can cause this.