michellab / Sire

Sire Molecular Simulations Framework
http://siremol.org
GNU General Public License v3.0

Windows builds are failing #417

Closed: lohedges closed this issue 1 year ago

lohedges commented 1 year ago

The CI gives the following error in the log during the testing stage of the conda build process:

TEST START: D:\a\Sire\Sire\build\win-64\sire-2023.0.1-py37h82bb817_4.tar.bz2
Traceback (most recent call last):
  File "C:\Miniconda3\envs\sire_build\Scripts\conda-build-script.py", line 10, in <module>
    sys.exit(main())
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_build\cli\main_build.py", line 496, in main
    execute(sys.argv[1:])
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_build\cli\main_build.py", line 487, in execute
    verify=args.verify, variants=args.variants, cache_dir=args.cache_dir)
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_build\api.py", line 195, in build
    variants=variants
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_build\build.py", line 3102, in build_tree
    test(pkg, config=metadata.config.copy(), stats=stats)
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_build\build.py", line 2785, in test
    config)
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_build\build.py", line 2587, in construct_metadata_for_test
    m, hash_input = _construct_metadata_for_test_from_package(recipedir_or_package, config)
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_build\build.py", line 2465, in _construct_metadata_for_test_from_package
    recipe_dir, need_cleanup = utils.get_recipe_abspath(package)
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_build\utils.py", line 458, in get_recipe_abspath
    conda_package_handling.api.extract(recipe, recipe_dir)
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_package_handling\api.py", line 77, in extract
    format.extract(fn, dest_dir, components=components)
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_package_handling\tarball.py", line 73, in extract
    streaming._extract(str(fn), str(dest_dir), components=["pkg"])
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_package_handling\streaming.py", line 38, in _extract
    extract_stream(stream, dest_dir)
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_package_streaming\extract.py", line 48, in extract_stream
    tar_file.extractall(path=dest_dir, members=checked_members())
  File "C:\Miniconda3\envs\sire_build\lib\tarfile.py", line 2002, in extractall
    numeric_owner=numeric_owner)
  File "C:\Miniconda3\envs\sire_build\lib\tarfile.py", line 2044, in extract
    numeric_owner=numeric_owner)
  File "C:\Miniconda3\envs\sire_build\lib\tarfile.py", line 2114, in _extract_member
    self.makefile(tarinfo, targetpath)
  File "C:\Miniconda3\envs\sire_build\lib\tarfile.py", line 2163, in makefile
    copyfileobj(source, target, tarinfo.size, ReadError, bufsize)
  File "C:\Miniconda3\envs\sire_build\lib\tarfile.py", line 247, in copyfileobj
    buf = src.read(bufsize)
  File "C:\Miniconda3\envs\sire_build\lib\tarfile.py", line 537, in read
    buf = self._read(size)
  File "C:\Miniconda3\envs\sire_build\lib\tarfile.py", line 545, in _read
    return self.__read(size)
  File "C:\Miniconda3\envs\sire_build\lib\tarfile.py", line 570, in __read
    buf = self.fileobj.read(self.bufsize)
  File "C:\Miniconda3\envs\sire_build\lib\bz2.py", line 178, in read
    return self._buffer.read(size)
  File "C:\Miniconda3\envs\sire_build\lib\_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "C:\Miniconda3\envs\sire_build\lib\_compression.py", line 99, in read
    raise EOFError("Compressed file ended before the "
EOFError: Compressed file ended before the end-of-stream marker was reached

The same thing happened during a previous build attempt, although the error message was slightly different:

TEST START: D:\a\Sire\Sire\build\win-64\sire-2023.0.1-py37h82bb817_4.tar.bz2
Traceback (most recent call last):
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_package_handling\tarball.py", line 155, in extract
    _tar_xf(fn, dest_dir)
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_package_handling\tarball.py", line 103, in _tar_xf
    archive_utils.extract_file(tarball)
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_package_handling\archive_utils.py", line 15, in extract_file
    raise InvalidArchiveError(tarball, error_str.decode('utf-8'))
conda_package_handling.exceptions.InvalidArchiveError: Error with archive D:\a\Sire\Sire\build\win-64\sire-2023.0.1-py37h82bb817_4.tar.bz2.  You probably need to delete and re-download or re-create this file.  Message from libarchive was:

truncated bzip2 input

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Miniconda3\envs\sire_build\Scripts\conda-build-script.py", line 10, in <module>
    sys.exit(main())
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_build\cli\main_build.py", line 496, in main
    execute(sys.argv[1:])
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_build\cli\main_build.py", line 487, in execute
    verify=args.verify, variants=args.variants, cache_dir=args.cache_dir)
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_build\api.py", line 195, in build
    variants=variants
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_build\build.py", line 3102, in build_tree
    test(pkg, config=metadata.config.copy(), stats=stats)
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_build\build.py", line 2785, in test
    config)
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_build\build.py", line 2587, in construct_metadata_for_test
    m, hash_input = _construct_metadata_for_test_from_package(recipedir_or_package, config)
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_build\build.py", line 2465, in _construct_metadata_for_test_from_package
    recipe_dir, need_cleanup = utils.get_recipe_abspath(package)
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_build\utils.py", line 458, in get_recipe_abspath
    conda_package_handling.api.extract(recipe, recipe_dir)
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_package_handling\api.py", line 57, in extract
    format.extract(fn, dest_dir, components=components)
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_package_handling\tarball.py", line 160, in extract
    _tar_xf_no_libarchive(fn, dest_dir)
  File "C:\Miniconda3\envs\sire_build\lib\site-packages\conda_package_handling\tarball.py", line 112, in _tar_xf_no_libarchive
    for member in tar_file.getmembers():
  File "C:\Miniconda3\envs\sire_build\lib\tarfile.py", line 1763, in getmembers
    self._load()        # all members, we first have to
  File "C:\Miniconda3\envs\sire_build\lib\tarfile.py", line 2350, in _load
    tarinfo = self.next()
  File "C:\Miniconda3\envs\sire_build\lib\tarfile.py", line 2281, in next
    self.fileobj.seek(self.offset - 1)
  File "C:\Miniconda3\envs\sire_build\lib\bz2.py", line 274, in seek
    return self._buffer.seek(offset, whence)
  File "C:\Miniconda3\envs\sire_build\lib\_compression.py", line 143, in seek
    data = self.read(min(io.DEFAULT_BUFFER_SIZE, offset))
  File "C:\Miniconda3\envs\sire_build\lib\_compression.py", line 99, in read
    raise EOFError("Compressed file ended before the "
EOFError: Compressed file ended before the end-of-stream marker was reached

chryswoods commented 1 year ago

A quick guess is that the Windows runner ran out of disk space during the build, so either the tar.bz2 file wasn't fully written after the build, or it couldn't be unpacked to run the test. Maybe we should run a make clean or similar after the build to free up disk space?
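
One way to test the disk-space theory would be to log the free space on the runner before and after the build. This is only a minimal sketch, assuming Python is available in the CI step. The function names and the 5 GiB threshold are hypothetical and not taken from the Sire build scripts:

```python
import shutil


def free_space_gib(path="."):
    """Return the free space, in GiB, on the volume that holds `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / 1024**3


def has_enough_space(path=".", required_gib=5.0):
    """Return True if at least `required_gib` GiB are free at `path`.

    The 5 GiB default is an arbitrary illustration, not a measured requirement.
    """
    return free_space_gib(path) >= required_gib


if __name__ == "__main__":
    print(f"Free space: {free_space_gib('.'):.1f} GiB")
```

Printing this value just before the packaging step would show whether the runner is close to its limit when the tar.bz2 file is written.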

lohedges commented 1 year ago

Possibly. Subsequent stages in the CI did run, though, i.e. the creation and uploading of failed build artifacts. I would have thought that this would have failed too if the disk space had been exhausted. Perhaps there is a limit for the actual build process itself?

Would we clean by just removing the Sire build directory after the installation? I guess the issue here is that we might want to examine those files if it fails, i.e. we'd want them to be included in the artifact that gets uploaded.

chryswoods commented 1 year ago

The issue does look weird. While we could spend time looking into this, I don't think fixing this would add much value? Is anyone using the Python 3.7 Windows binaries?

If not, and especially given that we will drop 3.7 support when we move to 3.10, my vote is to drop Windows 3.7 now from the CI/CD.

lohedges commented 1 year ago

Yes, it doesn't look like anyone is using py37 on any platform at present, so it might just be worth deprecating it to avoid annoying end-of-life maintenance issues over the next six months or so. Anyone who really needs it can get in touch, at which point we can try to figure out specific issues if it's not too much of a challenge.

chryswoods commented 1 year ago

Yes - let's deprecate 3.7 across all platforms. This will save the CI cycles on GH Actions and will push us to move to 3.10 once our dependencies are ready.

lohedges commented 1 year ago

Okay, this is now happening for all Windows builds with no changes to the CI. We'll need to address this as I need a working Windows conda package for Cresset to play with.

chryswoods commented 1 year ago

I think this is the key error line:

conda_package_handling.exceptions.InvalidArchiveError: Error with archive D:\a\Sire\Sire\build\win-64\sire-2023.0.1-py37h82bb817_4.tar.bz2. You probably need to delete and re-download or re-create this file. Message from libarchive was:

For some reason, conda build isn't able to create a non-corrupt sire tbz2 package after the compile. This implies that the tar process exited badly but this error wasn't caught? Did the CI give you the broken outputs to download and inspect on your computer?

It may be worth checking whether the sire tbz2 file is corrupt and, if so, how: is it an empty file, so nothing was compressed, or does it end halfway through?
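
A check along these lines could be run on the downloaded artifact. It is a rough sketch using only the Python standard library. The function name `inspect_bz2` is made up for illustration, and it only tests the bzip2 layer, not the tar contents:

```python
import bz2
import os


def inspect_bz2(path, chunk_size=1024 * 1024):
    """Classify a .bz2 file as 'empty', 'truncated', 'invalid', or 'ok'."""
    if os.path.getsize(path) == 0:
        # Nothing was written at all.
        return "empty"
    try:
        with bz2.open(path, "rb") as stream:
            # Decompress the whole stream, discarding the data.
            while stream.read(chunk_size):
                pass
    except EOFError:
        # Same error as in the traceback: the stream stops before
        # the end-of-stream marker.
        return "truncated"
    except OSError:
        # The bytes are not a valid bzip2 stream.
        return "invalid"
    return "ok"
```

A result of 'truncated' would match the EOFError in the log and point towards the file being cut off while it was written.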

My guess is that the tar process used to create it either ran out of memory or disk space and this wasn't caught. A workaround could be us deleting the contents of the build directory after sire has installed? This should free up plenty of space.

lohedges commented 1 year ago

Good plan. I'll update the build scripts to remove the build directory only if the install succeeds. It's just strange that this was originally py37 specific, but is now occurring for all variants. (All variants are running on the same Windows image.) It makes me think that something in the conda-build chain has changed, with py37 being updated/affected earlier.
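
Roughly, the idea is as follows. This is an illustrative sketch rather than the actual change to the build scripts. The function name, the install command and the build directory path are all placeholders:

```python
import shutil
import subprocess


def install_then_clean(install_cmd, build_dir):
    """Run `install_cmd`; delete `build_dir` only if it exits with code 0.

    Returns True if the install succeeded (and the directory was removed).
    """
    result = subprocess.run(install_cmd)
    if result.returncode != 0:
        # Keep the build tree so it can be uploaded as a failure artifact.
        return False
    # Install succeeded: free the disk space before packaging and testing.
    shutil.rmtree(build_dir, ignore_errors=True)
    return True
```

Keeping the directory on failure means the files are still there to be included in the uploaded artifact.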

Will see how we get on.

lohedges commented 1 year ago

Yes, this has fixed the issue.