easybuilders / easybuild-framework

EasyBuild is a software installation framework in Python that allows you to install software in a structured and robust way.
https://easybuild.io
GNU General Public License v2.0
148 stars 201 forks source link

Option download_filename not working when sources come from URL #2303

Open paulmelis opened 7 years ago

paulmelis commented 7 years ago

According to http://easybuild.readthedocs.io/en/latest/Writing_easyconfig_files.html#alternative-formats-for-sources it should be possible to use the key download_filename in a sources dict to specify the file name to which a downloaded file is saved.

But looking at EasyBlock.obtain_file() (https://github.com/easybuilders/easybuild-framework/blob/c1dabed9d6882f34f351f754bcd529b5a0b5f5a7/easybuild/framework/easyblock.py#L554) the parameter download_filename isn't used in case the source comes from a URL. This matches the behaviour I'm seeing with EB 3.4.0: the fetched file isn't renamed and unpacking fails because it doesn't have an extension:

[paulm@gcn64 easybuild]$ less /tmp/paulm/eb-pve0HA/easybuild-OpenColorIO-1.0.9-20170913.150000.pBqqd.log
...
== 2017-09-13 15:00:00,471 environment.py:97 INFO Environment variable LD_PRELOAD set to  (previous value: '')
== 2017-09-13 15:00:00,471 environment.py:97 INFO Environment variable LD_PRELOAD_modshare set to  (previous value: '')
== 2017-09-13 15:00:00,472 easyblock.py:2419 INFO Running method make_builddir part of step ready
== 2017-09-13 15:00:00,472 filetools.py:1071 INFO Creating directory /tmp/paulm/OpenColorIO/1.0.9/intel-2016b (parents: True, set_gid: True, sticky: False)
== 2017-09-13 15:00:00,473 filetools.py:953 INFO Adjusting permissions recursively for /tmp/paulm/OpenColorIO
== 2017-09-13 15:00:00,473 easyblock.py:2419 INFO Running method env_reset_changes part of step ready
== 2017-09-13 15:00:00,473 easyblock.py:2419 INFO Running method handle_iterate_opts part of step ready
== 2017-09-13 15:00:00,474 build_log.py:232 INFO unpacking...
== 2017-09-13 15:00:00,474 easyblock.py:2416 INFO Starting source step
== 2017-09-13 15:00:00,474 easyblock.py:2419 INFO Running method checksum_step part of step source
== 2017-09-13 15:00:00,474 easyblock.py:1641 INFO Checksum verification for /home/paulm/.local/easybuild/sources/o/OpenColorIO/v1.0.9 using None passed.
== 2017-09-13 15:00:00,475 easyblock.py:2419 INFO Running method extract_step part of step source
== 2017-09-13 15:00:00,475 easyblock.py:1648 INFO Unpacking source v1.0.9
== 2017-09-13 15:00:00,531 build_log.py:157 ERROR EasyBuild crashed with an error (at ?:124 in __init__): Unknown file type for file v1.0.9 (at easybuild/tools/filetools.py:729 in extract_cmd)
== 2017-09-13 15:00:00,531 easyblock.py:2650 WARNING build failed (first 300 chars): Unknown file type for file v1.0.9
== 2017-09-13 15:00:00,532 easyblock.py:287 INFO Closing log for application name OpenColorIO version 1.0.9

EB config to reproduce:

easyblock = 'CMakeMake'

name = 'OpenColorIO'
version = '1.0.9'

homepage = 'http://opencolorio.org/'
description = """OpenColorIO (OCIO) is a complete color management solution geared towards motion picture production with an emphasis on visual effects and computer animation."""

toolchain = {'name': 'intel', 'version': '2016b'}

# http://github.com/imageworks/OpenColorIO/tarball/v1.0.9

# As of easybuild 3.3.0 the following is supported
source_urls = ['http://github.com/imageworks/OpenColorIO/tarball/']
sources = [{
    'filename': 'v%(version)s', 
    'download_filename': 'v%(version)s.tar.gz'
}]

# EB < 3.3.0: download manually and rename to include .tar.gz extension
#sources = ['opencolorio-v1.0.9.tar.gz']

dependencies = []       # Woohoo! ;-)

builddependencies = [('CMake', '3.6.1')]

separate_build_dir = True

#configopts = '-DSTOP_ON_WARNING=OFF'

sanity_check_paths = {
    'files': ['bin/ociocheck', 'lib/libOpenColorIO.so'],
    'dirs': ['include/OpenColorIO', 'share'],
}

moduleclass = 'lib'
boegel commented 6 years ago

@paulmelis It seems like you got things backwards... download_filename is the name of the file under which it can be downloaded, filename is the name of the file that EasyBuild should store it with. So you got the values reversed?

paulmelis commented 6 years ago

Right, it seems I indeed got the arguments switched. Flipping them indeed succeeds in downloading.

I also think I understand why my mistake happened. In the normal case (when the filename in the URL has an extension and so can be unpacked) you only specify the mandatory filename value, which together with source_urls specifies the URL to use and also the name to save to. In the case where the URL doesn't contain a file extension the optional extra argument download_filename in my mind specified the name under which to save the file locally, as the URL part is already covered by filename and I only need to specify the exception. But in the latter case filename changes meaning in that it has nothing to do with the URL anymore and you need to switch to download_filename for that part. So, that turned out (for me) to be somewhat non-intuitive.

paulmelis commented 6 years ago

I actually still don't understand that this works, as the comment above on obtain_file() still holds: download_filename isn't used when filename is a URL. But I'm a bit tired and might not be clear enough at the moment to figure this out.

boegel commented 6 years ago

@paulmelis Your point is still valid about download_filename not being used when the source filename being provided is a URL by itself.

This relates to a use case like this, where source_urls is not used at all:

sources = ['http://www.bzip.org/1.0.6/bzip2-1.0.6.tar.gz']

We don't use this in the central easyconfig repository though...

So, how would we take download_filename into account there? I don't think it would make much sense, since you are already explicitly specifying the filename you are downloading with?

W.r.t. your comment about the semantics of filename changing if download_filename is specified: I see it differently... The filename value indicates the filename that EasyBuild should use to i) store the file, ii) look for the file later (before trying to download it). By default, that's also the filename to use when downloading the file, unless download_filename is specified...