aboutcode-org / scancode-toolkit

scancode doesn't work in centos 6.6 with glibc 2.12 installed #834

Closed spbkelt closed 6 years ago

spbkelt commented 7 years ago
/usr/share/scancode-toolkit/scancode --format html-app --diag --timeout 3600 -n 4 --ignore "*.war" --ignore "*.zip" --ignore "*.jar" . oss-report.html
Scanning files for: licenses, copyrights, packages with 4 process(es)...
Building license detection index...Traceback (most recent call last):
  File "/usr/share/scancode-toolkit/bin/scancode", line 11, in <module>
    load_entry_point('scancode-toolkit', 'console_scripts', 'scancode')()
  File "/usr/share/scancode-toolkit/lib/python2.7/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/share/scancode-toolkit/src/scancode/utils.py", line 74, in main
    standalone_mode=standalone_mode, **extra)
  File "/usr/share/scancode-toolkit/lib/python2.7/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/usr/share/scancode-toolkit/lib/python2.7/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/share/scancode-toolkit/lib/python2.7/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/usr/share/scancode-toolkit/lib/python2.7/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/share/scancode-toolkit/src/scancode/cli.py", line 490, in scancode
    pre_scan_plugins=pre_scan_plugins)
  File "/usr/share/scancode-toolkit/src/scancode/cli.py", line 572, in scan
    get_index(False)
  File "/usr/share/scancode-toolkit/src/licensedcode/cache.py", line 188, in get_index
    _LICENSES_INDEX = get_or_build_index_through_cache()
  File "/usr/share/scancode-toolkit/src/licensedcode/cache.py", line 108, in get_or_build_index_through_cache
    from licensedcode.index import LicenseIndex
  File "/usr/share/scancode-toolkit/src/licensedcode/index.py", line 47, in <module>
    from licensedcode import match
  File "/usr/share/scancode-toolkit/src/licensedcode/match.py", line 36, in <module>
    from licensedcode import query
  File "/usr/share/scancode-toolkit/src/licensedcode/query.py", line 32, in <module>
    import typecode
  File "/usr/share/scancode-toolkit/src/typecode/__init__.py", line 27, in <module>
    from typecode.contenttype import get_type
  File "/usr/share/scancode-toolkit/src/typecode/contenttype.py", line 47, in <module>
    from typecode import magic2
  File "/usr/share/scancode-toolkit/src/typecode/magic2.py", line 221, in <module>
    libmagic = load_lib()
  File "/usr/share/scancode-toolkit/src/typecode/magic2.py", line 214, in load_lib
    lib = ctypes.CDLL(magic_so)
  File "/usr/local/lib/python2.7/ctypes/__init__.py", line 365, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /usr/share/scancode-toolkit/src/typecode/bin/linux-64/lib/libmagic.so)
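
The failing step in the traceback is the `ctypes.CDLL(magic_so)` call. As a minimal sketch, that step could be reproduced in isolation, outside of ScanCode, roughly as follows. The helper name `try_load` is hypothetical (it is not part of ScanCode), and the library path is simply the one from the error message above.

```python
import ctypes


def try_load(path):
    """Try to dlopen a shared library; return (True, None) or (False, error text)."""
    try:
        ctypes.CDLL(path)
    except OSError as exc:
        # dlopen failures (missing file, unmet GLIBC_x.y symbol version, ...)
        # surface as OSError, as in the traceback above.
        return False, str(exc)
    return True, None


if __name__ == "__main__":
    # Path taken from the error message in this report; adjust to your install.
    lib = "/usr/share/scancode-toolkit/src/typecode/bin/linux-64/lib/libmagic.so"
    ok, error = try_load(lib)
    print("loaded" if ok else "failed: %s" % error)
```

Running this after any attempted fix gives a quick check of whether the bundled libmagic loads, without waiting for the license index to build.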

GLIBC version is:

ldd --version
ldd (GNU libc) 2.12
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.
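
The mismatch between the installed glibc (2.12) and the version named in the error (`GLIBC_2.14`) can also be checked from Python. The sketch below assumes Python 3 and uses the standard library's `platform.libc_ver()`; the helper names `parse_version` and `glibc_is_too_old` are hypothetical. Versions are compared as integer tuples because a plain string comparison would order "2.9" after "2.14".

```python
import platform


def parse_version(text):
    """Turn a dotted version such as '2.12' into a comparable tuple (2, 12)."""
    return tuple(int(part) for part in text.split("."))


def glibc_is_too_old(found, required):
    """Return True when the found glibc version is lower than the required one."""
    return parse_version(found) < parse_version(required)


if __name__ == "__main__":
    # platform.libc_ver() reports the C library the Python binary is linked
    # against; it returns empty strings when it cannot tell.
    name, version = platform.libc_ver()
    if name == "glibc" and version:
        status = "too old" if glibc_is_too_old(version, "2.14") else "ok"
        print("glibc %s: %s" % (version, status))
    else:
        print("could not determine the glibc version")
```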

I tried to install glibc 2.14 manually:

#install glibc 2.14
mkdir -p /glibc_install && \
cd /glibc_install && \
wget -q http://ftp.gnu.org/gnu/glibc/glibc-2.14.tar.gz && \
tar zxf glibc-2.14.tar.gz && \
cd glibc-2.14 && \
mkdir -p build && \
cd build && \
../configure --prefix=/opt/glibc-2.14 && \
mkdir -p /opt/glibc-2.14/etc && \
touch /opt/glibc-2.14/etc/ld.so.conf && \
make -j4 && \
make install && \
export LD_LIBRARY_PATH=/opt/glibc-2.14/lib

Still no luck...

pombredanne commented 7 years ago

Thanks for the report and sorry for this issue! There is a similar issue at #463 and maybe #443. It comes from the shameful and terrible fact that the codebase contains pre-built binaries (here you are stumbling on libmagic), and these may not have been built on an old enough version of Linux. See also #469.

It may be solvable with some symlinks, per @akaihola in #443, but I am not sure.

It starts working correctly if I run this:

sudo ln -s /usr/lib64/libbz2.so.1 /usr/lib64/libbz2.so.1.0

Is there anything special about your centos 6.6 installation? e.g. If I spin a VM or a container, this is a vanilla one?

pombredanne commented 7 years ago

The culprit shared objects and binaries in your case are:

Technically these were built exactly from https://github.com/nexB/scancode-thirdparty-src

So the process to solve all this mess is:

  1. #469 which is the right and long term solution

  2. short term, rebuilding on your OS the three libraries above and replace the ScanCode ones.

spbkelt commented 7 years ago

Is there anything special about your centos 6.6 installation? e.g. If I spin a VM or a container, this is a vanilla one?

We have centos:6.6 as base image for our container with build agent where we run scancode. So we have a lot of software there installed. Not pure vanilla definitely.

#469 which is the right and long term solution

From #469 I gather that you still don't have RPM/DEB packages, so that solution is not available to us.

short term, rebuilding on your OS the three libraries above and replace the ScanCode ones.

Could you please provide actual build scripts? And how to replace/bundle everything after that?

pombredanne commented 7 years ago

You wrote:

Could you please provide actual build scripts? And how to replace/bundle everything after that

I hate this! But I created this mess in the first place, so this is the least I could do for you. It should be a straight ./configure && make ... and then copying the bits into the right places... But let me craft this for you :)

spbkelt commented 7 years ago

But let me craft this for you :)

Awesome. It's your mess so ...please :)

And why did you mention those libraries above? The only error I have is about libmagic.so.

pombredanne commented 7 years ago

We need them all for things to work. Here is the procedure:

git clone https://github.com/nexB/scancode-toolkit.git
git clone https://github.com/nexB/scancode-thirdparty-src.git
pushd scancode-thirdparty-src
./build.sh
popd
# check the diff
pushd scancode-toolkit
git status
# run the tests on 4 CPUs
./configure --clean && ./configure
bin/py.test -vvs -n4

It is OK if test_scan_can_handle_weird_file_names fails.
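
When reviewing the results of such a run, the one failure described as acceptable above can be filtered out so that only unexpected failures remain. This is only a sketch: the names `KNOWN_OK_FAILURES` and `unexpected_failures` are hypothetical, and the list of failed test names is assumed to have been collected by hand or from the pytest summary.

```python
# Test names that the comment above says may fail without indicating a problem.
KNOWN_OK_FAILURES = {"test_scan_can_handle_weird_file_names"}


def unexpected_failures(failed_tests, known_ok=KNOWN_OK_FAILURES):
    """Return the failed test names that are not in the known-acceptable set."""
    return [name for name in failed_tests if name not in known_ok]


if __name__ == "__main__":
    # Example input: two failed test names, one of which is the acceptable one.
    failed = [
        "test_scan_can_handle_weird_file_names",
        "test_extract_7z_with_relative_path",
    ]
    for name in unexpected_failures(failed):
        print("needs a look:", name)
```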

pombredanne commented 7 years ago

Note that these build scripts are also prep work needed for #469 :) ... so this is timely.

spbkelt commented 7 years ago

I have these test failures:

XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_copr3_correct
  reason: 
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_copr2_correct
  reason: 
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_germany_should_detect_trailing_city
  reason: 
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_in_html_comments
  reason: 
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_in_windows_binary_lib
  reason: 
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_java
  reason: 
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_copr5_correct
  reason: 
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_in_dll_exact
  reason: 
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_in_html_incorrect
  reason: 
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_json_phps_html
  reason: 
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_license__qpl_v1_0_perfect
  reason: 
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_license_text_doc
  reason: 
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_url_in_html
  reason: 
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_json_in_phps
  reason: 
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_piersol
  reason: 
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_license_text_scilab
  reason: 
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_dbus_dbus_dbus_sha_c_trail_name
  reason: 
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_chromium_chrome_common_extensions_docs_examples_apps_hello_python_httplib2_init_py_extra_contributors
  reason: 
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_guava_guava_ipr_markup
  reason: 
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_kernel_headers_original_linux_cdrom_h_trail_email
  reason: 
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_lohit_fonts_notice_trail_url
  reason: 
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_bluetooth_bluez_audio_gateway_c_trail_name
  reason: 
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_srec_tools_grxmlcompile_grxmlcompile_cpp
  reason: 
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_quake_quake_src_qw_client_menu_c_trail_name
  reason: 
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_tcpdump_print_rx_c_trail_name
  reason: 
XFAIL tests/cluecode/test_finder.py::TestUrl::test_misc_valid_unicode_or_punycode_urls_that_should_pass
  reason: 
XFAIL tests/cluecode/test_holders.py::TestHolders::test_holder_multiline
  reason: 
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_tcpdump_ieee802_11_h_trail_email
  reason: 
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_tcpdump_print_snmp_c_trail_name_lead_name_trail_name_complex
  reason: 
XFAIL tests/extractcode/test_extract.py::TestExtract::test_extract_directory_of_windows_ar_archives
  reason: 
XFAIL tests/extractcode/test_extract.py::TestExtract::test_extract_with_kinds
  reason: 
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_freetype_src_base_ftbase_h_trail_name
  reason: 
XFAIL tests/extractcode/test_patch.py::TestPatchInfoFailing::test_patch_info_patch_patches_misc_webkit_opensource_patches_sync_xhr_patch
  reason: 
XFAIL tests/extractcode/test_patch.py::TestPatchInfoFailing::test_patch_info_patch_patches_problematic_opensso_patch
  reason: 
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_kernel_headers_original_linux_netfilter_xt_connmark_h_trail_url
  reason: 
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_libvpx_examples_includes_geshi_docs_geshi_doc_txt_trail_email_trail_url_misc
  reason: 
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_lohit_fonts_lohit_bengali_ttf_copyright_trail_url
  reason: 
XFAIL tests/cluecode/test_finder.py::TestUrl::test_misc_invalid_urls_that_crash
  reason: 
XFAIL tests/cluecode/test_finder.py::TestUrl::test_misc_valid_urls_that_should_pass
  reason: 
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_d_zlib_and_gfdl_1_2_and_gpl_and_gpl_and_other_txt
  reason: 
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_eclipse_openj9_html_html
  reason: 
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_flt9_gif
  reason: 
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_gpl_2_0_and_lppl_1_3c_and_public_domain_1_copyright
  reason: 
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_gpl_2_0_and_lppl_1_3c_and_public_domain_label
  reason: 
XFAIL tests/typecode/test_contenttype.py::TestContentType::test_build_makefile_inc_is_not_povray
  reason: 
XFAIL tests/typecode/test_contenttype.py::TestContentType::test_code_groff
  reason: 
XFAIL tests/typecode/test_contenttype.py::TestContentType::test_filetype_file_on_unicode_file_name2
  reason: 
XFAIL tests/typecode/test_contenttype.py::TestContentType::test_text_rsync_file_is_not_octet_stream
  reason: 
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_aes_128_3_0_and_bsd_new_and_bsd_original_uc_and_bsd_simplified_and_other_txt
  reason: 
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_apache_1_1_and_apache_2_0_and_cpl_1_0_and_epl_1_0_and_other_txt
  reason: 
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_apache_2_0_and_apache_2_0_and_bsd_new_and_gpl_and_other_1_txt
  reason: 
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_apache_2_0_and_apache_2_0_and_bsd_new_and_gpl_and_other_2_txt
  reason: 
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_apache_2_0_and_apache_2_0_and_bsd_new_and_gpl_and_other_txt
  reason: 
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_boost_1_0_and_bsd_simplified_and_cddl_1_0_and_gpl_2_0_classpath_and_other_txt
  reason: 
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_bsd_new_and_bsd_new_and_bsd_new_and_bsd_new_and_other_1_txt
  reason: 
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_bsd_new_and_bsd_simplified_and_lgpl_and_lgpl_2_0_plus_and_other_txt
  reason: 
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_bsd_new_and_gpl_2_0_and_gpl_3_0_and_public_domain_and_other_txt
  reason: 
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_bsd_original_and_bsd_simplified_and_mit_and_mit_and_other_txt
  reason: 

=================================== FAILURES ===================================
___________ TestPermissions.test_copyfile_does_not_keep_permissions ____________
[gw1] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
self = <test_fileutils.TestPermissions testMethod=test_copyfile_does_not_keep_permissions>

    def test_copyfile_does_not_keep_permissions(self):
        src_file = self.get_temp_file()
        dest = self.get_temp_dir()
        with open(src_file, 'wb') as f:
            f.write('')
        try:
            make_non_readable(src_file)
            if on_posix:
>               assert not filetype.is_readable(src_file)
E               AssertionError: assert not True
E                +  where True = <function is_readable at 0x7feb75141c80>('/tmp/scancode_root/tst/ UhH1AU/6_pAoB/td/tf.txt')
E                +    where <function is_readable at 0x7feb75141c80> = filetype.is_readable

tests/commoncode/test_fileutils.py:156: AssertionError
__________________ TestPermissions.test_chmod_read_write_file __________________
[gw0] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
self = <test_fileutils.TestPermissions testMethod=test_chmod_read_write_file>

    def test_chmod_read_write_file(self):
        test_dir = self.get_test_loc('fileutils/executable', copy=True)
        test_file = join(test_dir, 'deep1', 'deep2', 'ctags')

        try:
            make_non_writable(test_file)
>           assert not filetype.is_writable(test_file)
E           AssertionError: assert not True
E            +  where True = <function is_writable at 0x390ccf8>('/tmp/scancode_root/tst/ dMZXPe/OX2wOp/executable/deep1/deep2/ctags')
E            +    where <function is_writable at 0x390ccf8> = filetype.is_writable

tests/commoncode/test_fileutils.py:122: AssertionError
____________ TestPermissions.test_copytree_copies_unreadable_files _____________
[gw0] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
self = <test_fileutils.TestPermissions testMethod=test_copytree_copies_unreadable_files>

    def test_copytree_copies_unreadable_files(self):
        src = self.get_test_loc('fileutils/exec', copy=True)
        dst = self.get_temp_dir()
        src_file1 = join(src, 'a.bat')
        src_file2 = join(src, 'subtxt', 'a.txt')

        try:
            # make some unreadable source files
            make_non_readable(src_file1)
            if on_posix:
>               assert not filetype.is_readable(src_file1)
E               AssertionError: assert not True
E                +  where True = <function is_readable at 0x390cc80>('/tmp/scancode_root/tst/ dMZXPe/Z_L8Rs/exec/a.bat')
E                +    where <function is_readable at 0x390cc80> = filetype.is_readable

tests/commoncode/test_fileutils.py:204: AssertionError
_______________ TestSevenZip.test_extract_7z_with_relative_path ________________
[gw0] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
self = <test_archive.TestSevenZip testMethod=test_extract_7z_with_relative_path>

    def test_extract_7z_with_relative_path(self):
        test_file = self.get_test_loc('archive/7z/7zip_relative.7z')
        test_dir = self.get_temp_dir()
        result = archive.extract_7z(test_file, test_dir)
        non_result = os.path.join(test_dir, '../a_parent_folder.txt')
        assert not os.path.exists(non_result)
        assert [] == result
        extracted = self.collect_extracted_path(test_dir)
        expected = [
            '/dotdot/',
            '/dotdot/2folder/',
            '/dotdot/2folder/3folder/',
            '/dotdot/2folder/3folder/relative_file',
            '/dotdot/2folder/3folder/relative_file~',
            '/dotdot/2folder/relative_file',
            '/dotdot/relative_file'
        ]
>       assert expected == extracted
E       AssertionError: assert ['/dotdot/', ...ve_file', ...] == []
E         Left contains more items, first extra item: '/dotdot/'
E         Full diff:
E         + []
E         - [u'/dotdot/',
E         -  u'/dotdot/2folder/',
E         -  u'/dotdot/2folder/3folder/',
E         -  u'/dotdot/2folder/3folder/relative_file',
E         -  u'/dotdot/2folder/3folder/relative_file~',
E         -  u'/dotdot/2folder/relative_file',
E         -  u'/dotdot/relative_file']

tests/extractcode/test_archive.py:1682: AssertionError
_ TestExtractArchiveWithIllegalFilenamesWithLibarchiveOnLinux.test_extract_7zip_with_weird_filenames_with_libarchive _
[gw0] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
self = <test_archive.TestExtractArchiveWithIllegalFilenamesWithLibarchiveOnLinux testMethod=test_extract_7zip_with_weird_filenames_with_libarchive>

    def test_extract_7zip_with_weird_filenames_with_libarchive(self):
        test_file = self.get_test_loc('archive/weird_names/weird_names.7z')
>       self.check_extract(libarchive2.extract, test_file, expected_warnings=[], expected_suffix='libarch')

tests/extractcode/test_archive.py:2145: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
tests/extractcode/test_archive.py:2109: in check_extract
    warnings = test_function(test_file, test_dir)
src/extractcode/libarchive2.py:144: in extract
    for entry in list_entries(abs_location):
src/extractcode/libarchive2.py:169: in list_entries
    for entry in archive:
src/extractcode/libarchive2.py:243: in __iter__
    r = next_entry(self.archive_struct, entry_struct)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

rc = -30, archive_func = <_FuncPtr object at 0x41d9600>
args = (101088368, 112322176), null = False

    def errcheck(rc, archive_func, args, null=False):
        """
        ctypes error check handler for functions returning int, or null if null is True.
        """
        if null:
            if rc is None:
                archive_struct = args and len(args) > 1 and args[0] or None
                raise ArchiveError(rc, archive_struct, archive_func)
            else:
                return rc

        if rc >= ARCHIVE_OK:
            return rc

        archive_struct = args[0]
        if rc == ARCHIVE_RETRY:
            raise ArchiveErrorRetryable(rc, archive_struct, archive_func)

        if rc == ARCHIVE_WARN:
            raise ArchiveWarning(rc, archive_struct, archive_func)

        # anything else is a serious error, in general not recoverable.
>       raise ArchiveError(rc, archive_struct, archive_func)
E       ArchiveError: LZMA codec is unsupported

src/extractcode/libarchive2.py:453: ArchiveError
_ TestExtractArchiveWithIllegalFilenamesWithLibarchiveOnLinuxWarnings.test_extract_7zip_with_weird_filenames_with_libarchive _
[gw0] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
self = <test_archive.TestExtractArchiveWithIllegalFilenamesWithLibarchiveOnLinuxWarnings testMethod=test_extract_7zip_with_weird_filenames_with_libarchive>

    def test_extract_7zip_with_weird_filenames_with_libarchive(self):
        test_file = self.get_test_loc('archive/weird_names/weird_names.7z')
>       self.check_extract(libarchive2.extract, test_file, expected_warnings=[], expected_suffix='libarch')

tests/extractcode/test_archive.py:2145: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
tests/extractcode/test_archive.py:2109: in check_extract
    warnings = test_function(test_file, test_dir)
src/extractcode/libarchive2.py:144: in extract
    for entry in list_entries(abs_location):
src/extractcode/libarchive2.py:169: in list_entries
    for entry in archive:
src/extractcode/libarchive2.py:243: in __iter__
    r = next_entry(self.archive_struct, entry_struct)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

rc = -30, archive_func = <_FuncPtr object at 0x41d9600>
args = (93179216, 97408624), null = False

    def errcheck(rc, archive_func, args, null=False):
        """
        ctypes error check handler for functions returning int, or null if null is True.
        """
        if null:
            if rc is None:
                archive_struct = args and len(args) > 1 and args[0] or None
                raise ArchiveError(rc, archive_struct, archive_func)
            else:
                return rc

        if rc >= ARCHIVE_OK:
            return rc

        archive_struct = args[0]
        if rc == ARCHIVE_RETRY:
            raise ArchiveErrorRetryable(rc, archive_struct, archive_func)

        if rc == ARCHIVE_WARN:
            raise ArchiveWarning(rc, archive_struct, archive_func)

        # anything else is a serious error, in general not recoverable.
>       raise ArchiveError(rc, archive_struct, archive_func)
E       ArchiveError: LZMA codec is unsupported

src/extractcode/libarchive2.py:453: ArchiveError
__________________ TypeTest.test_is_readable_is_writeable_dir __________________
[gw2] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
self = <test_filetype.TypeTest testMethod=test_is_readable_is_writeable_dir>

    def test_is_readable_is_writeable_dir(self):
        base_dir = self.get_test_loc('filetype/readwrite', copy=True)
        test_dir = os.path.join(base_dir, 'sub')

        try:
            assert filetype.is_readable(test_dir)
            assert filetype.is_writable(test_dir)

            make_non_readable(test_dir)
            if on_posix:
>               assert not filetype.is_readable(test_dir)
E               AssertionError: assert not True
E                +  where True = <function is_readable at 0x7f1ba0244c80>('/tmp/scancode_root/tst/ wZKyXy/nth2d7/readwrite/sub')
E                +    where <function is_readable at 0x7f1ba0244c80> = filetype.is_readable

tests/commoncode/test_filetype.py:111: AssertionError
_________ TestPermissions.test_chmod_read_write_non_recursively_on_dir _________
[gw2] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
self = <test_fileutils.TestPermissions testMethod=test_chmod_read_write_non_recursively_on_dir>

    def test_chmod_read_write_non_recursively_on_dir(self):
        test_dir = self.get_test_loc('fileutils/executable', copy=True)
        test_file = join(test_dir, 'deep1', 'deep2', 'ctags')
        test_dir = join(test_dir, 'deep1', 'deep2')
        parent = join(test_dir, 'deep1')

        try:
            # setup
            make_non_writable(test_file)
>           assert not filetype.is_writable(test_file)
E           AssertionError: assert not True
E            +  where True = <function is_writable at 0x7f1ba0244cf8>('/tmp/scancode_root/tst/ wZKyXy/lXWRvj/executable/deep1/deep2/ctags')
E            +    where <function is_writable at 0x7f1ba0244cf8> = filetype.is_writable

tests/commoncode/test_fileutils.py:95: AssertionError
_____ TestPermissions.test_copytree_does_not_keep_non_writable_permissions _____
[gw2] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
self = <test_fileutils.TestPermissions testMethod=test_copytree_does_not_keep_non_writable_permissions>

    def test_copytree_does_not_keep_non_writable_permissions(self):
        src = self.get_test_loc('fileutils/exec', copy=True)
        dst = self.get_temp_dir()

        try:
            src_file = join(src, 'subtxt/a.txt')
            make_non_writable(src_file)
>           assert not filetype.is_writable(src_file)
E           AssertionError: assert not True
E            +  where True = <function is_writable at 0x7f1ba0244cf8>('/tmp/scancode_root/tst/ wZKyXy/OnUCnq/exec/subtxt/a.txt')
E            +    where <function is_writable at 0x7f1ba0244cf8> = filetype.is_writable

tests/commoncode/test_fileutils.py:172: AssertionError
_________________ TypeTest.test_is_readable_is_writeable_file __________________
[gw3] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
self = <test_filetype.TypeTest testMethod=test_is_readable_is_writeable_file>

    def test_is_readable_is_writeable_file(self):
        base_dir = self.get_test_loc('filetype/readwrite', copy=True)
        test_file = os.path.join(os.path.join(base_dir, 'sub'), 'file')

        try:
            assert filetype.is_readable(test_file)
            assert filetype.is_writable(test_file)

            make_non_readable(test_file)
            if on_posix:
>               assert not filetype.is_readable(test_file)
E               AssertionError: assert not True
E                +  where True = <function is_readable at 0x2e9ec80>('/tmp/scancode_root/tst/ zmhaL0/NQuSjT/readwrite/sub/file')
E                +    where <function is_readable at 0x2e9ec80> = filetype.is_readable

tests/commoncode/test_filetype.py:94: AssertionError
___________ TestPermissions.test_chmod_read_write_recursively_on_dir ___________
[gw3] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
self = <test_fileutils.TestPermissions testMethod=test_chmod_read_write_recursively_on_dir>

    def test_chmod_read_write_recursively_on_dir(self):
        test_dir = self.get_test_loc('fileutils/executable', copy=True)
        test_file = join(test_dir, 'deep1', 'deep2', 'ctags')
        test_dir2 = join(test_dir, 'deep1', 'deep2')
        parent = join(test_dir, 'deep1')

        try:
            make_non_writable(test_file)
>           assert not filetype.is_writable(test_file)
E           AssertionError: assert not True
E            +  where True = <function is_writable at 0x2e9ecf8>('/tmp/scancode_root/tst/ zmhaL0/3jv4ZH/executable/deep1/deep2/ctags')
E            +    where <function is_writable at 0x2e9ecf8> = filetype.is_writable

tests/commoncode/test_fileutils.py:65: AssertionError
____________________ test_scan_can_handle_weird_file_names _____________________
[gw0] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
    @skipIf(on_windows, 'This test cannot run on windows as these are not legal file names.')
    def test_scan_can_handle_weird_file_names():
        test_dir = test_env.extract_test_tar('weird_file_name/weird_file_name.tar.gz')
        result_file = test_env.get_temp_file('json')

        result = run_scan_click(['-c', '-i', '--strip-root', test_dir, result_file])
        assert result.exit_code == 0
        assert "KeyError: 'sha1'" not in result.output
        assert 'Scanning done' in result.output

        # Some info vary on each OS
        # See https://github.com/nexB/scancode-toolkit/issues/438 for details
        if on_linux:
            expected = 'weird_file_name/expected-linux.json'
        elif on_mac:
            expected = 'weird_file_name/expected-mac.json'
        else:
            raise Exception('Not a supported OS?')
>       check_json_scan(test_env.get_test_loc(expected), result_file, regen=False)

tests/scancode/test_cli.py:581: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

expected_file = '/opt/Python-2.7.6/scancode-toolkit/tests/scancode/data/weird_file_name/expected-linux.json'
result_file = '/tmp/scancode_root/tst/ dMZXPe/Ftcjfc/td/tf.json', regen = False
strip_dates = False

    def check_json_scan(expected_file, result_file, regen=False, strip_dates=False):
        """
        Check the scan result_file JSON results against the expected_file expected JSON
        results. Removes references to test_dir for the comparison. If regen is True the
        expected_file WILL BE overwritten with the results. This is convenient for
        updating tests expectations. But use with caution.
        """
        result = _load_json_result(result_file)
        if strip_dates:
            remove_dates(result)
        if regen:
            with open(expected_file, 'wb') as reg:
                json.dump(result, reg, indent=2, separators=(',', ': '))
        expected = _load_json_result(expected_file)
        if strip_dates:
            remove_dates(expected)

        # NOTE we redump the JSON as a string for a more efficient comparison of
        # failures
        expected = json.dumps(expected, indent=2, sort_keys=True, separators=(',', ': '))
        result = json.dumps(result, indent=2, sort_keys=True, separators=(',', ': '))
>       assert expected == result
E       assert '{\n  "files":...": true\n  }\n}' == '{\n  "files": ...": true\n  }\n}'
E           {
E             "files": [
E               {
E                 "base_name": "some 'file",
E                 "copyrights": [],
E                 "date": "2016-12-21",
E                 "extension": "",
E                 "file_type": "POSIX shell script, ASCII text executable",
E                 "files_count": null,
E                 "is_archive": false,
E                 "is_binary": false,
E                 "is_media": false,
E                 "is_script": true,
E                 "is_source": true,
E                 "is_text": true,
E                 "md5": "62c4cdf80d860c09f215ffff0a9ed020",
E                 "mime_type": "text/x-shellscript",
E                 "name": "some 'file",
E                 "path": "some 'file",
E                 "programming_language": "Bash",
E                 "scan_errors": [],
E                 "sha1": "715037088f2582f3fbb7e9492f819987f713a332",
E                 "size": 20,
E                 "type": "file"
E               },
E               {
E                 "base_name": "some \\file",
E                 "copyrights": [],
E                 "date": "2016-12-21",
E                 "extension": "",
E                 "file_type": "POSIX shell script, ASCII text executable",
E                 "files_count": null,
E                 "is_archive": false,
E                 "is_binary": false,
E                 "is_media": false,
E                 "is_script": true,
E                 "is_source": true,
E                 "is_text": true,
E                 "md5": "e99c06d03836700154f01778ac782d50",
E                 "mime_type": "text/x-shellscript",
E                 "name": "some \\file",
E                 "path": "some /file",
E                 "programming_language": "Bash",
E                 "scan_errors": [],
E                 "sha1": "73e029b07257966106d79d35271bf400e3543cea",
E                 "size": 21,
E                 "type": "file"
E               },
E               {
E                 "base_name": "some file",
E                 "copyrights": [],
E                 "date": "2016-12-21",
E                 "extension": "",
E         -       "file_type": "Node.js script, ASCII text executable",
E         ?                     ^   ---
E         +       "file_type": "a /usr/bin/env node script, ASCII text executable",
E         ?                     ^^^^^^^^^^^^^^^^
E                 "files_count": null,
E                 "is_archive": false,
E                 "is_binary": false,
E                 "is_media": false,
E                 "is_script": true,
E                 "is_source": true,
E                 "is_text": true,
E                 "md5": "41ac81497162f2ff48a0442847238ad7",
E         -       "mime_type": "application/javascript",
E         +       "mime_type": "text/plain",
E                 "name": "some file",
E                 "path": "some file",
E                 "programming_language": null,
E                 "scan_errors": [],
E                 "sha1": "5fbba80b758b93a311369979d8a68f22c4817d37",
E                 "size": 38,
E                 "type": "file"
E               },
E               {
E                 "base_name": "some\"file",
E                 "copyrights": [],
E                 "date": "2016-12-21",
E                 "extension": "",
E         -       "file_type": "Node.js script, ASCII text executable",
E         ?                     ^   ---
E         +       "file_type": "a /usr/bin/env node script, ASCII text executable",
E         ?                     ^^^^^^^^^^^^^^^^
E                 "files_count": null,
E                 "is_archive": false,
E                 "is_binary": false,
E                 "is_media": false,
E                 "is_script": true,
E                 "is_source": true,
E                 "is_text": true,
E                 "md5": "9153a386e70bd1713fef91121fb9cbbf",
E         -       "mime_type": "application/javascript",
E         +       "mime_type": "text/plain",
E                 "name": "some\"file",
E                 "path": "some\"file",
E                 "programming_language": null,
E                 "scan_errors": [],
E                 "sha1": "b2016984d073f405f9788fbf6ae270b452ab73b0",
E                 "size": 39,
E                 "type": "file"
E               },
E               {
E                 "base_name": "some\\\"file",
E                 "copyrights": [],
E                 "date": "2016-12-21",
E                 "extension": "",
E                 "file_type": "POSIX shell script, ASCII text executable",
E                 "files_count": null,
E                 "is_archive": false,
E                 "is_binary": false,
E                 "is_media": false,
E                 "is_script": true,
E                 "is_source": true,
E                 "is_text": true,
E                 "md5": "e99c06d03836700154f01778ac782d50",
E                 "mime_type": "text/x-shellscript",
E                 "name": "some\\\"file",
E                 "path": "some/\"file",
E                 "programming_language": "Bash",
E                 "scan_errors": [],
E                 "sha1": "73e029b07257966106d79d35271bf400e3543cea",
E                 "size": 21,
E                 "type": "file"
E               }
E             ],
E             "files_count": 5,
E             "scancode_notice": "Generated with ScanCode and provided on an \"AS IS\" BASIS, WITHOUT WARRANTIES\nOR CONDITIONS OF ANY KIND, either express or implied. No content created from\nScanCode should be considered or used as legal advice. Consult an Attorney\nfor any legal advice.\nScanCode is a free software code scanning tool from nexB Inc. and others.\nVisit https://github.com/nexB/scancode-toolkit/ for support and download.",
E             "scancode_options": {
E               "--copyright": true,
E               "--format": "json",
E               "--info": true,
E               "--license-score": 0,
E               "--strip-root": true
E             }
E           }

src/scancode/cli_test_utils.py:67: AssertionError
====== 12 failed, 12897 passed, 69 skipped, 58 xfailed in 2539.42 seconds ======
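
The failing assertion above re-dumps both JSON documents with sorted keys before comparing them as strings. A minimal sketch of the same idea in Python 3, using the standard-library difflib to print only the differing lines. The helper names `normalized` and `json_diff` are hypothetical and not part of ScanCode; the sample data is trimmed from the failure above.

```python
import difflib
import json


def normalized(data):
    """Dump data as sorted, indented JSON so equal content yields equal text."""
    return json.dumps(data, indent=2, sort_keys=True, separators=(',', ': '))


def json_diff(expected, result):
    """Return unified-diff lines between two JSON-serializable objects."""
    return list(difflib.unified_diff(
        normalized(expected).splitlines(),
        normalized(result).splitlines(),
        fromfile='expected',
        tofile='result',
        lineterm='',
    ))


# Sample values copied from the "some file" entry in the failure above.
expected = {
    "file_type": "Node.js script, ASCII text executable",
    "mime_type": "application/javascript",
    "size": 38,
}
result = {
    "size": 38,
    "mime_type": "text/plain",
    "file_type": "a /usr/bin/env node script, ASCII text executable",
}

for line in json_diff(expected, result):
    print(line)
```

Because both sides are dumped with `sort_keys=True`, key order never shows up as a difference; only changed values such as `file_type` and `mime_type` do.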
spbkelt commented 7 years ago

Sorry for the dumb question. Is it possible to have a minimal setup? And yes, I know about the package, a.k.a. the long-term solution, which is not available anyway. I am creating a Docker image, so I want to keep it as small as possible. Otherwise I need some cleanup procedure afterwards, for example removing all the Python test packages at least.

Successfully installed apipkg-1.4 bumpversion-0.5.4.dev0 codecov-2.0.9 coverage-4.4.1 execnet-1.4.1 py-1.4.33 pytest-3.1.0 pytest-cov-2.5.1 pytest-xdist-1.16.0 xmltodict-0.11.0

What can be removed from the list?

pombredanne commented 7 years ago

@spbkelt these test packages are only present in a checkout, not in the release archives. For a trimmed installation from a checkout without them, you could also run ./configure etc/conf .... This will NOT install the development libraries you have listed above.

pombredanne commented 7 years ago

So run ./configure --clean and then ./configure etc/conf

pombredanne commented 7 years ago

Of the test failures, only test_extract_7z_with_relative_path is a tad puzzling ... but it comes from the LZMA dev library not being installed when you built libarchive.

You can mostly safely ignore the others.

spbkelt commented 7 years ago

Please review my install script

git clone https://github.com/nexB/scancode-toolkit.git
git clone https://github.com/nexB/scancode-thirdparty-src.git
pushd scancode-thirdparty-src
./build.sh
popd
# check the diff
pushd scancode-toolkit
git status
./configure --clean && ./configure etc/conf

#install scancode-toolkit
mkdir --parents /usr/share/scancode-toolkit && \
wget --quiet --directory-prefix=/usr/tmp/ https://github.com/nexB/scancode-toolkit/releases/download/v2.2.1/scancode-toolkit-2.2.1.tar.bz2 && \
tar --extract --bzip2 --file=/usr/tmp/scancode-toolkit-2.2.1.tar.bz2 --directory=/usr/share/scancode-toolkit --strip-components=1 && \
rm --force /usr/tmp/scancode-toolkit-2.2.1.tar.bz2
pombredanne commented 7 years ago

@spbkelt what's your goal here? Build a container with a minimal scancode in it?

spbkelt commented 6 years ago

The issue persists. I installed into a Docker image using the script above and it didn't help.

/usr/share/scancode-toolkit/scancode --help
/usr/share/scancode-toolkit/scancode --format html-app --diag --timeout 3600 -n 4 --ignore "*.war" --ignore "*.zip" --ignore "*.jar" . oss-report.html
[11:16:43][Step 5/5] * Building license index...
[11:16:43][Step 5/5]   File "/usr/share/scancode-toolkit/etc/conf/base.py", line 59, in <module>
[11:16:43][Step 5/5]     build_license_cache()
[11:16:43][Step 5/5]   File "/usr/share/scancode-toolkit/etc/conf/base.py", line 56, in build_license_cache
[11:16:43][Step 5/5]     cache.reindex()
[11:16:43][Step 5/5]   File "/usr/share/scancode-toolkit/src/licensedcode/cache.py", line 108, in get_or_build_index_through_cache
[11:16:43][Step 5/5]     from licensedcode.index import LicenseIndex
[11:16:43][Step 5/5]   File "/usr/share/scancode-toolkit/src/licensedcode/index.py", line 47, in <module>
[11:16:43][Step 5/5]     from licensedcode import match
[11:16:43][Step 5/5]   File "/usr/share/scancode-toolkit/src/licensedcode/match.py", line 36, in <module>
[11:16:43][Step 5/5]     from licensedcode import query
[11:16:43][Step 5/5]   File "/usr/share/scancode-toolkit/src/licensedcode/query.py", line 32, in <module>
[11:16:43][Step 5/5]     import typecode
[11:16:43][Step 5/5]   File "/usr/share/scancode-toolkit/src/typecode/__init__.py", line 27, in <module>
[11:16:43][Step 5/5]     from typecode.contenttype import get_type
[11:16:43][Step 5/5]   File "/usr/share/scancode-toolkit/src/typecode/contenttype.py", line 47, in <module>
[11:16:43][Step 5/5]     from typecode import magic2
[11:16:43][Step 5/5]   File "/usr/share/scancode-toolkit/src/typecode/magic2.py", line 221, in <module>
[11:16:43][Step 5/5]     libmagic = load_lib()
[11:16:43][Step 5/5]   File "/usr/share/scancode-toolkit/src/typecode/magic2.py", line 214, in load_lib
[11:16:43][Step 5/5]     lib = ctypes.CDLL(magic_so)
[11:16:43][Step 5/5]   File "/usr/local/lib/python2.7/ctypes/__init__.py", line 365, in __init__
[11:16:43][Step 5/5]     self._handle = _dlopen(self._name, mode)
[11:16:43][Step 5/5] OSError: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /usr/share/scancode-toolkit/src/typecode/bin/linux-64/lib/libmagic.so)
[11:16:43][Step 5/5] * Activating ...
[11:16:43][Step 5/5] 
[11:16:43][Step 5/5] Failed to execute command:
[11:16:43][Step 5/5] /usr/share/scancode-toolkit/bin/python "/usr/share/scancode-toolkit/etc/conf/base.py". Aborting...
[11:16:44][Step 5/5] Usage: scancode [OPTIONS] <input> <output_file>
[11:16:44][Step 5/5] 
[11:16:44][Step 5/5]   scan the <input> file or directory for origin clues and license and save
[11:16:44][Step 5/5]   results to the <output_file>.
[11:16:44][Step 5/5] 
[11:16:44][Step 5/5]   The scan results are printed to stdout if <output_file> is not provided.
[11:16:44][Step 5/5]   Error and progress is printed to stderr.
[11:16:44][Step 5/5] 
[11:16:44][Step 5/5] Options:
[11:16:44][Step 5/5] 
[11:16:44][Step 5/5]   scans:
[11:16:44][Step 5/5]     -c, --copyright              Scan <input> for copyrights. [default]
[11:16:44][Step 5/5]     -l, --license                Scan <input> for licenses. [default]
[11:16:44][Step 5/5]     -p, --package                Scan <input> for packages. [default]
[11:16:44][Step 5/5]     -e, --email                  Scan <input> for emails.
[11:16:44][Step 5/5]     -u, --url                    Scan <input> for urls.
[11:16:44][Step 5/5]     -i, --info                   Include information such as size, type, etc.
[11:16:44][Step 5/5]     --license-score INTEGER      Do not return license matches with scores lower
[11:16:44][Step 5/5]                                  than this score. A number between 0 and 100.
[11:16:44][Step 5/5]                                  [default: 0]
[11:16:44][Step 5/5]     --license-text               Include the detected licenses matched text. Has
[11:16:44][Step 5/5]                                  no effect unless --license is requested.
[11:16:44][Step 5/5]     --license-url-template TEXT  Set the template URL used for the license
[11:16:44][Step 5/5]                                  reference URLs. In a template URL, curly braces
[11:16:44][Step 5/5]                                  ({}) are replaced by the license key.
[11:16:44][Step 5/5]                                  [default: https://enterprise.dejacode.com/urn/u
[11:16:44][Step 5/5]                                  rn:dje:license:{}]
[11:16:44][Step 5/5] 
[11:16:44][Step 5/5]   output:
[11:16:44][Step 5/5]     --strip-root           Strip the root directory segment of all paths. The
[11:16:44][Step 5/5]                            default is to always include the last directory
[11:16:44][Step 5/5]                            segment of the scanned path such that all paths have
[11:16:44][Step 5/5]                            a common root directory. This cannot be combined with
[11:16:44][Step 5/5]                            `--full-root` option.
[11:16:44][Step 5/5]     --full-root            Report full, absolute paths. The default is to always
[11:16:44][Step 5/5]                            include the last directory segment of the scanned
[11:16:44][Step 5/5]                            path such that all paths have a common root
[11:16:44][Step 5/5]                            directory. This cannot be combined with the `--strip-
[11:16:44][Step 5/5]                            root` option.
[11:16:44][Step 5/5]     -f, --format <format>  Set <output_file> format to one of: csv, html, html-
[11:16:44][Step 5/5]                            app, json, json-pp, jsonlines, spdx-rdf, spdx-tv or
[11:16:44][Step 5/5]                            use <format> as the path to a custom template file
[11:16:44][Step 5/5]                            [default: json]
[11:16:44][Step 5/5]     --verbose              Print verbose file-by-file progress messages.
[11:16:44][Step 5/5]     --quiet                Do not print summary or progress messages.
[11:16:44][Step 5/5] 
[11:16:44][Step 5/5]   pre-scan:
[11:16:44][Step 5/5]     --ignore <pattern>  Ignore files matching <pattern>.
[11:16:44][Step 5/5] 
[11:16:44][Step 5/5]   post-scan:
[11:16:44][Step 5/5]     --mark-source    Set the "is_source" flag to true for directories that
[11:16:44][Step 5/5]                      contain over 90% of source files as direct children. Has no
[11:16:44][Step 5/5]                      effect unless the --info scan is requested.
[11:16:44][Step 5/5]     --only-findings  Only return files or directories with findings for the
[11:16:44][Step 5/5]                      requested scans. Files and directories without findings are
[11:16:44][Step 5/5]                      omitted (not considering basic file information as
[11:16:44][Step 5/5]                      findings).
[11:16:44][Step 5/5] 
[11:16:44][Step 5/5]   misc:
[11:16:44][Step 5/5]     --reindex-licenses  Force a check and possible reindexing of the cached
[11:16:44][Step 5/5]                         license index.
[11:16:44][Step 5/5] 
[11:16:44][Step 5/5]   core:
[11:16:44][Step 5/5]     -h, --help               Show this message and exit.
[11:16:44][Step 5/5]     -n, --processes INTEGER  Scan <input> using n parallel processes.  [default:
[11:16:44][Step 5/5]                              1]
[11:16:44][Step 5/5]     --examples               Show command examples and exit.
[11:16:44][Step 5/5]     --about                  Show information about ScanCode and licensing and
[11:16:44][Step 5/5]                              exit.
[11:16:44][Step 5/5]     --version                Show the version and exit.
[11:16:44][Step 5/5]     --diag                   Include additional diagnostic information such as
[11:16:44][Step 5/5]                              error messages or result details.
[11:16:44][Step 5/5]     --timeout FLOAT          Stop scanning a file if scanning takes longer than
[11:16:44][Step 5/5]                              a timeout in seconds.  [default: 120]
[11:16:44][Step 5/5] 
[11:16:44][Step 5/5]   Examples (use --examples for more):
[11:16:44][Step 5/5] 
[11:16:44][Step 5/5]   Scan the 'samples' directory for licenses and copyrights.
[11:16:44][Step 5/5]   Save scan results to a JSON file:
[11:16:44][Step 5/5] 
[11:16:44][Step 5/5]       scancode --format json samples scancode_result.json
[11:16:44][Step 5/5] 
[11:16:44][Step 5/5]   Scan the 'samples' directory for licenses and copyrights. Save scan results to
[11:16:44][Step 5/5]   an HTML app file for interactive web browser results navigation. Additional app
[11:16:44][Step 5/5]   files are saved to the 'myscan_files' directory:
[11:16:44][Step 5/5] 
[11:16:44][Step 5/5]       scancode --format html-app samples myscan.html
[11:16:44][Step 5/5] 
[11:16:44][Step 5/5]   Note: when you run scancode, a progress bar is displayed with a counter of
[11:16:44][Step 5/5]   the number of files processed. Use --verbose to display file-by-file
[11:16:44][Step 5/5]   progress.
[11:16:45][Step 5/5] Scanning files for: licenses, copyrights, packages with 4 process(es)...
[11:16:45][Step 5/5] Building license detection index...Traceback (most recent call last):
[11:16:45][Step 5/5]   File "/usr/share/scancode-toolkit/bin/scancode", line 11, in <module>
[11:16:45][Step 5/5]     load_entry_point('scancode-toolkit', 'console_scripts', 'scancode')()
[11:16:45][Step 5/5]   File "/usr/share/scancode-toolkit/lib/python2.7/site-packages/click/core.py", line 722, in __call__
[11:16:45][Step 5/5]     return self.main(*args, **kwargs)
[11:16:45][Step 5/5]   File "/usr/share/scancode-toolkit/src/scancode/utils.py", line 74, in main
[11:16:45][Step 5/5]     standalone_mode=standalone_mode, **extra)
[11:16:45][Step 5/5]   File "/usr/share/scancode-toolkit/lib/python2.7/site-packages/click/core.py", line 697, in main
[11:16:45][Step 5/5]     rv = self.invoke(ctx)
[11:16:45][Step 5/5]   File "/usr/share/scancode-toolkit/lib/python2.7/site-packages/click/core.py", line 895, in invoke
[11:16:45][Step 5/5]     return ctx.invoke(self.callback, **ctx.params)
[11:16:45][Step 5/5]   File "/usr/share/scancode-toolkit/lib/python2.7/site-packages/click/core.py", line 535, in invoke
[11:16:45][Step 5/5]     return callback(*args, **kwargs)
[11:16:45][Step 5/5]   File "/usr/share/scancode-toolkit/lib/python2.7/site-packages/click/decorators.py", line 17, in new_func
[11:16:45][Step 5/5]     return f(get_current_context(), *args, **kwargs)
[11:16:45][Step 5/5]   File "/usr/share/scancode-toolkit/src/scancode/cli.py", line 490, in scancode
[11:16:45][Step 5/5]     pre_scan_plugins=pre_scan_plugins)
[11:16:45][Step 5/5]   File "/usr/share/scancode-toolkit/src/scancode/cli.py", line 572, in scan
[11:16:45][Step 5/5]     get_index(False)
[11:16:45][Step 5/5]   File "/usr/share/scancode-toolkit/src/licensedcode/cache.py", line 188, in get_index
[11:16:45][Step 5/5]     _LICENSES_INDEX = get_or_build_index_through_cache()
[11:16:45][Step 5/5]   File "/usr/share/scancode-toolkit/src/licensedcode/cache.py", line 108, in get_or_build_index_through_cache
[11:16:45][Step 5/5]     from licensedcode.index import LicenseIndex
[11:16:45][Step 5/5]   File "/usr/share/scancode-toolkit/src/licensedcode/index.py", line 47, in <module>
[11:16:45][Step 5/5]     from licensedcode import match
[11:16:45][Step 5/5]   File "/usr/share/scancode-toolkit/src/licensedcode/match.py", line 36, in <module>
[11:16:45][Step 5/5]     from licensedcode import query
[11:16:45][Step 5/5]   File "/usr/share/scancode-toolkit/src/licensedcode/query.py", line 32, in <module>
[11:16:45][Step 5/5]     import typecode
[11:16:45][Step 5/5]   File "/usr/share/scancode-toolkit/src/typecode/__init__.py", line 27, in <module>
[11:16:45][Step 5/5]     from typecode.contenttype import get_type
[11:16:45][Step 5/5]   File "/usr/share/scancode-toolkit/src/typecode/contenttype.py", line 47, in <module>
[11:16:45][Step 5/5]     from typecode import magic2
[11:16:45][Step 5/5]   File "/usr/share/scancode-toolkit/src/typecode/magic2.py", line 221, in <module>
[11:16:45][Step 5/5]     libmagic = load_lib()
[11:16:45][Step 5/5]   File "/usr/share/scancode-toolkit/src/typecode/magic2.py", line 214, in load_lib
[11:16:45][Step 5/5]     lib = ctypes.CDLL(magic_so)
[11:16:45][Step 5/5]   File "/usr/local/lib/python2.7/ctypes/__init__.py", line 365, in __init__
[11:16:45][Step 5/5]     self._handle = _dlopen(self._name, mode)
[11:16:45][Step 5/5] OSError: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /usr/share/scancode-toolkit/src/typecode/bin/linux-64/lib/libmagic.so)
[11:16:45][Step 5/5] Process exited with code 1
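
The OSError above says that the bundled libmagic.so requires symbols versioned GLIBC_2.14, which the host libc does not provide. As a rough sketch (Python 3, standard library only; `parse_version`, `glibc_satisfies` and `check_host` are hypothetical helper names, not ScanCode functions), the glibc seen by the interpreter can be compared with such a requirement like this:

```python
import platform
import re


def parse_version(text):
    """Turn '2.14' or 'GLIBC_2.14' into a tuple of ints such as (2, 14)."""
    match = re.search(r'(\d+(?:\.\d+)+)', text)
    if not match:
        raise ValueError('no version found in %r' % text)
    return tuple(int(part) for part in match.group(1).split('.'))


def glibc_satisfies(required, installed):
    """Return True if the installed glibc version is at least `required`."""
    return parse_version(installed) >= parse_version(required)


def check_host(required='GLIBC_2.14'):
    """Describe whether the glibc seen by this interpreter meets `required`."""
    lib, version = platform.libc_ver()
    if lib != 'glibc' or not version:
        return 'glibc version could not be determined on this platform'
    if glibc_satisfies(required, version):
        return 'glibc %s satisfies %s' % (version, required)
    return 'glibc %s is older than %s' % (version, required)


print(check_host())
```

Versions are compared as integer tuples rather than strings, since a plain string comparison would rank "2.9" above "2.14".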
pombredanne commented 6 years ago

@spbkelt let's craft a proper Dockerfile for this then! Do you have a base file already?

pombredanne commented 6 years ago

Your install script cannot work .... something like this snippet of a Dockerfile may work instead to build a single minimal layer:

# TODO: ensure that all required packages are installed first
RUN echo "Fetching scancode git clones..." && \
    git clone https://github.com/nexB/scancode-toolkit.git && \
    git clone https://github.com/nexB/scancode-thirdparty-src.git && \
    pushd scancode-thirdparty-src && \
    echo "Building new native scancode libraries..." && \
    ./build.sh && \
    popd && \
    pushd scancode-toolkit && \
    echo "Building new release archives..." && \
    etc/release/release.sh && \
    cp dist/scancode-toolkit-2.2.1.tar.bz2 /usr/tmp && \
    popd && \
    echo "Cleanup scancode git clones..." && \
    rm -rf scancode-thirdparty-src scancode-toolkit && \
    echo "Install scancode-toolkit..." && \
    mkdir -p /usr/share/scancode-toolkit && \
    tar -xf /usr/tmp/scancode-toolkit-2.2.1.tar.bz2 -C /usr/share/scancode-toolkit --strip-components=1 && \
    /usr/share/scancode-toolkit/scancode -h &&\
    rm -rf /usr/share/scancode-toolkit/samples &&\
    echo "Cleanup scancode-toolkit archive..." && \
    rm --force /usr/tmp/scancode-toolkit-2.2.1.tar.bz2 && \
    echo "scancode-toolkit build completed!"

See also https://github.com/fabric8-analytics/fabric8-analytics-worker-base/blob/master/Dockerfile that may be of some help

And @roscopecoltran has built some Dockerfiles on alpine too in https://github.com/roscopecoltran/sniperkit-services which may be of some help too.
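
For reference, the install step in the Dockerfile snippet above (mkdir -p followed by tar with --strip-components=1) could be approximated in Python 3 with the standard-library tarfile module. This is only a sketch under assumptions: `extract_stripped` is a hypothetical helper, not something ScanCode ships, and it does not rewrite hard-link targets inside the archive.

```python
import os
import tarfile
from pathlib import PurePosixPath


def extract_stripped(archive_path, target_dir, strip=1):
    """Extract an archive into target_dir, dropping `strip` leading path parts.

    Roughly mirrors: mkdir -p DIR && tar -xf ARCHIVE -C DIR --strip-components=N
    """
    os.makedirs(target_dir, exist_ok=True)
    # Use the safer 'data' extraction filter where this Python supports it.
    extra = {'filter': 'data'} if hasattr(tarfile, 'data_filter') else {}
    with tarfile.open(archive_path) as archive:  # compression is auto-detected
        members = []
        for member in archive.getmembers():
            parts = PurePosixPath(member.name).parts
            if len(parts) <= strip:
                # Entries that consist only of the stripped prefix are skipped.
                continue
            member.name = str(PurePosixPath(*parts[strip:]))
            members.append(member)
        archive.extractall(target_dir, members=members, **extra)
```

A call such as `extract_stripped('/usr/tmp/scancode-toolkit-2.2.1.tar.bz2', '/usr/share/scancode-toolkit')` would then place the archive contents directly under the target directory, without the top-level folder from the archive.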

spbkelt commented 6 years ago

Finally it works in a Docker container! Thanks for the help @pombredanne

pombredanne commented 6 years ago

@spbkelt Thanks.... Glad it finally works: feel free to close this ticket then

pombredanne commented 6 years ago

Since everything works, I am closing this now.