Closed spbkelt closed 6 years ago
Thanks for the report, and sorry for this issue! There is a similar issue at #463 and maybe #443. It comes from the shameful and terrible fact that the codebase contains pre-built binaries (here you are stumbling on libmagic), and these may not have been built on an old enough version of Linux. See also #469.
It may be solvable with some symlinks, per @akaihola in #443, but I am not sure.
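To check whether a pre-built library is too new for a host, one option is to compare the glibc version the library asks for with the one the system provides. The following is a minimal sketch, assuming the loader error has the usual `GLIBC_x.y' not found` form (like the GLIBC_2.14 message reported later in this thread). The function names are illustrative and are not part of ScanCode.

```python
import platform
import re


def required_glibc(error_message):
    """Return the GLIBC version named in a loader error as a tuple, or None."""
    match = re.search(r"GLIBC_(\d+)\.(\d+)", error_message)
    if not match:
        return None
    return int(match.group(1)), int(match.group(2))


def system_glibc():
    """Return the running glibc version as a (major, minor) tuple, or None."""
    name, version = platform.libc_ver()
    match = re.match(r"(\d+)\.(\d+)", version or "")
    if name != "glibc" or not match:
        return None
    return int(match.group(1)), int(match.group(2))


def is_too_new(error_message, have):
    """True if the library needs a newer glibc than the `have` tuple."""
    need = required_glibc(error_message)
    return need is not None and have is not None and need > have
```

On the affected machine, `is_too_new(message, system_glibc())` would be the call to make. The tuple comparison only looks at major and minor numbers, which is an assumption that is good enough for this kind of check.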
It starts working correctly if I run this:
sudo ln -s /usr/lib64/libbz2.so.1 /usr/lib64/libbz2.so.1.0
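A quick way to see whether a symlink like this changes anything is to ask the dynamic loader directly. This is a small sketch using the standard `ctypes` module; `try_load` is a hypothetical helper name, not something ScanCode provides.

```python
import ctypes


def try_load(path_or_soname):
    """Try to load a shared library; return (ok, error_text)."""
    try:
        ctypes.CDLL(path_or_soname)
    except OSError as exc:
        # The loader message usually names the missing file or symbol version.
        return False, str(exc)
    return True, None
```

Calling `try_load("libbz2.so.1.0")` before and after creating the link shows whether the loader can now resolve that name.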
Is there anything special about your CentOS 6.6 installation? E.g., if I spin up a VM or a container, is this a vanilla one?
The culprit shared objects and binaries in your case are:
Technically these were built exactly from https://github.com/nexB/scancode-thirdparty-src
So the process to solve all this mess is
Is there anything special about your CentOS 6.6 installation? E.g., if I spin up a VM or a container, is this a vanilla one?
We use centos:6.6 as the base image for the container with the build agent where we run ScanCode, so there is a lot of software installed there. Definitely not pure vanilla.
#469, which is the right and long-term solution
From #469 I gather that you still don't have an RPM/DEB, so that solution is not available to us.
Short term: rebuild the three libraries above on your OS and replace the ScanCode ones.
Could you please provide the actual build scripts? And how do I replace/bundle everything after that?
You wrote:
Could you please provide actual build scripts? And how to replace/bundle everything after that
I hate this! But I created this mess in the first place, so this is the least I could do for you. It should be a straight ./configure && make
... and then copying the bits to the right places.... But let me craft this for you :)
But let me craft this for you :)
Awesome. It's your mess so ...please :)
And why did you mention those libraries above? I have an error about libmagic.so
We need them all for things to work. Here is the thing:
git clone https://github.com/nexB/scancode-toolkit.git
git clone https://github.com/nexB/scancode-thirdparty-src.git
pushd scancode-thirdparty-src
./build.sh
popd
# check the diff
pushd scancode-toolkit
git status
# run the tests on 4 CPUs
./configure --clean && ./configure
bin/py.test -vvs -n4
It is OK if test_scan_can_handle_weird_file_names fails.
Note that these build scripts are also prep work needed for #469 :) .... so this is timely
I have these test failures:
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_copr3_correct
reason:
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_copr2_correct
reason:
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_germany_should_detect_trailing_city
reason:
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_in_html_comments
reason:
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_in_windows_binary_lib
reason:
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_java
reason:
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_copr5_correct
reason:
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_in_dll_exact
reason:
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_in_html_incorrect
reason:
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_json_phps_html
reason:
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_license__qpl_v1_0_perfect
reason:
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_license_text_doc
reason:
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_url_in_html
reason:
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_json_in_phps
reason:
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_piersol
reason:
XFAIL tests/cluecode/test_copyrights.py::TestCopyrightDetection::test_copyright_license_text_scilab
reason:
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_dbus_dbus_dbus_sha_c_trail_name
reason:
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_chromium_chrome_common_extensions_docs_examples_apps_hello_python_httplib2_init_py_extra_contributors
reason:
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_guava_guava_ipr_markup
reason:
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_kernel_headers_original_linux_cdrom_h_trail_email
reason:
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_lohit_fonts_notice_trail_url
reason:
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_bluetooth_bluez_audio_gateway_c_trail_name
reason:
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_srec_tools_grxmlcompile_grxmlcompile_cpp
reason:
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_quake_quake_src_qw_client_menu_c_trail_name
reason:
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_tcpdump_print_rx_c_trail_name
reason:
XFAIL tests/cluecode/test_finder.py::TestUrl::test_misc_valid_unicode_or_punycode_urls_that_should_pass
reason:
XFAIL tests/cluecode/test_holders.py::TestHolders::test_holder_multiline
reason:
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_tcpdump_ieee802_11_h_trail_email
reason:
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_tcpdump_print_snmp_c_trail_name_lead_name_trail_name_complex
reason:
XFAIL tests/extractcode/test_extract.py::TestExtract::test_extract_directory_of_windows_ar_archives
reason:
XFAIL tests/extractcode/test_extract.py::TestExtract::test_extract_with_kinds
reason:
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_freetype_src_base_ftbase_h_trail_name
reason:
XFAIL tests/extractcode/test_patch.py::TestPatchInfoFailing::test_patch_info_patch_patches_misc_webkit_opensource_patches_sync_xhr_patch
reason:
XFAIL tests/extractcode/test_patch.py::TestPatchInfoFailing::test_patch_info_patch_patches_problematic_opensso_patch
reason:
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_kernel_headers_original_linux_netfilter_xt_connmark_h_trail_url
reason:
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_libvpx_examples_includes_geshi_docs_geshi_doc_txt_trail_email_trail_url_misc
reason:
XFAIL tests/cluecode/test_copyrights_ics.py::TestCopyright::test_ics_lohit_fonts_lohit_bengali_ttf_copyright_trail_url
reason:
XFAIL tests/cluecode/test_finder.py::TestUrl::test_misc_invalid_urls_that_crash
reason:
XFAIL tests/cluecode/test_finder.py::TestUrl::test_misc_valid_urls_that_should_pass
reason:
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_d_zlib_and_gfdl_1_2_and_gpl_and_gpl_and_other_txt
reason:
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_eclipse_openj9_html_html
reason:
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_flt9_gif
reason:
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_gpl_2_0_and_lppl_1_3c_and_public_domain_1_copyright
reason:
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_gpl_2_0_and_lppl_1_3c_and_public_domain_label
reason:
XFAIL tests/typecode/test_contenttype.py::TestContentType::test_build_makefile_inc_is_not_povray
reason:
XFAIL tests/typecode/test_contenttype.py::TestContentType::test_code_groff
reason:
XFAIL tests/typecode/test_contenttype.py::TestContentType::test_filetype_file_on_unicode_file_name2
reason:
XFAIL tests/typecode/test_contenttype.py::TestContentType::test_text_rsync_file_is_not_octet_stream
reason:
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_aes_128_3_0_and_bsd_new_and_bsd_original_uc_and_bsd_simplified_and_other_txt
reason:
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_apache_1_1_and_apache_2_0_and_cpl_1_0_and_epl_1_0_and_other_txt
reason:
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_apache_2_0_and_apache_2_0_and_bsd_new_and_gpl_and_other_1_txt
reason:
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_apache_2_0_and_apache_2_0_and_bsd_new_and_gpl_and_other_2_txt
reason:
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_apache_2_0_and_apache_2_0_and_bsd_new_and_gpl_and_other_txt
reason:
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_boost_1_0_and_bsd_simplified_and_cddl_1_0_and_gpl_2_0_classpath_and_other_txt
reason:
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_bsd_new_and_bsd_new_and_bsd_new_and_bsd_new_and_other_1_txt
reason:
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_bsd_new_and_bsd_simplified_and_lgpl_and_lgpl_2_0_plus_and_other_txt
reason:
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_bsd_new_and_gpl_2_0_and_gpl_3_0_and_public_domain_and_other_txt
reason:
XFAIL tests/licensedcode/test_detection_datadriven.py::TestLicenseDataDriven::test_detection_bsd_original_and_bsd_simplified_and_mit_and_mit_and_other_txt
reason:
=================================== FAILURES ===================================
___________ TestPermissions.test_copyfile_does_not_keep_permissions ____________
[gw1] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
self = <test_fileutils.TestPermissions testMethod=test_copyfile_does_not_keep_permissions>
def test_copyfile_does_not_keep_permissions(self):
src_file = self.get_temp_file()
dest = self.get_temp_dir()
with open(src_file, 'wb') as f:
f.write('')
try:
make_non_readable(src_file)
if on_posix:
> assert not filetype.is_readable(src_file)
E AssertionError: assert not True
E + where True = <function is_readable at 0x7feb75141c80>('/tmp/scancode_root/tst/ UhH1AU/6_pAoB/td/tf.txt')
E + where <function is_readable at 0x7feb75141c80> = filetype.is_readable
tests/commoncode/test_fileutils.py:156: AssertionError
__________________ TestPermissions.test_chmod_read_write_file __________________
[gw0] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
self = <test_fileutils.TestPermissions testMethod=test_chmod_read_write_file>
def test_chmod_read_write_file(self):
test_dir = self.get_test_loc('fileutils/executable', copy=True)
test_file = join(test_dir, 'deep1', 'deep2', 'ctags')
try:
make_non_writable(test_file)
> assert not filetype.is_writable(test_file)
E AssertionError: assert not True
E + where True = <function is_writable at 0x390ccf8>('/tmp/scancode_root/tst/ dMZXPe/OX2wOp/executable/deep1/deep2/ctags')
E + where <function is_writable at 0x390ccf8> = filetype.is_writable
tests/commoncode/test_fileutils.py:122: AssertionError
____________ TestPermissions.test_copytree_copies_unreadable_files _____________
[gw0] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
self = <test_fileutils.TestPermissions testMethod=test_copytree_copies_unreadable_files>
def test_copytree_copies_unreadable_files(self):
src = self.get_test_loc('fileutils/exec', copy=True)
dst = self.get_temp_dir()
src_file1 = join(src, 'a.bat')
src_file2 = join(src, 'subtxt', 'a.txt')
try:
# make some unreadable source files
make_non_readable(src_file1)
if on_posix:
> assert not filetype.is_readable(src_file1)
E AssertionError: assert not True
E + where True = <function is_readable at 0x390cc80>('/tmp/scancode_root/tst/ dMZXPe/Z_L8Rs/exec/a.bat')
E + where <function is_readable at 0x390cc80> = filetype.is_readable
tests/commoncode/test_fileutils.py:204: AssertionError
_______________ TestSevenZip.test_extract_7z_with_relative_path ________________
[gw0] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
self = <test_archive.TestSevenZip testMethod=test_extract_7z_with_relative_path>
def test_extract_7z_with_relative_path(self):
test_file = self.get_test_loc('archive/7z/7zip_relative.7z')
test_dir = self.get_temp_dir()
result = archive.extract_7z(test_file, test_dir)
non_result = os.path.join(test_dir, '../a_parent_folder.txt')
assert not os.path.exists(non_result)
assert [] == result
extracted = self.collect_extracted_path(test_dir)
expected = [
'/dotdot/',
'/dotdot/2folder/',
'/dotdot/2folder/3folder/',
'/dotdot/2folder/3folder/relative_file',
'/dotdot/2folder/3folder/relative_file~',
'/dotdot/2folder/relative_file',
'/dotdot/relative_file'
]
> assert expected == extracted
E AssertionError: assert ['/dotdot/', ...ve_file', ...] == []
E Left contains more items, first extra item: '/dotdot/'
E Full diff:
E + []
E - [u'/dotdot/',
E - u'/dotdot/2folder/',
E - u'/dotdot/2folder/3folder/',
E - u'/dotdot/2folder/3folder/relative_file',
E - u'/dotdot/2folder/3folder/relative_file~',
E - u'/dotdot/2folder/relative_file',
E - u'/dotdot/relative_file']
tests/extractcode/test_archive.py:1682: AssertionError
TestExtractArchiveWithIllegalFilenamesWithLibarchiveOnLinux.test_extract_7zip_with_weird_filenames_with_libarchive
[gw0] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
self = <test_archive.TestExtractArchiveWithIllegalFilenamesWithLibarchiveOnLinux testMethod=test_extract_7zip_with_weird_filenames_with_libarchive>
def test_extract_7zip_with_weird_filenames_with_libarchive(self):
test_file = self.get_test_loc('archive/weird_names/weird_names.7z')
> self.check_extract(libarchive2.extract, test_file, expected_warnings=[], expected_suffix='libarch')
tests/extractcode/test_archive.py:2145:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/extractcode/test_archive.py:2109: in check_extract
warnings = test_function(test_file, test_dir)
src/extractcode/libarchive2.py:144: in extract
for entry in list_entries(abs_location):
src/extractcode/libarchive2.py:169: in list_entries
for entry in archive:
src/extractcode/libarchive2.py:243: in iter
r = next_entry(self.archive_struct, entry_struct)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
rc = -30, archive_func = <_FuncPtr object at 0x41d9600>
args = (101088368, 112322176), null = False
def errcheck(rc, archive_func, args, null=False):
"""
ctypes error check handler for functions returning int, or null if null is True.
"""
if null:
if rc is None:
archive_struct = args and len(args) > 1 and args[0] or None
raise ArchiveError(rc, archive_struct, archive_func)
else:
return rc
if rc >= ARCHIVE_OK:
return rc
archive_struct = args[0]
if rc == ARCHIVE_RETRY:
raise ArchiveErrorRetryable(rc, archive_struct, archive_func)
if rc == ARCHIVE_WARN:
raise ArchiveWarning(rc, archive_struct, archive_func)
# anything else is a serious error, in general not recoverable.
> raise ArchiveError(rc, archive_struct, archive_func)
E ArchiveError: LZMA codec is unsupported
src/extractcode/libarchive2.py:453: ArchiveError
TestExtractArchiveWithIllegalFilenamesWithLibarchiveOnLinuxWarnings.test_extract_7zip_with_weird_filenames_with_libarchive
[gw0] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
self = <test_archive.TestExtractArchiveWithIllegalFilenamesWithLibarchiveOnLinuxWarnings testMethod=test_extract_7zip_with_weird_filenames_with_libarchive>
def test_extract_7zip_with_weird_filenames_with_libarchive(self):
test_file = self.get_test_loc('archive/weird_names/weird_names.7z')
> self.check_extract(libarchive2.extract, test_file, expected_warnings=[], expected_suffix='libarch')
tests/extractcode/test_archive.py:2145:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/extractcode/test_archive.py:2109: in check_extract
warnings = test_function(test_file, test_dir)
src/extractcode/libarchive2.py:144: in extract
for entry in list_entries(abs_location):
src/extractcode/libarchive2.py:169: in list_entries
for entry in archive:
src/extractcode/libarchive2.py:243: in iter
r = next_entry(self.archive_struct, entry_struct)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
rc = -30, archive_func = <_FuncPtr object at 0x41d9600>
args = (93179216, 97408624), null = False
def errcheck(rc, archive_func, args, null=False):
"""
ctypes error check handler for functions returning int, or null if null is True.
"""
if null:
if rc is None:
archive_struct = args and len(args) > 1 and args[0] or None
raise ArchiveError(rc, archive_struct, archive_func)
else:
return rc
if rc >= ARCHIVE_OK:
return rc
archive_struct = args[0]
if rc == ARCHIVE_RETRY:
raise ArchiveErrorRetryable(rc, archive_struct, archive_func)
if rc == ARCHIVE_WARN:
raise ArchiveWarning(rc, archive_struct, archive_func)
# anything else is a serious error, in general not recoverable.
> raise ArchiveError(rc, archive_struct, archive_func)
E ArchiveError: LZMA codec is unsupported
src/extractcode/libarchive2.py:453: ArchiveError
__________________ TypeTest.test_is_readable_is_writeable_dir __________________
[gw2] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
self = <test_filetype.TypeTest testMethod=test_is_readable_is_writeable_dir>
def test_is_readable_is_writeable_dir(self):
base_dir = self.get_test_loc('filetype/readwrite', copy=True)
test_dir = os.path.join(base_dir, 'sub')
try:
assert filetype.is_readable(test_dir)
assert filetype.is_writable(test_dir)
make_non_readable(test_dir)
if on_posix:
> assert not filetype.is_readable(test_dir)
E AssertionError: assert not True
E + where True = <function is_readable at 0x7f1ba0244c80>('/tmp/scancode_root/tst/ wZKyXy/nth2d7/readwrite/sub')
E + where <function is_readable at 0x7f1ba0244c80> = filetype.is_readable
tests/commoncode/test_filetype.py:111: AssertionError
_________ TestPermissions.test_chmod_read_write_non_recursively_on_dir _________
[gw2] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
self = <test_fileutils.TestPermissions testMethod=test_chmod_read_write_non_recursively_on_dir>
def test_chmod_read_write_non_recursively_on_dir(self):
test_dir = self.get_test_loc('fileutils/executable', copy=True)
test_file = join(test_dir, 'deep1', 'deep2', 'ctags')
test_dir = join(test_dir, 'deep1', 'deep2')
parent = join(test_dir, 'deep1')
try:
# setup
make_non_writable(test_file)
> assert not filetype.is_writable(test_file)
E AssertionError: assert not True
E + where True = <function is_writable at 0x7f1ba0244cf8>('/tmp/scancode_root/tst/ wZKyXy/lXWRvj/executable/deep1/deep2/ctags')
E + where <function is_writable at 0x7f1ba0244cf8> = filetype.is_writable
tests/commoncode/test_fileutils.py:95: AssertionError
_____ TestPermissions.test_copytree_does_not_keep_non_writable_permissions _____
[gw2] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
self = <test_fileutils.TestPermissions testMethod=test_copytree_does_not_keep_non_writable_permissions>
def test_copytree_does_not_keep_non_writable_permissions(self):
src = self.get_test_loc('fileutils/exec', copy=True)
dst = self.get_temp_dir()
try:
src_file = join(src, 'subtxt/a.txt')
make_non_writable(src_file)
> assert not filetype.is_writable(src_file)
E AssertionError: assert not True
E + where True = <function is_writable at 0x7f1ba0244cf8>('/tmp/scancode_root/tst/ wZKyXy/OnUCnq/exec/subtxt/a.txt')
E + where <function is_writable at 0x7f1ba0244cf8> = filetype.is_writable
tests/commoncode/test_fileutils.py:172: AssertionError
_________________ TypeTest.test_is_readable_is_writeable_file __________________
[gw3] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
self = <test_filetype.TypeTest testMethod=test_is_readable_is_writeable_file>
def test_is_readable_is_writeable_file(self):
base_dir = self.get_test_loc('filetype/readwrite', copy=True)
test_file = os.path.join(os.path.join(base_dir, 'sub'), 'file')
try:
assert filetype.is_readable(test_file)
assert filetype.is_writable(test_file)
make_non_readable(test_file)
if on_posix:
> assert not filetype.is_readable(test_file)
E AssertionError: assert not True
E + where True = <function is_readable at 0x2e9ec80>('/tmp/scancode_root/tst/ zmhaL0/NQuSjT/readwrite/sub/file')
E + where <function is_readable at 0x2e9ec80> = filetype.is_readable
tests/commoncode/test_filetype.py:94: AssertionError
___________ TestPermissions.test_chmod_read_write_recursively_on_dir ___________
[gw3] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
self = <test_fileutils.TestPermissions testMethod=test_chmod_read_write_recursively_on_dir>
def test_chmod_read_write_recursively_on_dir(self):
test_dir = self.get_test_loc('fileutils/executable', copy=True)
test_file = join(test_dir, 'deep1', 'deep2', 'ctags')
test_dir2 = join(test_dir, 'deep1', 'deep2')
parent = join(test_dir, 'deep1')
try:
make_non_writable(test_file)
> assert not filetype.is_writable(test_file)
E AssertionError: assert not True
E + where True = <function is_writable at 0x2e9ecf8>('/tmp/scancode_root/tst/ zmhaL0/3jv4ZH/executable/deep1/deep2/ctags')
E + where <function is_writable at 0x2e9ecf8> = filetype.is_writable
tests/commoncode/test_fileutils.py:65: AssertionError
____________________ test_scan_can_handle_weird_file_names _____________________
[gw0] linux2 -- Python 2.7.6 /opt/Python-2.7.6/scancode-toolkit/bin/python2.7
@skipIf(on_windows, 'This test cannot run on windows as these are not legal file names.')
def test_scan_can_handle_weird_file_names():
test_dir = test_env.extract_test_tar('weird_file_name/weird_file_name.tar.gz')
result_file = test_env.get_temp_file('json')
result = run_scan_click(['-c', '-i', '--strip-root', test_dir, result_file])
assert result.exit_code == 0
assert "KeyError: 'sha1'" not in result.output
assert 'Scanning done' in result.output
# Some info vary on each OS
# See https://github.com/nexB/scancode-toolkit/issues/438 for details
if on_linux:
expected = 'weird_file_name/expected-linux.json'
elif on_mac:
expected = 'weird_file_name/expected-mac.json'
else:
raise Exception('Not a supported OS?')
> check_json_scan(test_env.get_test_loc(expected), result_file, regen=False)
tests/scancode/test_cli.py:581:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
expected_file = '/opt/Python-2.7.6/scancode-toolkit/tests/scancode/data/weird_file_name/expected-linux.json'
result_file = '/tmp/scancode_root/tst/ dMZXPe/Ftcjfc/td/tf.json', regen = False
strip_dates = False
def check_json_scan(expected_file, result_file, regen=False, strip_dates=False):
"""
Check the scan result_file JSON results against the expected_file expected JSON
results. Removes references to test_dir for the comparison. If regen is True the
expected_file WILL BE overwritten with the results. This is convenient for
updating tests expectations. But use with caution.
"""
result = _load_json_result(result_file)
if strip_dates:
remove_dates(result)
if regen:
with open(expected_file, 'wb') as reg:
json.dump(result, reg, indent=2, separators=(',', ': '))
expected = _load_json_result(expected_file)
if strip_dates:
remove_dates(expected)
# NOTE we redump the JSON as a string for a more efficient comparison of
# failures
expected = json.dumps(expected, indent=2, sort_keys=True, separators=(',', ': '))
result = json.dumps(result, indent=2, sort_keys=True, separators=(',', ': '))
> assert expected == result
E assert '{\n "files":...": true\n }\n}' == '{\n "files": ...": true\n }\n}'
E {
E "files": [
E {
E "base_name": "some 'file",
E "copyrights": [],
E "date": "2016-12-21",
E "extension": "",
E "file_type": "POSIX shell script, ASCII text executable",
E "files_count": null,
E "is_archive": false,
E "is_binary": false,
E "is_media": false,
E "is_script": true,
E "is_source": true,
E "is_text": true,
E "md5": "62c4cdf80d860c09f215ffff0a9ed020",
E "mime_type": "text/x-shellscript",
E "name": "some 'file",
E "path": "some 'file",
E "programming_language": "Bash",
E "scan_errors": [],
E "sha1": "715037088f2582f3fbb7e9492f819987f713a332",
E "size": 20,
E "type": "file"
E },
E {
E "base_name": "some \\file",
E "copyrights": [],
E "date": "2016-12-21",
E "extension": "",
E "file_type": "POSIX shell script, ASCII text executable",
E "files_count": null,
E "is_archive": false,
E "is_binary": false,
E "is_media": false,
E "is_script": true,
E "is_source": true,
E "is_text": true,
E "md5": "e99c06d03836700154f01778ac782d50",
E "mime_type": "text/x-shellscript",
E "name": "some \\file",
E "path": "some /file",
E "programming_language": "Bash",
E "scan_errors": [],
E "sha1": "73e029b07257966106d79d35271bf400e3543cea",
E "size": 21,
E "type": "file"
E },
E {
E "base_name": "some file",
E "copyrights": [],
E "date": "2016-12-21",
E "extension": "",
E - "file_type": "Node.js script, ASCII text executable",
E ? ^ ---
E + "file_type": "a /usr/bin/env node script, ASCII text executable",
E ? ^^^^^^^^^^^^^^^^
E "files_count": null,
E "is_archive": false,
E "is_binary": false,
E "is_media": false,
E "is_script": true,
E "is_source": true,
E "is_text": true,
E "md5": "41ac81497162f2ff48a0442847238ad7",
E - "mime_type": "application/javascript",
E + "mime_type": "text/plain",
E "name": "some file",
E "path": "some file",
E "programming_language": null,
E "scan_errors": [],
E "sha1": "5fbba80b758b93a311369979d8a68f22c4817d37",
E "size": 38,
E "type": "file"
E },
E {
E "base_name": "some\"file",
E "copyrights": [],
E "date": "2016-12-21",
E "extension": "",
E - "file_type": "Node.js script, ASCII text executable",
E ? ^ ---
E + "file_type": "a /usr/bin/env node script, ASCII text executable",
E ? ^^^^^^^^^^^^^^^^
E "files_count": null,
E "is_archive": false,
E "is_binary": false,
E "is_media": false,
E "is_script": true,
E "is_source": true,
E "is_text": true,
E "md5": "9153a386e70bd1713fef91121fb9cbbf",
E - "mime_type": "application/javascript",
E + "mime_type": "text/plain",
E "name": "some\"file",
E "path": "some\"file",
E "programming_language": null,
E "scan_errors": [],
E "sha1": "b2016984d073f405f9788fbf6ae270b452ab73b0",
E "size": 39,
E "type": "file"
E },
E {
E "base_name": "some\\\"file",
E "copyrights": [],
E "date": "2016-12-21",
E "extension": "",
E "file_type": "POSIX shell script, ASCII text executable",
E "files_count": null,
E "is_archive": false,
E "is_binary": false,
E "is_media": false,
E "is_script": true,
E "is_source": true,
E "is_text": true,
E "md5": "e99c06d03836700154f01778ac782d50",
E "mime_type": "text/x-shellscript",
E "name": "some\\\"file",
E "path": "some/\"file",
E "programming_language": "Bash",
E "scan_errors": [],
E "sha1": "73e029b07257966106d79d35271bf400e3543cea",
E "size": 21,
E "type": "file"
E }
E ],
E "files_count": 5,
E "scancode_notice": "Generated with ScanCode and provided on an \"AS IS\" BASIS, WITHOUT WARRANTIES\nOR CONDITIONS OF ANY KIND, either express or implied. No content created from\nScanCode should be considered or used as legal advice. Consult an Attorney\nfor any legal advice.\nScanCode is a free software code scanning tool from nexB Inc. and others.\nVisit https://github.com/nexB/scancode-toolkit/ for support and download.",
E "scancode_options": {
E "--copyright": true,
E "--format": "json",
E "--info": true,
E "--license-score": 0,
E "--strip-root": true
E }
E }
src/scancode/cli_test_utils.py:67: AssertionError
====== 12 failed, 12897 passed, 69 skipped, 58 xfailed in 2539.42 seconds ======
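To compare runs before and after a rebuild, the summary line can be reduced to plain counts. This is only a sketch that assumes the summary format shown above; `parse_summary` is an illustrative name and not a pytest API.

```python
import re

# Outcome words that can appear in a pytest summary line.
OUTCOMES = r"failed|passed|skipped|xfailed|xpassed|errors?|deselected"


def parse_summary(line):
    """Turn a pytest summary line into an {outcome: count} dict."""
    pairs = re.findall(r"\b(\d+) (" + OUTCOMES + r")\b", line)
    return {name: int(count) for count, name in pairs}
```

The duration ("2539.42 seconds") is deliberately ignored, since "seconds" is not in the list of outcome words.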
Sorry for the dumb question: is it possible to have a minimal setup? And yes, I know about the package, aka the long-term solution, which is not available anyway. I am creating a Docker image, so I want to keep it as tiny as possible. Otherwise I need some cleanup procedure afterwards, for example removing all the Python test packages at least:
Successfully installed apipkg-1.4 bumpversion-0.5.4.dev0 codecov-2.0.9 coverage-4.4.1 execnet-1.4.1 py-1.4.33 pytest-3.1.0 pytest-cov-2.5.1 pytest-xdist-1.16.0 xmltodict-0.11.0
What can be removed from the list?
@spbkelt these test packages are only present in a checkout, not in the release archives. For a trimmed installation from a checkout without them, you could also run ./configure etc/conf .... This will NOT install the development libraries you listed above.
So run ./configure --clean
and then ./configure etc/conf
Of the test failures, only test_extract_7z_with_relative_path is a tad puzzling ... but it comes from the LZMA dev library not being installed when you built libarchive.
You can mostly safely ignore the others.
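Before rebuilding libarchive, it may be worth checking that the LZMA development header is actually present. A minimal sketch, assuming the header is `lzma.h` (the public header of liblzma) and that it lives in one of the usual include directories; the helper name and the default directory list are assumptions, not something the build scripts use.

```python
import os


def find_header(name, include_dirs=("/usr/include", "/usr/local/include")):
    """Return the first existing path to header `name`, or None."""
    for directory in include_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

If `find_header("lzma.h")` returns None, installing the distribution's LZMA/xz development package before running the build is probably the missing step.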
Please review my install script
git clone https://github.com/nexB/scancode-toolkit.git
git clone https://github.com/nexB/scancode-thirdparty-src.git
pushd scancode-thirdparty-src
./build.sh
popd
# check the diff
pushd scancode-toolkit
git status
./configure --clean && ./configure etc/conf
#install scancode-toolkit
mkdir --parents /usr/share/scancode-toolkit && \
wget --quiet --directory-prefix=/usr/tmp/ https://github.com/nexB/scancode-toolkit/releases/download/v2.2.1/scancode-toolkit-2.2.1.tar.bz2 && \
tar --extract --bzip2 --file=/usr/tmp/scancode-toolkit-2.2.1.tar.bz2 --directory=/usr/share/scancode-toolkit --strip-components=1 && \
rm --force /usr/tmp/scancode-toolkit-2.2.1.tar.bz2
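Since the script above both builds from a checkout and unpacks a release archive, it can help to list which native libraries actually end up under the install directory, and then check whether they are the rebuilt ones. This is a rough sketch; `find_shared_objects` is a hypothetical helper, and matching on `.so` in the file name is a simplifying assumption.

```python
import os


def find_shared_objects(root):
    """Return sorted paths of files under `root` that look like shared objects."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            # Match both "libfoo.so" and versioned names like "libfoo.so.1.0".
            if filename.endswith(".so") or ".so." in filename:
                found.append(os.path.join(dirpath, filename))
    return sorted(found)
```

Running it on `/usr/share/scancode-toolkit` and on the checkout, then comparing sizes or checksums of the matching files, would show whether the installed tree still carries the pre-built binaries.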
@spbkelt what's your goal here? Build a container with a minimal ScanCode in it?
The issue persists. I installed into the Docker image using the script above and it didn't help.
/usr/share/scancode-toolkit/scancode --help
/usr/share/scancode-toolkit/scancode --format html-app --diag --timeout 3600 -n 4 --ignore "*.war" --ignore "*.zip" --ignore "*.jar" . oss-report.html
[11:16:43][Step 5/5] * Building license index...
[11:16:43][Step 5/5] File "/usr/share/scancode-toolkit/etc/conf/base.py", line 59, in <module>
[11:16:43][Step 5/5] build_license_cache()
[11:16:43][Step 5/5] File "/usr/share/scancode-toolkit/etc/conf/base.py", line 56, in build_license_cache
[11:16:43][Step 5/5] cache.reindex()
[11:16:43][Step 5/5] File "/usr/share/scancode-toolkit/src/licensedcode/cache.py", line 108, in get_or_build_index_through_cache
[11:16:43][Step 5/5] from licensedcode.index import LicenseIndex
[11:16:43][Step 5/5] File "/usr/share/scancode-toolkit/src/licensedcode/index.py", line 47, in <module>
[11:16:43][Step 5/5] from licensedcode import match
[11:16:43][Step 5/5] File "/usr/share/scancode-toolkit/src/licensedcode/match.py", line 36, in <module>
[11:16:43][Step 5/5] from licensedcode import query
[11:16:43][Step 5/5] File "/usr/share/scancode-toolkit/src/licensedcode/query.py", line 32, in <module>
[11:16:43][Step 5/5] import typecode
[11:16:43][Step 5/5] File "/usr/share/scancode-toolkit/src/typecode/__init__.py", line 27, in <module>
[11:16:43][Step 5/5] from typecode.contenttype import get_type
[11:16:43][Step 5/5] File "/usr/share/scancode-toolkit/src/typecode/contenttype.py", line 47, in <module>
[11:16:43][Step 5/5] from typecode import magic2
[11:16:43][Step 5/5] File "/usr/share/scancode-toolkit/src/typecode/magic2.py", line 221, in <module>
[11:16:43][Step 5/5] libmagic = load_lib()
[11:16:43][Step 5/5] File "/usr/share/scancode-toolkit/src/typecode/magic2.py", line 214, in load_lib
[11:16:43][Step 5/5] lib = ctypes.CDLL(magic_so)
[11:16:43][Step 5/5] File "/usr/local/lib/python2.7/ctypes/__init__.py", line 365, in __init__
[11:16:43][Step 5/5] self._handle = _dlopen(self._name, mode)
[11:16:43][Step 5/5] OSError: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /usr/share/scancode-toolkit/src/typecode/bin/linux-64/lib/libmagic.so)
[11:16:43][Step 5/5] * Activating ...
[11:16:43][Step 5/5]
[11:16:43][Step 5/5] Failed to execute command:
[11:16:43][Step 5/5] /usr/share/scancode-toolkit/bin/python "/usr/share/scancode-toolkit/etc/conf/base.py". Aborting...
[11:16:44][Step 5/5] Usage: scancode [OPTIONS] <input> <output_file>
[11:16:44][Step 5/5]
[11:16:44][Step 5/5] scan the <input> file or directory for origin clues and license and save
[11:16:44][Step 5/5] results to the <output_file>.
[11:16:44][Step 5/5]
[11:16:44][Step 5/5] The scan results are printed to stdout if <output_file> is not provided.
[11:16:44][Step 5/5] Error and progress is printed to stderr.
[11:16:44][Step 5/5]
[11:16:44][Step 5/5] Options:
[11:16:44][Step 5/5]
[11:16:44][Step 5/5] scans:
[11:16:44][Step 5/5] -c, --copyright Scan <input> for copyrights. [default]
[11:16:44][Step 5/5] -l, --license Scan <input> for licenses. [default]
[11:16:44][Step 5/5] -p, --package Scan <input> for packages. [default]
[11:16:44][Step 5/5] -e, --email Scan <input> for emails.
[11:16:44][Step 5/5] -u, --url Scan <input> for urls.
[11:16:44][Step 5/5] -i, --info Include information such as size, type, etc.
[11:16:44][Step 5/5] --license-score INTEGER Do not return license matches with scores lower
[11:16:44][Step 5/5] than this score. A number between 0 and 100.
[11:16:44][Step 5/5] [default: 0]
[11:16:44][Step 5/5] --license-text Include the detected licenses matched text. Has
[11:16:44][Step 5/5] no effect unless --license is requested.
[11:16:44][Step 5/5] --license-url-template TEXT Set the template URL used for the license
[11:16:44][Step 5/5] reference URLs. In a template URL, curly braces
[11:16:44][Step 5/5] ({}) are replaced by the license key.
[11:16:44][Step 5/5] [default: https://enterprise.dejacode.com/urn/u
[11:16:44][Step 5/5] rn:dje:license:{}]
[11:16:44][Step 5/5]
[11:16:44][Step 5/5] output:
[11:16:44][Step 5/5] --strip-root Strip the root directory segment of all paths. The
[11:16:44][Step 5/5] default is to always include the last directory
[11:16:44][Step 5/5] segment of the scanned path such that all paths have
[11:16:44][Step 5/5] a common root directory. This cannot be combined with
[11:16:44][Step 5/5] `--full-root` option.
[11:16:44][Step 5/5] --full-root Report full, absolute paths. The default is to always
[11:16:44][Step 5/5] include the last directory segment of the scanned
[11:16:44][Step 5/5] path such that all paths have a common root
[11:16:44][Step 5/5] directory. This cannot be combined with the `--strip-
[11:16:44][Step 5/5] root` option.
[11:16:44][Step 5/5] -f, --format <format> Set <output_file> format to one of: csv, html, html-
[11:16:44][Step 5/5] app, json, json-pp, jsonlines, spdx-rdf, spdx-tv or
[11:16:44][Step 5/5] use <format> as the path to a custom template file
[11:16:44][Step 5/5] [default: json]
[11:16:44][Step 5/5] --verbose Print verbose file-by-file progress messages.
[11:16:44][Step 5/5] --quiet Do not print summary or progress messages.
[11:16:44][Step 5/5]
[11:16:44][Step 5/5] pre-scan:
[11:16:44][Step 5/5] --ignore <pattern> Ignore files matching <pattern>.
[11:16:44][Step 5/5]
[11:16:44][Step 5/5] post-scan:
[11:16:44][Step 5/5] --mark-source Set the "is_source" flag to true for directories that
[11:16:44][Step 5/5] contain over 90% of source files as direct children. Has no
[11:16:44][Step 5/5] effect unless the --info scan is requested.
[11:16:44][Step 5/5] --only-findings Only return files or directories with findings for the
[11:16:44][Step 5/5] requested scans. Files and directories without findings are
[11:16:44][Step 5/5] omitted (not considering basic file information as
[11:16:44][Step 5/5] findings).
[11:16:44][Step 5/5]
[11:16:44][Step 5/5] misc:
[11:16:44][Step 5/5] --reindex-licenses Force a check and possible reindexing of the cached
[11:16:44][Step 5/5] license index.
[11:16:44][Step 5/5]
[11:16:44][Step 5/5] core:
[11:16:44][Step 5/5] -h, --help Show this message and exit.
[11:16:44][Step 5/5] -n, --processes INTEGER Scan <input> using n parallel processes. [default:
[11:16:44][Step 5/5] 1]
[11:16:44][Step 5/5] --examples Show command examples and exit.
[11:16:44][Step 5/5] --about Show information about ScanCode and licensing and
[11:16:44][Step 5/5] exit.
[11:16:44][Step 5/5] --version Show the version and exit.
[11:16:44][Step 5/5] --diag Include additional diagnostic information such as
[11:16:44][Step 5/5] error messages or result details.
[11:16:44][Step 5/5] --timeout FLOAT Stop scanning a file if scanning takes longer than
[11:16:44][Step 5/5] a timeout in seconds. [default: 120]
[11:16:44][Step 5/5]
[11:16:44][Step 5/5] Examples (use --examples for more):
[11:16:44][Step 5/5]
[11:16:44][Step 5/5] Scan the 'samples' directory for licenses and copyrights.
[11:16:44][Step 5/5] Save scan results to a JSON file:
[11:16:44][Step 5/5]
[11:16:44][Step 5/5] scancode --format json samples scancode_result.json
[11:16:44][Step 5/5]
[11:16:44][Step 5/5] Scan the 'samples' directory for licenses and copyrights. Save scan results to
[11:16:44][Step 5/5] an HTML app file for interactive web browser results navigation. Additional app
[11:16:44][Step 5/5] files are saved to the 'myscan_files' directory:
[11:16:44][Step 5/5]
[11:16:44][Step 5/5] scancode --format html-app samples myscan.html
[11:16:44][Step 5/5]
[11:16:44][Step 5/5] Note: when you run scancode, a progress bar is displayed with a counter of
[11:16:44][Step 5/5] the number of files processed. Use --verbose to display file-by-file
[11:16:44][Step 5/5] progress.
[11:16:45][Step 5/5] Scanning files for: licenses, copyrights, packages with 4 process(es)...
[11:16:45][Step 5/5] Building license detection index...Traceback (most recent call last):
[11:16:45][Step 5/5] File "/usr/share/scancode-toolkit/bin/scancode", line 11, in <module>
[11:16:45][Step 5/5] load_entry_point('scancode-toolkit', 'console_scripts', 'scancode')()
[11:16:45][Step 5/5] File "/usr/share/scancode-toolkit/lib/python2.7/site-packages/click/core.py", line 722, in __call__
[11:16:45][Step 5/5] return self.main(*args, **kwargs)
[11:16:45][Step 5/5] File "/usr/share/scancode-toolkit/src/scancode/utils.py", line 74, in main
[11:16:45][Step 5/5] standalone_mode=standalone_mode, **extra)
[11:16:45][Step 5/5] File "/usr/share/scancode-toolkit/lib/python2.7/site-packages/click/core.py", line 697, in main
[11:16:45][Step 5/5] rv = self.invoke(ctx)
[11:16:45][Step 5/5] File "/usr/share/scancode-toolkit/lib/python2.7/site-packages/click/core.py", line 895, in invoke
[11:16:45][Step 5/5] return ctx.invoke(self.callback, **ctx.params)
[11:16:45][Step 5/5] File "/usr/share/scancode-toolkit/lib/python2.7/site-packages/click/core.py", line 535, in invoke
[11:16:45][Step 5/5] return callback(*args, **kwargs)
[11:16:45][Step 5/5] File "/usr/share/scancode-toolkit/lib/python2.7/site-packages/click/decorators.py", line 17, in new_func
[11:16:45][Step 5/5] return f(get_current_context(), *args, **kwargs)
[11:16:45][Step 5/5] File "/usr/share/scancode-toolkit/src/scancode/cli.py", line 490, in scancode
[11:16:45][Step 5/5] pre_scan_plugins=pre_scan_plugins)
[11:16:45][Step 5/5] File "/usr/share/scancode-toolkit/src/scancode/cli.py", line 572, in scan
[11:16:45][Step 5/5] get_index(False)
[11:16:45][Step 5/5] File "/usr/share/scancode-toolkit/src/licensedcode/cache.py", line 188, in get_index
[11:16:45][Step 5/5] _LICENSES_INDEX = get_or_build_index_through_cache()
[11:16:45][Step 5/5] File "/usr/share/scancode-toolkit/src/licensedcode/cache.py", line 108, in get_or_build_index_through_cache
[11:16:45][Step 5/5] from licensedcode.index import LicenseIndex
[11:16:45][Step 5/5] File "/usr/share/scancode-toolkit/src/licensedcode/index.py", line 47, in <module>
[11:16:45][Step 5/5] from licensedcode import match
[11:16:45][Step 5/5] File "/usr/share/scancode-toolkit/src/licensedcode/match.py", line 36, in <module>
[11:16:45][Step 5/5] from licensedcode import query
[11:16:45][Step 5/5] File "/usr/share/scancode-toolkit/src/licensedcode/query.py", line 32, in <module>
[11:16:45][Step 5/5] import typecode
[11:16:45][Step 5/5] File "/usr/share/scancode-toolkit/src/typecode/__init__.py", line 27, in <module>
[11:16:45][Step 5/5] from typecode.contenttype import get_type
[11:16:45][Step 5/5] File "/usr/share/scancode-toolkit/src/typecode/contenttype.py", line 47, in <module>
[11:16:45][Step 5/5] from typecode import magic2
[11:16:45][Step 5/5] File "/usr/share/scancode-toolkit/src/typecode/magic2.py", line 221, in <module>
[11:16:45][Step 5/5] libmagic = load_lib()
[11:16:45][Step 5/5] File "/usr/share/scancode-toolkit/src/typecode/magic2.py", line 214, in load_lib
[11:16:45][Step 5/5] lib = ctypes.CDLL(magic_so)
[11:16:45][Step 5/5] File "/usr/local/lib/python2.7/ctypes/__init__.py", line 365, in __init__
[11:16:45][Step 5/5] self._handle = _dlopen(self._name, mode)
[11:16:45][Step 5/5] OSError: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /usr/share/scancode-toolkit/src/typecode/bin/linux-64/lib/libmagic.so)
[11:16:45][Step 5/5] Process exited with code 1
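For anyone hitting the same `OSError`, here is a minimal sketch of how to confirm the mismatch from Python: it pulls the glibc version out of the loader message and compares it with the version the host provides. The helper names (`required_glibc`, `parse_version`, `is_too_new`) are made up for illustration and are not part of ScanCode. CentOS 6 ships glibc 2.12 as far as I know, which would explain why a library needing 2.14 fails to load.

```python
import platform
import re


def required_glibc(error_message):
    """Return the (major, minor) glibc version named in a loader error, or None."""
    found = re.search(r"GLIBC_(\d+)\.(\d+)", error_message)
    if not found:
        return None
    return int(found.group(1)), int(found.group(2))


def parse_version(text):
    """Turn a version string such as '2.12' into a comparable tuple (2, 12)."""
    return tuple(int(part) for part in text.split(".")[:2])


def is_too_new(error_message, system_version):
    """True when the library needs a newer glibc than system_version provides."""
    needed = required_glibc(error_message)
    return needed is not None and needed > parse_version(system_version)


if __name__ == "__main__":
    message = "/lib64/libc.so.6: version `GLIBC_2.14' not found"
    # platform.libc_ver() returns a (name, version) pair; the version can be
    # an empty string on systems that do not use glibc, so guard against that.
    name, version = platform.libc_ver()
    if version:
        print(name, version, "too old for this library:", is_too_new(message, version))
```

Comparing tuples rather than strings avoids the trap where "2.9" would sort after "2.14" as text.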
@spbkelt let's craft a proper Dockerfile for this then! Do you have a base file already?
Your install script cannot work... Something like this Dockerfile snippet may work instead to build a single minimal layer:
# TODO: ensure that all required packages are installed first
RUN echo "Fetching scancode git clones..." && \
git clone https://github.com/nexB/scancode-toolkit.git && \
git clone https://github.com/nexB/scancode-thirdparty-src.git && \
pushd scancode-thirdparty-src && \
echo "Building new native scancode libraries..." && \
./build.sh && \
popd && \
pushd scancode-toolkit && \
echo "Building new release archives..." && \
etc/release/release.sh && \
cp dist/scancode-toolkit-2.2.1.tar.bz2 /usr/tmp && \
popd && \
echo "Cleanup scancode git clones..." && \
rm -rf scancode-thirdparty-src scancode-toolkit && \
echo "Install scancode-toolkit..." && \
mkdir -p /usr/share/scancode-toolkit && \
tar -xf /usr/tmp/scancode-toolkit-2.2.1.tar.bz2 -C /usr/share/scancode-toolkit --strip-components=1 && \
/usr/share/scancode-toolkit/scancode -h && \
rm -rf /usr/share/scancode-toolkit/samples && \
echo "Cleanup scancode-toolkit archive..." && \
rm --force /usr/tmp/scancode-toolkit-2.2.1.tar.bz2 && \
echo "scancode-toolkit build completed!"
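Once the image is built, it may be worth checking that the rebuilt library really loads before running a full scan. The sketch below repeats the `ctypes.CDLL` call that fails in the traceback above and reports the loader error instead of crashing. The `can_load` helper is hypothetical, and the path is the one from the log, so adjust it if your install location differs.

```python
import ctypes


def can_load(shared_object_path):
    """Return (True, None) if the library loads, else (False, error text)."""
    try:
        ctypes.CDLL(shared_object_path)
    except OSError as error:
        # A glibc mismatch or a missing file both surface as OSError here.
        return False, str(error)
    return True, None


if __name__ == "__main__":
    # Path taken from the traceback in this thread; it is an assumption that
    # your install uses the same layout.
    path = "/usr/share/scancode-toolkit/src/typecode/bin/linux-64/lib/libmagic.so"
    ok, error = can_load(path)
    print("loads fine" if ok else "cannot load: " + error)
```

If this still prints a `GLIBC_2.14` message, the pre-built binary was probably not replaced by the locally built one.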
See also https://github.com/fabric8-analytics/fabric8-analytics-worker-base/blob/master/Dockerfile which may be of some help.
And @roscopecoltran has built some Dockerfiles on alpine too in https://github.com/roscopecoltran/sniperkit-services which may be of some help too.
Finally it works in a Docker container! Thanks for the help @pombredanne
@spbkelt Thanks.... Glad it finally works: feel free to close this ticket then
Since all works, I am closing now.
GLIBC version is:
Tried to install manually
Still no luck...