Open dithwick opened 3 years ago
@dithwick Can you run this with eb --trace, so we can see for which extension the problem is popping up?
Fixing this should be done in the framework BTW. This doesn't seem to be specific to TensorFlow at all, since the TensorFlow easyblock does not show up in the traceback...
@boegel Yeah sure. Does that work with the upload test report option or should I just do it manually?
Not sure if this is quite what you were after, but here is the output: https://gist.github.com/dithwick/9e0a69e6271ee4474cce0ee9f1588b31
Hi,
I've just had this same problem again but with PyTorch-1.7.1-fosscuda-2020b.eb and at the sanity check stage this time:
== building and installing MPI/GCC-CUDA/10.2.0-11.1.1/OpenMPI/4.0.5/PyTorch/1.7.1...
== fetching files...
== creating build dir, resetting environment...
== unpacking...
== patching...
== preparing...
== configuring...
== building...
== testing...
== installing...
== taking care of extensions...
== restore after iterating...
== postprocessing...
== sanity checking...
ERROR: Traceback (most recent call last):
File "/scrtp/avon/eb/software/EasyBuild/4.4.0/lib/python3.6/site-packages/easybuild/main.py", line 117, in build_and_install_software
(ec_res['success'], app_log, err) = build_and_install_one(ec, init_env)
File "/scrtp/avon/eb/software/EasyBuild/4.4.0/lib/python3.6/site-packages/easybuild/framework/easyblock.py", line 3633, in build_and_install_one
result = app.run_all_steps(run_test_cases=run_test_cases)
File "/scrtp/avon/eb/software/EasyBuild/4.4.0/lib/python3.6/site-packages/easybuild/framework/easyblock.py", line 3531, in run_all_steps
self.run_step(step_name, step_methods)
File "/scrtp/avon/eb/software/EasyBuild/4.4.0/lib/python3.6/site-packages/easybuild/framework/easyblock.py", line 3386, in run_step
step_method(self)()
File "/scrtp/avon/eb/software/EasyBuild/4.4.0/lib/python3.6/site-packages/easybuild/easyblocks/p/pytorch.py", line 263, in sanity_check_step
super(EB_PyTorch, self).sanity_check_step(*args, **kwargs)
File "/scrtp/avon/eb/software/EasyBuild/4.4.0/lib/python3.6/site-packages/easybuild/easyblocks/generic/pythonpackage.py", line 836, in sanity_check_step
fake_mod_data = self.load_fake_module(purge=True)
File "/scrtp/avon/eb/software/EasyBuild/4.4.0/lib/python3.6/site-packages/easybuild/framework/easyblock.py", line 1561, in load_fake_module
fake_mod_path = self.make_module_step(fake=True)
File "/scrtp/avon/eb/software/EasyBuild/4.4.0/lib/python3.6/site-packages/easybuild/framework/easyblock.py", line 3162, in make_module_step
txt += self.make_module_dep()
File "/scrtp/avon/eb/software/EasyBuild/4.4.0/lib/python3.6/site-packages/easybuild/framework/easyblock.py", line 1181, in make_module_dep
full_mod_subdir, all_deps)
File "/scrtp/avon/eb/software/EasyBuild/4.4.0/lib/python3.6/site-packages/easybuild/tools/modules.py", line 1121, in path_to_top_of_module_tree
modpath_exts = dict([(k, v) for k, v in self.modpath_extensions_for(deps).items() if v])
File "/scrtp/avon/eb/software/EasyBuild/4.4.0/lib/python3.6/site-packages/easybuild/tools/modules.py", line 1053, in modpath_extensions_for
modtxt = self.read_module_file(mod_name)
File "/scrtp/avon/eb/software/EasyBuild/4.4.0/lib/python3.6/site-packages/easybuild/tools/modules.py", line 983, in read_module_file
modfilepath = self.modulefile_path(mod_name)
File "/scrtp/avon/eb/software/EasyBuild/4.4.0/lib/python3.6/site-packages/easybuild/tools/modules.py", line 748, in modulefile_path
modpath = self.get_value_from_modulefile(mod_name, modpath_re)
File "/scrtp/avon/eb/software/EasyBuild/4.4.0/lib/python3.6/site-packages/easybuild/tools/modules.py", line 726, in get_value_from_modulefile
if self.exist([mod_name], skip_avail=True)[0]:
File "/scrtp/avon/eb/software/EasyBuild/4.4.0/lib/python3.6/site-packages/easybuild/tools/modules.py", line 627, in exist
mod_exists = mod_exists_via_show(mod_name)
File "/scrtp/avon/eb/software/EasyBuild/4.4.0/lib/python3.6/site-packages/easybuild/tools/modules.py", line 563, in mod_exists_via_show
stderr = self.show(mod_name)
File "/scrtp/avon/eb/software/EasyBuild/4.4.0/lib/python3.6/site-packages/easybuild/tools/modules.py", line 711, in show
ans = self.run_module('show', mod_name, check_output=False, return_stderr=True)
File "/scrtp/avon/eb/software/EasyBuild/4.4.0/lib/python3.6/site-packages/easybuild/tools/modules.py", line 824, in run_module
(stdout, stderr) = proc.communicate()
File "/usr/lib64/python3.6/subprocess.py", line 863, in communicate
stdout, stderr = self._communicate(input, endtime, timeout)
File "/usr/lib64/python3.6/subprocess.py", line 1578, in _communicate
self.stderr.errors)
File "/usr/lib64/python3.6/subprocess.py", line 760, in _translate_newlines
data = data.decode(encoding, errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 549: ordinal not in range(128)
which looks to me like it is failing at the same place. I assume it will build fine with an empty LC_ALL (LC_ALL=). I need the module now anyway, so I'll test and report back.
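For anyone trying to reproduce the failure outside EasyBuild: the last frame of the traceback above decodes the child process's stderr with the 'ascii' codec and hits byte 0xe2. That byte is the usual first byte of a three-byte UTF-8 sequence such as a curly apostrophe, so a non-ASCII character in some module file is a likely (but unconfirmed) trigger. The snippet below is only a minimal sketch of that decode step; the sample text and the helper names are made up for illustration and are not taken from EasyBuild.

```python
# Hypothetical sample: text containing a curly apostrophe (U+2019), which
# UTF-8 encodes as the three bytes e2 80 99.
raw_stderr = "module-whatis: Python\u2019s package".encode("utf-8")


def decode_strict_ascii(data):
    """Decode the way an ASCII-only locale would: fails on any byte >= 0x80."""
    return data.decode("ascii")


def decode_tolerant(data):
    """Decode as UTF-8 and substitute undecodable bytes instead of raising."""
    return data.decode("utf-8", errors="replace")


try:
    decode_strict_ascii(raw_stderr)
    failure = None
except UnicodeDecodeError as err:
    # Same kind of error as in the traceback: 'ascii' codec, byte 0xe2.
    failure = str(err)

print(failure)
print(decode_tolerant(raw_stderr))
```

The strict variant raises the same kind of UnicodeDecodeError as in the traceback, while the tolerant variant returns the text intact.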
TensorFlow again (this time 2.5.0). I'll post the error output here in case it helps identify where the problem is cropping up:
== processing EasyBuild easyconfig /sulis/easybuild/software/EasyBuild/4.4.1/easybuild/easyconfigs/t/TensorFlow/TensorFlow-2.5.0-foss-2020b.eb
== building and installing MPI/GCC/10.2.0/OpenMPI/4.0.5/TensorFlow/2.5.0...
== fetching files...
== creating build dir, resetting environment...
== unpacking...
== patching...
== preparing...
== ... (took 6 secs)
== configuring...
== building...
== testing...
== installing...
== taking care of extensions...
== ... (took 5 secs)
ERROR: Traceback (most recent call last):
File "/sulis/easybuild/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/main.py", line 118, in build_and_install_software
(ec_res['success'], app_log, err) = build_and_install_one(ec, init_env)
File "/sulis/easybuild/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/framework/easyblock.py", line 3691, in build_and_install_one
result = app.run_all_steps(run_test_cases=run_test_cases)
File "/sulis/easybuild/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/framework/easyblock.py", line 3582, in run_all_steps
self.run_step(step_name, step_methods)
File "/sulis/easybuild/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/framework/easyblock.py", line 3435, in run_step
step_method(self)()
File "/sulis/easybuild/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/easyblocks/generic/pythonbundle.py", line 128, in extensions_step
super(PythonBundle, self).extensions_step(*args, **kwargs)
File "/sulis/easybuild/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/framework/easyblock.py", line 2379, in extensions_step
fake_mod_data = self.load_fake_module(purge=True, extra_modules=build_dep_mods)
File "/sulis/easybuild/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/framework/easyblock.py", line 1567, in load_fake_module
fake_mod_path = self.make_module_step(fake=True)
File "/sulis/easybuild/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/framework/easyblock.py", line 3205, in make_module_step
txt += self.make_module_dep()
File "/sulis/easybuild/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/framework/easyblock.py", line 1187, in make_module_dep
full_mod_subdir, all_deps)
File "/sulis/easybuild/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/tools/modules.py", line 1121, in path_to_top_of_module_tree
modpath_exts = dict([(k, v) for k, v in self.modpath_extensions_for(deps).items() if v])
File "/sulis/easybuild/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/tools/modules.py", line 1053, in modpath_extensions_for
modtxt = self.read_module_file(mod_name)
File "/sulis/easybuild/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/tools/modules.py", line 983, in read_module_file
modfilepath = self.modulefile_path(mod_name)
File "/sulis/easybuild/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/tools/modules.py", line 748, in modulefile_path
modpath = self.get_value_from_modulefile(mod_name, modpath_re)
File "/sulis/easybuild/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/tools/modules.py", line 726, in get_value_from_modulefile
if self.exist([mod_name], skip_avail=True)[0]:
File "/sulis/easybuild/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/tools/modules.py", line 627, in exist
mod_exists = mod_exists_via_show(mod_name)
File "/sulis/easybuild/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/tools/modules.py", line 563, in mod_exists_via_show
stderr = self.show(mod_name)
File "/sulis/easybuild/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/tools/modules.py", line 711, in show
ans = self.run_module('show', mod_name, check_output=False, return_stderr=True)
File "/sulis/easybuild/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/tools/modules.py", line 824, in run_module
(stdout, stderr) = proc.communicate()
File "/usr/lib64/python3.6/subprocess.py", line 863, in communicate
stdout, stderr = self._communicate(input, endtime, timeout)
File "/usr/lib64/python3.6/subprocess.py", line 1578, in _communicate
self.stderr.errors)
File "/usr/lib64/python3.6/subprocess.py", line 760, in _translate_newlines
data = data.decode(encoding, errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 593: ordinal not in range(128)
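Both tracebacks end in subprocess's _translate_newlines, which suggests the output of the module command is being decoded using the locale's encoding. One possible way to sidestep that, sketched below, is to capture the output as raw bytes and decode it explicitly. This is an illustration under assumptions, not the actual or proposed EasyBuild fix: run_and_decode and child_cmd are hypothetical names, and the child process merely stands in for a module show call.

```python
import subprocess
import sys

# Hypothetical stand-in for a `module show` call: a child process that
# writes a UTF-8 encoded curly apostrophe (bytes e2 80 99) to stderr.
child_cmd = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.buffer.write(b'whatis: it\\xe2\\x80\\x99s here\\n')",
]


def run_and_decode(cmd):
    """Run cmd, capture stdout/stderr as bytes, then decode them explicitly.

    Without universal_newlines/text mode, communicate() returns bytes, so the
    locale (e.g. LC_ALL=POSIX) plays no part in the decoding done here.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = proc.communicate()
    return (
        stdout.decode("utf-8", errors="replace"),
        stderr.decode("utf-8", errors="replace"),
    )


out, err = run_and_decode(child_cmd)
print(repr(err))
```

Using errors="replace" trades exactness for robustness: bytes that are not valid UTF-8 become U+FFFD rather than aborting the build, which seems acceptable when the output is only scanned for module paths.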
This is on CentOS 8.3 with LC_ALL=POSIX. When running:
$ eb TensorFlow-2.4.1-fosscuda-2020b.eb -r
I get an error. The error goes away if I run with:
$ LC_ALL= eb TensorFlow-2.4.1-fosscuda-2020b.eb -r
I've had this problem before, for example in issue #2393, but in that case I updated the numpy easyblock following advice from @boegel. For this particular bug I'm not sure where the issue is occurring.