EESSI / software-layer

Software layer of the EESSI project
https://eessi.github.io/docs/software_layer
GNU General Public License v2.0

CI issues for Archspec release 0.2.3 #483

Open ocaisa opened 7 months ago

ocaisa commented 7 months ago

Release 0.2.3 of archspec causes a failure in our CI:

Traceback (most recent call last):
  File "./eessi_software_subdir.py", line 7, in <module>
    from archspec.cpu.detect import compatible_microarchitectures, raw_info_dictionary
ImportError: cannot import name 'raw_info_dictionary' from 'archspec.cpu.detect' (/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/archspec/cpu/detect.py)

This is because raw_info_dictionary has been replaced by a better alternative in archspec. @alalazo suggested the following replacement:

from archspec.cpu import host

VENDOR_MAP = {
    'GenuineIntel': 'intel',
    'AuthenticAMD': 'amd',
}

def det_host_triple():
    """
    Determine host triple: (<cpu_family>, <cpu_vendor>, <cpu_name>).
    <cpu_vendor> may be None if there's no match in VENDOR_MAP.
    """
    host_cpu = host()
    host_vendor = VENDOR_MAP.get(host_cpu.vendor)
    host_cpu_family = host_cpu.family.name
    host_cpu_name = host_cpu.name
    return host_cpu_family, host_vendor, host_cpu_name

which should work with both old and new archspec releases.
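For illustration, here is a minimal sketch of how the triple returned by det_host_triple() could be turned into a software subdirectory string such as x86_64/intel/icelake. The helper name triple_to_subdir is hypothetical, and dropping the vendor component when it is None is an assumption, not necessarily what the EESSI scripts do.

```python
def triple_to_subdir(cpu_family, cpu_vendor, cpu_name):
    """Join the non-empty parts of a host triple into a relative path.

    Hypothetical helper: the vendor part is skipped when it is None
    (assumption; the real EESSI scripts may handle this differently).
    """
    parts = [cpu_family, cpu_vendor, cpu_name]
    return "/".join(part for part in parts if part)


# Values as they would come back from det_host_triple() on an Ice Lake host.
print(triple_to_subdir("x86_64", "intel", "icelake"))
```

With archspec installed, the call would be triple_to_subdir(*det_host_triple()).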

When looking into the replacement I noticed we have heavily overlapping code in https://github.com/EESSI/software-layer/blob/2023.06-software.eessi.io/init/eessi_software_subdir_for_host.py and https://github.com/EESSI/software-layer/blob/2023.06-software.eessi.io/eessi_software_subdir.py, and I suspect that the latter should be removed or replaced.

ocaisa commented 7 months ago

So, it seems we are still using archspec via the ancient https://github.com/EESSI/software-layer/blob/2023.06-software.eessi.io/eessi_software_subdir.py to determine the software subdirectory for the build host (see https://github.com/EESSI/software-layer/blob/2023.06-software.eessi.io/EESSI-install-software.sh#L137).

ocaisa commented 7 months ago

(A temporary fix for the underlying issue has been merged in #482. The updated archspec release only affects CI, as we ship an older version in EESSI.)

ocaisa commented 7 months ago

So, it's not entirely clear to me which behaviour we want:

{EESSI 2023.06} ocaisa@LAPTOP-O6HF2IKC:~/software-layer$ python ./eessi_software_subdir.py
x86_64/intel/icelake
{EESSI 2023.06} ocaisa@LAPTOP-O6HF2IKC:~/software-layer$ $EESSI_INIT_DIR_PATH/eessi_archdetect.sh  cpupath
x86_64/intel/skylake_avx512
{EESSI 2023.06} ocaisa@LAPTOP-O6HF2IKC:~/software-layer$ python $EESSI_INIT_DIR_PATH/eessi_software_subdir_for_host.py $EESSI_PREFIX
x86_64/intel/skylake_avx512

So right now, our install script works because we make sure that we run on a machine that returns the value we expect. To me this looks like the right behaviour for now: we want the strictness of archspec in this scenario so that we don't accidentally build on a more capable node and end up with executables that only work on a subset of architectures. At runtime, using archdetect is OK, as we are mapping from what the host has to what we can deliver.
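The two policies can be sketched roughly as follows. This is a toy illustration rather than the actual EESSI or archspec logic: the compatibility chain and the set of supported targets are hand-written and simplified, and both function names are hypothetical.

```python
# Illustrative only: a simplified, most-specific-first compatibility chain
# for an Ice Lake host. Real chains come from archspec, not from this list.
ICELAKE_CHAIN = ["icelake", "skylake_avx512", "skylake", "haswell"]

# Hypothetical set of CPU targets for which a software stack exists.
SUPPORTED_TARGETS = {"haswell", "skylake_avx512"}


def runtime_target(host_chain, supported):
    """Runtime policy: pick the most specific supported target in the chain."""
    for name in host_chain:
        if name in supported:
            return name
    return None


def check_build_host(detected, expected):
    """Build-time policy: refuse to build unless the host matches exactly."""
    if detected != expected:
        raise RuntimeError(f"build host is {detected!r}, expected {expected!r}")
    return detected


# At runtime, an Ice Lake host is mapped down to what can be delivered.
print(runtime_target(ICELAKE_CHAIN, SUPPORTED_TARGETS))
```

Under these assumptions, check_build_host("icelake", "skylake_avx512") raises, which mirrors why the install script only works on a machine that reports the expected name.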

ocaisa commented 7 months ago

PR to fix all this is in #485