Open pbisbal1 opened 8 months ago
As a workaround, I've done this which seems to work. In etc/spack/packages.yaml I added these lines to make intel-oneapi-mkl the default for blas and lapack:
packages:
  all:
    providers:
      blas: [intel-oneapi-mkl, amdblis]
      lapack: [intel-oneapi-mkl, amdlibflame]
This works in my test environment (the spack.yaml shown above). I haven't tested it in my production environment yet. Besides providing a usable workaround, this also seems to confirm that the problem lies in using 'intel-oneapi-mkl' in a where: statement.
Since this is an AMD-based cluster, I'd prefer being able to make amdblis/amdlibflame the defaults and make intel-oneapi-mkl the exception when compiling with oneapi.
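For reference, the AMD-first ordering would look roughly like the sketch below. It is just the workaround above with the provider lists reversed. On its own it only changes the global default; it does nothing to select intel-oneapi-mkl for %oneapi builds, which is the part that still needs a per-compiler rule.

```yaml
# Sketch only: same packages.yaml structure as the workaround,
# with the AMD libraries listed first so they become the default providers.
packages:
  all:
    providers:
      blas: [amdblis, intel-oneapi-mkl]
      lapack: [amdlibflame, intel-oneapi-mkl]
```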
Prentice
Correction to that workaround... The concretizer seems to concretize everything when using that workaround, but things are NOT being concretized as desired. hpl%oneapi is being concretized with amdblis as the blas provider instead of intel-oneapi-mkl.
If I run spack install, the installation doesn't complete, but no errors are shown.
If you update to https://github.com/spack/spack/commit/e78484f501178cb71be363a78762c237240816aa, the errors will no longer be silent.
(That commit won't actually make the concretization succeed, but it will prevent the failure from being silent.)
https://github.com/spack/spack/issues/43475 (once a PR is created for it) should make use cases like this easier.
Steps to reproduce
I'm trying to have spack use intel-oneapi-mkl as the provider for BLAS and LAPACK when the compiler is oneapi, and amdblis and amdlibflame when using aocc or gcc. With @becker33's help I arrived at the following solution (only the BLAS section is shown; the same arrangement applies to LAPACK):
When I do spack concretize -f, the concretizer appears to complete w/o error, but when you look at the packages concretized, a number of them are missing. If I run spack install, the installation doesn't complete, but no errors are shown. I just notice that if I install, say, 100 packages, the last package installed will say something like [77/100], indicating that spack stopped installing before all 100 packages were installed, but there are no obvious error messages.
If I comment out the lines above pertaining to intel-oneapi like this:
The concretize/install process works as expected, but the packages compiled with %oneapi aren't using the desired BLAS provider (openblas is used instead, most likely because that's the default provider in etc/spack/defaults/packages.yaml).
Here is a minimal spack.yaml I'm using to reproduce the problem:
Error message
Without using spack -d, there are no obvious error messages from the concretizer. The only way to see an error is to look at the concretizer output and check which packages were concretized. When I run the concretizer with -d I see these messages:
But I see similar errors even when I comment out the %oneapi lines, so I don't think they're related to this problem. I've attached the output of running spack -d concretize -f in both cases for you to look at: concretizer_debug_output_w_oneapi.txt concretizer_debug_output_wo_oneapi.txt
Information on your system
General information
I have run spack debug report and reported the version of Spack/Python/Platform