fangohr / octopus-in-spack

Develop Octopus in spack (software packaging)
BSD 3-Clause "New" or "Revised" License

Failure of pipeline (October 2023) #96

Closed fangohr closed 5 months ago

fangohr commented 10 months ago

Using this issue to gather debug notes

fangohr commented 10 months ago

Turns out this originates from the netcdf dependency. Somehow the way we write it in the package does not work well with the new concretizer:

    with when("+mpi"):  # list all the parallel dependencies
        depends_on("netcdf-fortran ^netcdf-c+mpi", when="+netcdf")

I can work around this from the command line by using

spack spec octopus +mpi+parmetis+arpack+cgal+pfft+pnfft+python+likwid+libyaml+elpa+nlopt+etsf-io+sparskit+berkeleygw+nfft~debug~cuda~metis~scalapack^netcdf-fortran^netcdf-c+mpi 

instead of

spack spec octopus +mpi+parmetis+arpack+cgal+pfft+pnfft+python+likwid+libyaml+elpa+nlopt+etsf-io+sparskit+berkeleygw+nfft~debug~cuda~metis~scalapack+netcdf

I will try to change the Dockerfile in this repo to use the above fix. This seems to work locally (in the Docker container from this repo).

What we really should do, though (new issue?), is to improve the spack package file for Octopus. This may need the help of the spack team.

[Out of time here for now, leaving the notes to make the restart later easier.]

iamashwin99 commented 10 months ago

I can confirm that the spec fails with a not-so-helpful error message. I'm curious, however, about how you worked out that it was the netcdf one causing the error.

==> [2023-10-13-16:58:23.941671] Error: Spack concretizer internal error. Please submit a bug report and include the command, environment if applicable and the following error message.
    octopus+arpack+berkeleygw+cgal~cuda~debug+elpa+etsf-io+libyaml+likwid~metis+mpi+netcdf+nfft+nlopt+parmetis+pfft+pnfft+python~scalapack+sparskit is unsatisfiable, errors are:

Traceback (most recent call last):
  File "/scratch/karnada/spackbox/spacklatest/lib/spack/spack/cmd/__init__.py", line 218, in parse_specs
    spec.concretize(tests=tests)  # implies normalize
  File "/scratch/karnada/spackbox/spacklatest/lib/spack/spack/spec.py", line 2967, in concretize
    self._new_concretize(tests)
  File "/scratch/karnada/spackbox/spacklatest/lib/spack/spack/spec.py", line 2940, in _new_concretize
    result.raise_if_unsat()
  File "/scratch/karnada/spackbox/spacklatest/lib/spack/spack/solver/asp.py", line 506, in raise_if_unsat
    raise InternalConcretizerError(constraints, conflicts=conflicts)
spack.solver.asp.InternalConcretizerError: Spack concretizer internal error. Please submit a bug report and include the command, environment if applicable and the following error message.
    octopus+arpack+berkeleygw+cgal~cuda~debug+elpa+etsf-io+libyaml+likwid~metis+mpi+netcdf+nfft+nlopt+parmetis+pfft+pnfft+python~scalapack+sparskit is unsatisfiable, errors are:

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/scratch/karnada/spackbox/spacklatest/lib/spack/spack/main.py", line 1022, in main
    return _main(argv)
  File "/scratch/karnada/spackbox/spacklatest/lib/spack/spack/main.py", line 977, in _main
    return finish_parse_and_run(parser, cmd_name, env_format_error)
  File "/scratch/karnada/spackbox/spacklatest/lib/spack/spack/main.py", line 1005, in finish_parse_and_run
    return _invoke_command(command, parser, args, unknown)
  File "/scratch/karnada/spackbox/spacklatest/lib/spack/spack/main.py", line 646, in _invoke_command
    return_val = command(parser, args)
  File "/scratch/karnada/spackbox/spacklatest/lib/spack/spack/cmd/spec.py", line 101, in spec
    concretized_specs = spack.cmd.parse_specs(args.specs, concretize=True)
  File "/scratch/karnada/spackbox/spacklatest/lib/spack/spack/cmd/__init__.py", line 232, in parse_specs
    raise spack.error.SpackError(msg) from e
spack.error.SpackError: Spack concretizer internal error. Please submit a bug report and include the command, environment if applicable and the following error message.
    octopus+arpack+berkeleygw+cgal~cuda~debug+elpa+etsf-io+libyaml+likwid~metis+mpi+netcdf+nfft+nlopt+parmetis+pfft+pnfft+python~scalapack+sparskit is unsatisfiable, errors are:

fangohr commented 10 months ago

How I found out: it fails at computing the spec. So I tested the spec with all the variants (which included netcdf), and it failed. Then I tested without any variants, and it could compute the spec (let's call that a 'pass'). I then removed one variant after another and tried to compute the spec each time, until I removed the one that caused the problem (netcdf). I deleted variants from the back; of course netcdf was last. With a bit more focus one could have used a bisection approach and reduced the number of attempts to compute the spec.

I did a bit more testing to see if perhaps a combination of packages was creating the problem, but it didn't look like it. Once I looked into the package.py file, it seemed likely that netcdf caused the problem because its dependency specification was different from all the others.
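
For the record, the bisection idea could look roughly like the sketch below. It is only an illustration, not what was actually run: `find_culprit` and `spack_spec_fails` are hypothetical helper names, the sketch assumes that exactly one variant is responsible, and it keeps `+mpi` fixed because the suspicious dependency sits inside the `when("+mpi")` block. It also assumes that `spack spec` exits with a non-zero status when concretization fails.

```python
import subprocess
from typing import Callable, List, Sequence

# Enabled variants from the failing spec (+mpi is kept fixed, see below).
VARIANTS = [
    "+parmetis", "+arpack", "+cgal", "+pfft", "+pnfft", "+python",
    "+likwid", "+libyaml", "+elpa", "+nlopt", "+etsf-io", "+sparskit",
    "+berkeleygw", "+nfft", "+netcdf",
]


def spack_spec_fails(variants: Sequence[str]) -> bool:
    """Hypothetical predicate: True if `spack spec` cannot concretize.

    Assumption: a failed concretization gives a non-zero exit status.
    """
    spec = "octopus+mpi" + "".join(variants)
    result = subprocess.run(["spack", "spec", spec], capture_output=True)
    return result.returncode != 0


def find_culprit(
    variants: Sequence[str],
    fails: Callable[[Sequence[str]], bool],
) -> str:
    """Return the one variant that makes `fails` return True.

    Assumes a single culprit: a subset fails if and only if it contains it.
    """
    candidates: List[str] = list(variants)
    if not fails(candidates):
        raise ValueError("the full variant list does not fail")
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        # Keep whichever half still reproduces the failure.
        candidates = half if fails(half) else candidates[len(half):]
    return candidates[0]
```

Calling `find_culprit(VARIANTS, spack_spec_fails)` would then need about log2(n) concretizations instead of up to n. If two variants only fail in combination, the single-culprit assumption breaks and the result is not meaningful.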

iamashwin99 commented 10 months ago

Ah! That's the most obvious way but extremely tedious. I'm wondering if in the CI we can somehow ask spack to test all possible combinations of variants (or at least a subset) in a simple way (less code). This shouldn't take too long, as the specs are cached, and incrementally adding variants should perhaps not take longer than concretizing the full spec at once (or at least should be comparable).
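
Something like the following is what I have in mind for the incremental variant check. It is a rough sketch under assumptions, not tested against our CI: `incremental_specs`, `first_failing_spec` and `spack_concretizes` are made-up helper names, and I am assuming `spack spec` returns a non-zero exit status when the spec cannot be concretized. It only tries n specs for n variants (adding one at a time), not all 2^n combinations.

```python
import subprocess
from typing import Callable, Iterable, Iterator, Optional, Sequence


def incremental_specs(package: str, variants: Sequence[str]) -> Iterator[str]:
    """Yield specs that add one variant at a time: pkg+a, pkg+a+b, ..."""
    spec = package
    for variant in variants:
        spec += variant
        yield spec


def spack_concretizes(spec: str) -> bool:
    """Hypothetical check: True if `spack spec` succeeds for this spec.

    Assumption: a failed concretization gives a non-zero exit status.
    """
    result = subprocess.run(["spack", "spec", spec], capture_output=True)
    return result.returncode == 0


def first_failing_spec(
    specs: Iterable[str],
    concretizes: Callable[[str], bool],
) -> Optional[str]:
    """Return the first spec that does not concretize, or None if all do."""
    for spec in specs:
        if not concretizes(spec):
            return spec
    return None
```

In the CI this could be called as `first_failing_spec(incremental_specs("octopus", ["+mpi", "+netcdf"]), spack_concretizes)`; the last variant in the returned spec is the one whose addition broke the concretization.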

fangohr commented 10 months ago

Can you approve it (or review it)? Would be good to request this change upstream before the next spack release ;-).

fangohr commented 10 months ago

No need for a more systematic test: this kind of failure is unlikely, and the existing CI told us that something was broken (so it worked).