E3SM-Project / polaris

Testing and analysis for OMEGA, MPAS-Ocean, MALI and MPAS-Seaice
BSD 3-Clause "New" or "Revised" License

Fix deployment on unknown machines #214

Closed xylar closed 3 months ago

xylar commented 3 months ago

This merge updates versions of ESMF, ParallelIO and PNetCDF for conda environments. These were needed for compatibility with the required version of MPAS-Tools. (It won't affect spack deployments until we update to 1.5.0-alpha.1.) As a result, the Polaris alpha version has been bumped to 2.

This merge also fixes a check for a software spack environment in the bootstrap stage of deployment. The check is only performed if we are on a "known" machine.
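As a rough illustration of the gating described above, here is a minimal sketch, not the actual code in deploy/bootstrap.py. The function name `check_spack_software_env` and its arguments `machine` and `spack_base` are hypothetical, and the sketch assumes an "unknown" machine is represented by `None`.

```python
import os
from typing import Optional


def check_spack_software_env(machine: Optional[str],
                             spack_base: Optional[str]) -> bool:
    """Hypothetical helper: return True if the check ran, False if skipped."""
    if machine is None:
        # Assumption: an "unknown" machine has no shared spack software
        # environment, so there is nothing to check.
        return False
    if spack_base is None or not os.path.isdir(spack_base):
        # On a "known" machine a missing spack base path is an error.
        raise ValueError(
            f'No spack software environment found for machine "{machine}"')
    return True
```

With this shape, a laptop (where `machine` is `None`) skips the check rather than failing, while a known machine with a missing path still fails early.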

GitHub Actions has been updated to deploy a full conda environment, including compilers and MPI support. This should catch incompatibilities like the one that emerged between MPAS-Tools and other dependencies specified in deploy/default.cfg.

Checklist

fixes #213

xylar commented 3 months ago

Testing

I was able to deploy on my Linux laptop (an "unknown" machine) with this fix, whereas deployment failed with the error in #213 without these changes.

xylar commented 3 months ago

@altheaden, could you test with your Mac using this branch? If you run into the progressbar error again, could you post an issue about that?

altheaden commented 3 months ago

On the Mac, I am still getting an error, although deployment progresses further than it did before these changes. The error I am seeing is this:

creating polaris-test

 Running:
   source /Users/althea/miniconda3/etc/profile.d/conda.sh
   conda activate
   conda create -y -n polaris-test --override-channels -c conda-forge -c defaults -c e3sm/label/polaris --file spec-file-mpich.txt python=3.11

Channels:
 - conda-forge
 - defaults
 - e3sm/label/polaris
Platform: osx-arm64
Collecting package metadata (repodata.json): done
Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

  - moab==5.5.1[build=*_tempest_*]

Current channels:

  - https://conda.anaconda.org/conda-forge
  - defaults
  - https://conda.anaconda.org/e3sm/label/polaris

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.

Traceback (most recent call last):
  File "/Users/althea/lanl/polaris/main/deploy/bootstrap.py", line 1251, in <module>
    main()
  File "/Users/althea/lanl/polaris/main/deploy/bootstrap.py", line 1154, in main
    build_conda_env(
  File "/Users/althea/lanl/polaris/main/deploy/bootstrap.py", line 318, in build_conda_env
    check_call(commands, logger=logger)
  File "/Users/althea/lanl/polaris/main/deploy/shared.py", line 153, in check_call
    raise subprocess.CalledProcessError(process.returncode, commands)
subprocess.CalledProcessError: Command 'source /Users/althea/miniconda3/etc/profile.d/conda.sh && conda activate && conda create -y -n polaris-test --override-channels -c conda-forge -c defaults -c e3sm/label/polaris --file spec-file-mpich.txt python=3.11' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/Users/althea/lanl/polaris/main/./configure_polaris_envs.py", line 133, in <module>
    main()
  File "/Users/althea/lanl/polaris/main/./configure_polaris_envs.py", line 84, in main
    _bootstrap(activate_install_env, source_path, local_conda_build)
  File "/Users/althea/lanl/polaris/main/./configure_polaris_envs.py", line 129, in _bootstrap
    check_call(command)
  File "/Users/althea/lanl/polaris/main/deploy/shared.py", line 153, in check_call
    raise subprocess.CalledProcessError(process.returncode, commands)
subprocess.CalledProcessError: Command 'source /Users/althea/miniconda3/etc/profile.d/conda.sh && conda activate polaris_bootstrap && /Users/althea/lanl/polaris/main/deploy/bootstrap.py --conda /Users/althea/miniconda3 -c clang -i mpich --env_name polaris-test --recreate -f example.cfg --verbose' returned non-zero exit status 1.

altheaden commented 3 months ago

It looks like my Mac error is related to arm64 Macs, as described here, which is not something that we are going to prioritize for polaris right now. @xylar, do you want me to do any additional testing for this PR?
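For anyone who wants to detect this case before the conda solve fails, here is a minimal sketch of a fail-fast platform check. It is not part of Polaris: the names `conda_subdir`, `ensure_supported` and `UNSUPPORTED_SUBDIRS` are hypothetical, and treating osx-arm64 as unsupported is an assumption based only on the `Platform: osx-arm64` line and the missing `moab` build in the log above.

```python
import platform
from typing import Optional

# Assumption: platforms for which required packages are not yet available.
UNSUPPORTED_SUBDIRS = {"osx-arm64"}

# Map (platform.system(), platform.machine()) to a conda platform name.
_SUBDIRS = {
    ("Linux", "x86_64"): "linux-64",
    ("Linux", "aarch64"): "linux-aarch64",
    ("Darwin", "x86_64"): "osx-64",
    ("Darwin", "arm64"): "osx-arm64",
}


def conda_subdir(system: Optional[str] = None,
                 machine: Optional[str] = None) -> Optional[str]:
    """Return the conda platform name, or None if it is not in the table."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return _SUBDIRS.get((system, machine))


def ensure_supported(subdir: Optional[str]) -> None:
    """Raise a readable error instead of letting the conda solve fail."""
    if subdir in UNSUPPORTED_SUBDIRS:
        raise RuntimeError(
            f"Deployment is not currently supported on {subdir}")
```

Calling `ensure_supported(conda_subdir())` at the start of deployment would replace the long traceback with a one-line message on arm64 Macs.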

xylar commented 3 months ago

@altheaden, no, I don't think it makes sense for you to spend any more time on this since it isn't working on either of your Macs. We can come back to this and add osx-arm64 support down the road, but not today. I'll finish testing on my Mac and call it good.

xylar commented 3 months ago

OSX Testing

I also was able to run the deployment script successfully on OSX with the latest changes.

xylar commented 3 months ago

Thanks, @altheaden!