NCAR / spack-gust

Spack production user software stack on the Gust test system

Request installation of nvhpc/22.2 on Gust #14

Closed sjsprecious closed 1 year ago

sjsprecious commented 2 years ago

I encountered an internal compiler error when using nvhpc/22.7 to compile my GPU code on Gust.

The nvhpc/21.3 compiler does not support the AMD Zen 3 architecture.

Is it possible to install nvhpc/22.2 on Gust, which I know works with my code on Casper?

jedwards4b commented 2 years ago

Can you provide information about the error to open a ticket with nvhpc?
Also, the release notes include a number of changes to default flag settings. I wonder if setting those back to their previous values would solve the issue?

sjsprecious commented 2 years ago

Thanks @jedwards4b. I actually observed a similar error when using nvhpc/22.5 on Casper and thought nvhpc/22.7 could solve it, but it did not. That is why I would like to try nvhpc/22.2 on Gust since I know it works on Casper at least.

I will report this bug to the NVIDIA team later.

vanderwb commented 2 years ago

Thanks for doing that, Jian. I would guess it will not get fixed very quickly otherwise, unless it is a very common problem.

I will likely need to build a new software stack and switch over to it this weekend, so I will look at getting it on there either with that new stack of installs or shortly after.

sjsprecious commented 2 years ago

Thanks @vanderwb. That sounds good!

vanderwb commented 1 year ago

Hi @sjsprecious. nvhpc/22.2 is available in the new software stack. It is not the default yet, but you can switch to it as follows:

module load ncarenv/22.08b
module load nvhpc/22.2

sjsprecious commented 1 year ago

Hi @vanderwb, thanks for your help. I am now able to build my GPU code on Gust with nvhpc/22.2.

However, I got a runtime error when I ran the nvhpc/22.2 build on two GPU nodes. Is it related to the MPI/GPU/OS issues mentioned in #12?

johnmauff commented 1 year ago

@sjsprecious Thanks for trying the new nvhpc/22.2 compiler install. It is unfortunate that we are still not able to utilize the GPUs on Gust. @vanderwb, to the best of my knowledge we have no workaround at the moment that would enable the CM1 ASD project to proceed.

vanderwb commented 1 year ago

Understood @johnmauff. Once Jared gets the compute node OS updated (likely during today's maintenance), we will ping you to try again.

vanderwb commented 1 year ago

Closing since this install is completed.