Open xinshengqin opened 4 years ago
Updated. How does this look like?
Setting
CUDA_VER ?= $(CUDA_VERSION)
means to set CUDA_VER
to the value of the environment variable CUDA_VERSION
only if CUDA_VER
is itself not set as an environment variable. This doesn't seem like what you want since the user wouldn't have two such environment variables.
Maybe you want to first set CUDA_VERSION
to some particular version if you are on certain machines and then the above command would over-ride this if the user set an environment variable (which hopefully they did if they are on a different machine than any listed).
And then always do
ALL_FFLAGS += -Mcuda=$(CUDA_VER),maxregcount:80 -Mcuda=lineinfo -Mcuda=rdc
so that the value set as CUDA_VER
is always used regardless of what machine.
Is there not a canonical way to do this? I looked a bit and it looks like you can run nvcc --version
as mentioned here.
Is there not a canonical way to do this? I looked a bit and it looks like you can run
nvcc --version
as mentioned here.
One can do nvcc --version to get something like
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
But then you have to parse these strings to get what you want. In this case it's "10.0".
Also the cuda compiler does not define an environment variable like CUDA_VERSION for us.
What @rjleveque suggested looks like a simple approach with the caveat that user will need to set this in their bashrc file.
What option do you guys think we should use, 1) parse the string return by "nvcc --version"; 2) use environment variable set by the user?
I don't think we should try to parse the version string ourselves. Setting a variable with CUDA_VERSION ?=
seems like the best approach to me, setting it to some default value (maybe dependent on machine although that's not useful to most users). And then the user can either set an environment variable to over-ride this, or just set CUDA_VERSION = ...
in their own application Makefile which would also over-ride the default set in Makefile.common.gpu
, I believe.
One way or another I think the user is responsible for setting this and we should make it easy for them to do so in their own Makefile
and/or as an environment variable without having to modify Makefile.common.gpu
.
Also it looks like there are some other things on the same ALL_FLAGS line as cuda10.1 that are machine-dependent, e.g. cc35 vs cc60? What is this and does it also need to be an environment variable?
Sounds good. I will go ahead to use this approach. cc35 and cc60 are "compute capabilities". Those can be interpreted as version numbers that indicate what features are supported. Different generations of GPU has different compute capabilities.
More details can be seen at: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#compute-capabilities
I updated the makefile to use CUDA_VERSION and CUDA_CC.
CUDA_VERSION should be set to the version "nvcc --version" returns. CUDA_CC should be set to the compute capability supported by the machine you compile this code on.
The users are expected to set the two environment variables. They will be passed to the compiler as compilation flags: ALL_FFLAGS += -Mcuda=cuda$(CUDA_VERSION),$(CUDA_CC)
I would be a bit cautious about adding this to something like Makefile.common
even thought it is in a “GPU” version. Why not add those variables to the local makefile?
Maybe we should have the application point to a new common Makefile for the GPU, e.g.
CLAWMAKE = $(CLAW)/clawutil/src/Makefile.common.gpu
so there could be a common one for gpu applications without messing with the standard one.
Or maybe better to just have in the application Makefile as @mandli suggests, which might make it easier for the user to see exactly what's being set and modify it appropriately.
You mentioned two options for this case: 1) have the application point to a new common Makefile for the GPU. Users need to set variables like CUDA_PATH in their bash environment. 2) have user set variables like CUDA_PATH in their local Makefile.
For option 1, users don't have to set it multiple times for every example. Option 2 is easier for the user to see what's being set or what they need to set.
I can't come up with other reasonings here to compare the two. But this is an important design decision to make.
Any thoughts on which of the two option we should choose?
I tend to use the CUDA_PATH ?= default
in the local Makefiles for times like this where you can provide a default value but also override it in the environment.
Just updated it to use variables set in local Makefiles. Relevant changes are also made in: https://github.com/clawpack/amrclaw/pull/259 https://github.com/clawpack/geoclaw/pull/438
Rather than hardwiring in different CUDA versions for different computer names, would it be better to have a variable
CUDA_VERSION
or some such thing that a user could set as an environment variable? Most potential users won't have access to doppio or other machines listed in the Makefile.