Closed dmargala closed 4 years ago
The fastest runtime is in the 20ms range, is the plan to have multiple patches? I'm happier when I see the fastest implementation have enough work to take a few seconds. Maybe that's a separate test?
Thanks.
@rcthomas A typical full frame extraction with this patch size would require roughly 4700 patches. The current benchmark for a full frame extraction with this patch size on a Cori Haswell node with 32 MPI ranks is around 1 min.
@lastephey Those are good points. Currently, the only entry point to the cupy-enabled functions is through the test suite, which will have a problem if it's run on a non-gpu node in an environment with cupy and numba.cuda. A simple check can be added to the try-except import block to solve that problem.
This PR does not change the default behavior of the code. A developer would have to explicitly import a function from the `gpu_specter.extract.gpu` module or from the `gpu_specter.extract.both` module and pass arguments of the appropriate array type. I have a hard time imagining a developer would ever want to completely replace numpy with cupy in a heterogeneous computing environment, so I agree that we would not want to do that.
I think the idea is that a user would pass the `--gpu` argument to `spex` to specify that they want to use the gpu version (like how the `--mpi` argument works).
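For illustration, here is a minimal sketch of how such a flag could be declared with `argparse`; the actual `spex` command-line interface may differ:

```python
# Hypothetical sketch of declaring a --gpu flag alongside --mpi.
# The real spex argument parser may look different.
import argparse

parser = argparse.ArgumentParser(prog="spex")
parser.add_argument("--gpu", action="store_true",
                    help="use the GPU implementation of the extraction")
parser.add_argument("--mpi", action="store_true",
                    help="parallelize with MPI")

args = parser.parse_args(["--gpu"])
print(args.gpu, args.mpi)  # True False
```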
@sbailey I think my previous reply addresses most of your points. Sorry I didn't see it before I posted.
Regarding CPU/GPU tests, yes, I forgot to mention that they are all passing.
Setup gpu test environment:

```shell
ssh cori.nersc.gov
cd desi/gpu_specter
git checkout implement-gpu-extraction
module load esslurm python cuda
source activate desi-gpu
export PYTHONPATH=$(pwd)/py:$PYTHONPATH
salloc -C gpu -N 1 -t 30 -c 10 -G 1 -A m1759
```
Run the test suite:

```console
dmargala@cgpu11:~/desi/gpu_specter> srun python -m unittest --verbose gpu_specter.test.test_suite
test_basics (gpu_specter.test.test_ex2d_patch.TestEx2dPatch) ... ok
test_compare_specter (gpu_specter.test.test_ex2d_patch.TestEx2dPatch) ... skipped 'specter not available'
test_compare_xp_cpu (gpu_specter.test.test_ex2d_patch.TestEx2dPatch) ... ok
test_compare_xp_gpu (gpu_specter.test.test_ex2d_patch.TestEx2dPatch) ... ok
test_basics (gpu_specter.test.test_projection_matrix.TestProjectionMatrix) ... ok
test_compare_gpu (gpu_specter.test.test_projection_matrix.TestProjectionMatrix) ... ok
test_compare_specter (gpu_specter.test.test_projection_matrix.TestProjectionMatrix) ... skipped 'specter not available'
test_basics (gpu_specter.test.test_psfcoeff.TestPSFCoeff) ... ok
test_compare_gpu (gpu_specter.test.test_psfcoeff.TestPSFCoeff) ... ok
test_compare_specter (gpu_specter.test.test_psfcoeff.TestPSFCoeff) ... skipped 'specter not available'
test_gpu_basics (gpu_specter.test.test_psfcoeff.TestPSFCoeff) ... ok
test_basics (gpu_specter.test.test_spots.TestPSFSpots) ... ok
test_compare_gpu (gpu_specter.test.test_spots.TestPSFSpots) ... ok
test_compare_specter (gpu_specter.test.test_spots.TestPSFSpots) ... skipped 'specter not available'

----------------------------------------------------------------------
Ran 14 tests in 4.087s

OK (skipped=4)
```
I also ran the test suite in an environment with specter and confirmed that those tests still pass as well.
It looks like there is a `cupy.is_available()` function that returns `True` for me on a gpu node and `False` on a non-gpu node using the same conda environment with cupy. I'll update the try-except blocks in the test suite to use this.
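A minimal sketch of what the updated guard could look like, assuming `cupy` may or may not be installed (the `HAVE_GPU` flag name is made up):

```python
# Hedged sketch: the real try-except block in the test suite may differ.
try:
    import cupy
    # cupy can import successfully on a non-gpu node; is_available()
    # additionally checks that a CUDA device is actually usable.
    HAVE_GPU = cupy.is_available()
except ImportError:
    HAVE_GPU = False
```

Tests could then be decorated with `unittest.skipUnless(HAVE_GPU, 'gpu not available')`, mirroring how the specter comparison tests are skipped.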
And just to clarify, the presence of cupy/numba.cuda triggers additional tests. I implemented the gpu tests in the same fashion as the specter comparison tests.
@sbailey To answer your question about a non-cupy way to check for the GPU, the best thing I can think of is the command `nvidia-smi`. It's a little messy, but this will tell you if you have access to an (NVIDIA) gpu.
There could be something like:

```python
import subprocess
subprocess.run(["nvidia-smi"])
```
For a GPU the output is:

```
Fri May 29 10:59:24 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:1A:00.0 Off |                    0 |
| N/A   29C    P0    39W / 300W |      0MiB / 16160MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
```
And without a GPU the output is:

```
Traceback (most recent call last):
  File "is_there_gpu.py", line 3, in <module>
    subprocess.run(["nvidia-smi"])
  File "/usr/common/software/python/3.7-anaconda-2019.10/lib/python3.7/subprocess.py", line 472, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/common/software/python/3.7-anaconda-2019.10/lib/python3.7/subprocess.py", line 775, in __init__
    restore_signals, start_new_session)
  File "/usr/common/software/python/3.7-anaconda-2019.10/lib/python3.7/subprocess.py", line 1522, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'nvidia-smi': 'nvidia-smi'
```
It's not "clean" and would require a little work to parse, but it should return the presence of a GPU independent of framework (with the caveat that we have done `module load cuda`).
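Wrapped up as a helper that handles the missing-binary case from the traceback above (the `has_gpu` name is made up):

```python
import subprocess

def has_gpu():
    """Return True if nvidia-smi exists and exits cleanly (hypothetical helper)."""
    try:
        # capture output so the status table doesn't spam the console
        result = subprocess.run(["nvidia-smi"], capture_output=True)
    except FileNotFoundError:
        # nvidia-smi not on PATH, e.g. on a non-gpu node
        return False
    return result.returncode == 0

print(has_gpu())
```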
This PR adds unit tests that compare the cpu and gpu versions of the functions implemented in `gpu_specter.extract`. All of the gpu functions in `gpu_specter.extract.gpu` have been updated to match the interface and functionality of their counterparts in `gpu_specter.extract.cpu`. There are just a few functions left that do not have a gpu version, notably `gpu_specter.extract.cpu.ex2d_padded` and `numpy.polynomial.legendre.legval`.

The new `gpu_specter.extract.both.xp_ex2d_patch` is a version of `gpu_specter.extract.cpu.ex2d_patch` that is compatible with inputs from either `numpy.ndarray` or `cupy.ndarray`.
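A sketch of the numpy/cupy-agnostic pattern such a function can use: `cupy.get_array_module` returns the `numpy` or `cupy` module depending on the argument types, so one code path serves both. The linear solve here is just a stand-in for the real patch extraction math, not the actual `xp_ex2d_patch` internals:

```python
import numpy as np

try:
    import cupy
    get_array_module = cupy.get_array_module
except ImportError:
    # no cupy installed: everything is numpy
    def get_array_module(*arrays):
        return np

def xp_solve(A, b):
    # xp is the numpy or cupy module matching the input array types,
    # so the same code runs on cpu or gpu arrays
    xp = get_array_module(A, b)
    return xp.linalg.solve(A, b)

A = np.array([[2.0, 0.0], [0.0, 4.0]])
b = np.array([2.0, 8.0])
print(xp_solve(A, b))  # [1. 2.]
```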
The figure below compares the runtime for a single patch using the numba-optimized `gpu_specter.extract.cpu.ex2d_patch`, the new `xp_ex2d_patch` with `cupy.ndarray` inputs, and the new `xp_ex2d_patch` with `numpy.ndarray` inputs. The projection matrix shape for this patch corresponds to `A4.shape = (ny, nx, nspec, nwave) = (102, 46, 5, 50)`. The benchmark was performed in a jupyter notebook on a Cori shared gpu node.