cwpearson/stencil
A prototype MPI/CUDA stencil communication library
Boost Software License 1.0
10 stars, 3 forks
issues
#30 DevicePacker doesn't pack current data after swap when using cudaGraph API (cwpearson, opened, 3 years ago, 0 comments)
#29 Support non-pitched allocations (cwpearson, opened, 3 years ago, 1 comment)
#28 Create LICENSE (cwpearson, closed, 3 years ago, 0 comments)
#27 MPI_Isend in remote sender can be very delayed (cwpearson, closed, 4 years ago, 2 comments)
#26 USE_CUDA_GRAPH causes test_cuda failures on cuda/10.1.243 (cwpearson, closed, 4 years ago, 1 comment)
#25 STENCIL_SETUP_STATS should track different MPI and CUDA copy sizes (cwpearson, opened, 4 years ago, 0 comments)
#24 Handling CUDA_VISIBLE_DEVICES (cwpearson, opened, 4 years ago, 0 comments)
#23 STENCIL_SETUP_STATS should record same-GPU, same-node, and off-node comm totals (cwpearson, opened, 4 years ago, 0 comments)
#22 CMake 3.18.1 policy warning with unset CMAKE_CUDA_ARCHITECTURES (cwpearson, opened, 4 years ago, 0 comments)
#21 4-rank run under `cuda-memcheck` on css-host-yz-20 can fail with `Program hit cudaErrorPeerAccessAlreadyEnabled (error 704) due to "peer access is already enabled" on CUDA API call to cudaDeviceEnablePeerAccess.` (cwpearson, closed, 4 years ago, 1 comment)
#20 Self-exchange with `CudaMpi | CudaMpiColocated` happens with `CudaMpi` instead of `CudaMpiColocated` (cwpearson, opened, 4 years ago, 0 comments)
#19 Test fails if not started with cudaDeviceReset() (cwpearson, opened, 4 years ago, 0 comments)
#18 Maximum Z block size is 64 for all CUDAs so far (cwpearson, closed, 4 years ago, 1 comment)
#17 wrong comment (cwpearson, closed, 4 years ago, 1 comment)
#16 measure-buf-exchange can't run with `--smpiargs="-gpu"` (cwpearson, opened, 4 years ago, 0 comments)
#15 Call to a CUDA runtime function without setting device first (cwpearson, opened, 4 years ago, 0 comments)
#14 enum constant in boolean context (cwpearson, closed, 4 years ago, 1 comment)
#13 Periodic hang during test/test_cuda_mpi in NodeAware Placement (cwpearson, closed, 4 years ago, 1 comment)
#12 `invalid IPC handle` when using CUDA-Aware MPI (cwpearson, opened, 4 years ago, 2 comments)
#11 `CUDA Runtime Error(46): all CUDA-capable devices are busy or unavailable` on Summit (cwpearson, opened, 4 years ago, 4 comments)
#10 Use detected mpi, not the mpirun in $PATH (cwpearson, opened, 4 years ago, 0 comments)
#9 Wrongly assumes all nodes will have the same data placement (cwpearson, opened, 4 years ago, 0 comments)
#8 MPI_Comm_free called after MPI_Finalize (cwpearson, opened, 4 years ago, 0 comments)
#7 Wrongly assume that all ranks share the same GPU topology (cwpearson, opened, 4 years ago, 0 comments)
#6 Compute GPU distance matrix by actual bandwidth (cwpearson, opened, 4 years ago, 0 comments)
#5 Deadlock in test_cuda_mpi (cwpearson, closed, 4 years ago, 1 comment)
#4 interface to retrieve lower bound and upper bound of DistributedDomain (cwpearson, opened, 4 years ago, 0 comments)
#3 Handle domains that don't divide evenly into ranks and GPUs (cwpearson, closed, 4 years ago, 1 comment)
#2 Allow custom allocator for DistributedDomain and LocalDomain (cwpearson, opened, 4 years ago, 1 comment)
#1 Overlapping communication in different directions (cwpearson, closed, 4 years ago, 1 comment)