cwpearson / stencil

A prototype MPI/CUDA stencil communication library

Boost Software License 1.0

10 stars, 3 forks

issues

DevicePacker doesn't pack current data after swap when using cudaGraph API

#30 cwpearson opened 3 years ago
0
Support non-pitched allocations

#29 cwpearson opened 3 years ago
1
Create LICENSE

#28 cwpearson closed 3 years ago
0
MPI_Isend in remote sender can be very delayed

#27 cwpearson closed 4 years ago
2
USE_CUDA_GRAPH causes test_cuda failures on cuda/10.1.243

#26 cwpearson closed 4 years ago
1
STENCIL_SETUP_STATS should track different MPI and CUDA copy sizes

#25 cwpearson opened 4 years ago
0
Handling CUDA_VISIBLE_DEVICES

#24 cwpearson opened 4 years ago
0
STENCIL_SETUP_STATS should record same-GPU, same-node, and off-node comm totals

#23 cwpearson opened 4 years ago
0
CMake 3.18.1 policy warning with unset CMAKE_CUDA_ARCHITECTURES

#22 cwpearson opened 4 years ago
0
4-rank run under `cuda-memcheck` on css-host-yz-20 can fail with `Program hit cudaErrorPeerAccessAlreadyEnabled (error 704) due to "peer access is already enabled" on CUDA API call to cudaDeviceEnablePeerAccess.`

#21 cwpearson closed 4 years ago
1
Self-exchange with `CudaMpi | CudaMpiColocated` happens with `CudaMpi` instead of `CudaMpiColocated`

#20 cwpearson opened 4 years ago
0
Test fails if not started with cudaDeviceReset()

#19 cwpearson opened 4 years ago
0
Maximum Z block size is 64 for all CUDAs so far

#18 cwpearson closed 4 years ago
1
wrong comment

#17 cwpearson closed 4 years ago
1
measure-buf-exchange can't run with `--smpiargs="-gpu"`

#16 cwpearson opened 4 years ago
0
Call to a CUDA runtime function without setting device first

#15 cwpearson opened 4 years ago
0
enum constant in boolean context

#14 cwpearson closed 4 years ago
1
Periodic hang during test/test_cuda_mpi in NodeAware Placement

#13 cwpearson closed 4 years ago
1
`invalid IPC handle` when using CUDA-Aware MPI

#12 cwpearson opened 4 years ago
2
`CUDA Runtime Error(46): all CUDA-capable devices are busy or unavailable` on Summit

#11 cwpearson opened 4 years ago
4
Use detected mpi, not the mpirun in $PATH

#10 cwpearson opened 4 years ago
0
Wrongly assumes all nodes will have the same data placement

#9 cwpearson opened 4 years ago
0
MPI_Comm_free called after MPI_Finalize

#8 cwpearson opened 4 years ago
0
Wrongly assumes that all ranks share the same GPU topology

#7 cwpearson opened 4 years ago
0
Compute GPU distance matrix by actual bandwidth

#6 cwpearson opened 4 years ago
0
Deadlock in test_cuda_mpi

#5 cwpearson closed 4 years ago
1
interface to retrieve lower bound and upper bound of DistributedDomain

#4 cwpearson opened 4 years ago
0
Handle domains that don't divide evenly into ranks and GPUs

#3 cwpearson closed 4 years ago
1
Allow custom allocator for DistributedDomain and LocalDomain

#2 cwpearson opened 4 years ago
1
Overlapping communication in different directions

#1 cwpearson closed 4 years ago
1