Closed LutzCle closed 5 years ago
How would you devise a test case to catch these types of bugs?
Huh, you're right. Well, that's kind of embarrassing. Thanks for the pull request (and thanks for noticing this mistake).
The only way I can think of to check if we're allocating too much memory is to deliberately write to a memory location that should be out of bounds and make sure that that fails, but I'm not really sure that will work.
This change will introduce a slight change in behavior that will need to be fixed, however. In the current code, DeviceBuffer and related types will only allocate if their backing array is more than zero bytes. In your change, they'll allocate if the backing array is more than zero elements. It might not seem like there's a difference, but zero-sized types are a thing; you could (in principle) create a DeviceBuffer::zeroed::<()>(100)
, and because there are 100 elements this would incorrectly attempt to allocate a zero-length buffer.
Hi there!
I wanted to ask what would be needed to fix this issue.
Maybe, simply change the last bytes
for size
, say, in DeviceBuffer:
pub unsafe fn uninitialized(size: usize) -> CudaResult<Self> {
let bytes = size
.checked_mul(mem::size_of::<T>())
.ok_or(CudaError::InvalidMemoryAllocation)?;
let ptr = if bytes > 0 {
cuda_malloc(size)?
}
... but then there would be two multiplications, right?
1) Defining bytes
2) Allocating in cuda_malloc
Ok, so zero-sized types are now handled correctly. There is no need to calculate bytes
twice, it's enough to check if the type's size is greater than 0.
In addition, I've added tests to check for out-of-memory errors in DeviceBuffer
and UnifiedBuffer
.
However, LockedBuffer
only has a placeholder test. My testing method would have to query the available memory in the system, but I'm not aware of a platform-independent way to do that. I'm open to suggestions here.
Looks good, thanks for the pull request!
DeviceBuffer
,LockedBuffer
, andUnifiedBuffer
allocate several times more memory than requested. For example:allocates 8 GB instead of the requested 1 GB. This can result in an unexpected OutOfMemory error.
The bugs occur because the respective
*_malloc()
wrappers already account for the scaling from "count of T" to bytes. Thus, count is multiplied withsize_of::<T>()
twice.This fix simply removes the superfluous multiplications.