Open upsj opened 4 years ago
Coming back to this, since @thoasm and I discussed some related issues or questions about code style, I want to propose a few changes that might clean up our code base.
signed
. The rationale for this is simple: We do this for all (but dense
) kernels anyways and many popular style guides (Google and the C++ Core Guidelines) recommend avoiding unsigned types (except for bit patterns).
This would also mean making the size type for Array
signed and throwing an exception for size<0
.
Concerning overflows, this is of course a complex topic, but I would leave it at the following: In case you have a matrix with nnz >= MAX_INT, we would expect you to use int64 types, otherwise there is nothing we can do. Since the largest indices that occur in our kernels are limited by nnz (neglecting "cutoff" thread IDs for large block sizes and counts and/or the use of a whole subwarp per row/non-zero, which we could solve by using the y and z indices), I see no reason why we should expect our code to overflow. And if it does (and the numbers are not extraordinarily big), a negative number is a better indicator for overflown values than a small value where you would not expect it.signed
values in case they don't do that already. This includes the CUDA/HIP cooperative groups, intrinsics, but also the threadIdx and blockIdx/Dim variables, which we could wrap either in a get_thread_id<IndexType>
function or use our cooperative_groups grid group everywhere.static_cast
and avoid functional-style casts for integers everywhere. This is also a recommendation from the C++ Core Guidelines, ES.48/49)Comments?
I agree, making everything a signed
value will definitively remove the need for a lot of casts. For sizes, I think you are correct that we can give up the "extra bit" we would have with unsigned
and just move to 64 bit a bit earlier.
However, even if we change the size_type
to int64
, I am pretty sure it is considered interface breaking, so we have to wait for the 2.0
version to make that change.
Yes, that would be a long-term goal. Maybe we can collect other changes/issues that this switch would cause. Until then, I'll try to use static_cast where possible
Changing to signed
is also easy to write the code in my experience.
In some cases, I just want to run the size - 1
or size - m
loops. for i = 0: size-m
is failed when size is less than m.
For nnz >= MAX_INT, (the num stored elements) Add IndexType as Array template varaible to check whether the number of stored elements does not overflow such that we do not need to check it in each class. ( Maybe Dense also needs it)
After working on #464, I now have a slightly clearer picture of what the impact of the proposed changes would be, so my updated proposal would be:
size_type
signed and only use unsigned types for the internal allocation sizes. This could also provide better optimization opportunities (especially for reference and omp, where we use this types excessively), since the compiler can then assume that no overflows occur.size_type
for all matrix dimensions, even in kernel parameters. It seems to me that we only use these size parameters for bounds checking like if (tidx >= num_rows) return;
, where it would not matter if tidx is a smaller IndexType
. This would allow us to get rid of all num_rows = static_cast<IndexType>(...)
and similar statements, since there is no more template parameter deduction mismatch between size_type
and IndexType
.IndexType
(or size_type
for dense) for indexing within kernels, where the actual computations take place. There are of course places where we need to be careful to avoid overflows (multiplications etc.), but it looks to me like these mostly occur in the dense kernels anyways. This policy would probably allow us to get rid of all index casts (or hide them in calls such as get_thread_id_flat
)Comments?
It seems to me Ginkgo is sometimes integer types inconsistently. This includes both 32/64 bit as well as
signed
/unsigned
. Examples:dim
returns asize_type
for all dimensions, even though we internally use theIndexType
template parameter in almost all kernels.size_type
as loop variables.size_type
My opinion: I would consistently use signed types everywhere, because we use them for indexing anyways and overflows on signed types are way easier to recognize/catch when debugging. Only for allocation sizes and/or launch bounds, it might make sense to use
size_type
. But that is often a very fundamental debate, so I would like to start it here.On the question what type
dim
should use, I am conflicted, since it would probably be annoying having to typedim<IndexType>
everywhere, but in my opinion, it should at least return a signed type.