Mask inconsistencies - Githubissues

timfelle commented 1 month ago

There are inconsistencies in how masks are dealt with in different parts of the code. We should agree on a single approach and align to that.

Some parts of the code uses an array of integers with length m to define a mask (an example is the point_zone_t. Other parts of the code uses an array of integers with the size prepended, so it have length m+1.

Some masked math operations assume one, others assume the other.

@MartinKarp I know you were a strong proponent of prepending the size to the arrays, can you describe why that was useful?

MartinKarp commented 1 month ago

I think your masks in points zones without the 0 index is the way to go! Victor seems to have brought this up on Zulip on the 7th of May this year and I think there was more or less consensus on scrapping the size of the mask in the 0 index would be the way to go.

What I remember I was more sceptical towoard using c indexing for the mask_d.

timfelle commented 1 month ago

What do you suggest for the device masks then?

MartinKarp commented 1 month ago

Thinking about it I thought something like this I guess. No mask has the size in the beginning.

Mask (fortran indexed)
c_mask_d (c indexed mask on device)

Optionally

mask_d (fortran indexed mask on device)
c_mask (c indexed mask on host)

I am mainly opposed to using the subscore _d for a variable x when it is not the same data in x and x_d. I dont think of it in terms of functionality but as values stored kinda.

How would you like the masks to look optimally?

timfelle commented 1 month ago

Well I focus on the usage of the masks, so when i run through any masked loop i expect the mask to reflact that it is a mask.

for i in mask
    a(i) = whatever,

regardless of language. And we have so far used the _d suffix to indicate, "the same but on device", such that a user will just need to append the _d to their variables and then the functionality of what they are doing is the same.

I think it will cause more confusion if masks are to be used differently than field_t, vector_t, matrix_t, dofmap_t, space_t and essentially all other types in Neko.

MartinKarp commented 1 month ago

but c_mask would fix this or?

I do not think of it as you I guess. I think more like

for i in size(mask):
     mask(i) = mask_d(i)

I do not think it mimics functionality as much as it mirrors the data to the device. Is this different from the other types?

timfelle commented 1 month ago

Yes it is different from the other data points as v%x(1) is not the same as v%x_d(1) since the array on cpu is stored as a 1-index and the device 0-index on the device. Meaning for all the other types v%x(1) == v%x_d(0).

MartinKarp commented 1 month ago

Got me there ;)

By data I guess I really mean I expect the data to be the same, regardless if the programming language I use is 0 or 1 indexed, _d simply denotes whether it is on device or not.

So it is really a question of functionality or data I think.

timfelle commented 1 month ago

Well it is all a question of naming. The direct copy of mask would not be useful to have. So it is a question of wether to call the device mask c_ or _d.

MartinKarp commented 1 month ago

I would be all for c_mask_d and scrapping mask_d

timfelle commented 1 month ago

Why do you want both?

MartinKarp commented 1 month ago

because _d denotes it is on device and c_ denotes it is c (0) indexed.

timfelle commented 1 month ago

But the _d would be redundant as c_mask would not exist, and the _d in your definition have been used to indicate a device copy of the host variable.

timofeymukha commented 1 month ago

I am as strong proponent of not prepending the size in the 0 index and generally using non-standard indexing in Fortran, because things become a mess once you need to pass that array as a dummy argument, or access it via a pointer in another type. By default, it will be reindexed to start from 1 again, so one has to really take care to either avoid that or take it into account.

MartinKarp commented 1 month ago

But the _d would be redundant as c_mask would not exist, and the _d in your definition have been used to indicate a device copy of the host variable.

I guess I in principle would want both a c_mask and a mask_d as well. I could see scenarios where one would use a programming model with 1 indexing on the gpu or 0 indexing on the cpu in the future. It did not include them now mostly because for the current state of the code it would just use a bit of memory.

timofeymukha commented 1 month ago

But one can't truly have a 0 indexed array in fortran, right? It is essentially always a local "view", and once you pass the array elsewhere it is 1-based. Or am I wrong? If I am not, I would say one should create a local 0-based view if one really wants to but the default layout in the definition inside a type should be 1 based.

timfelle commented 1 month ago

But one can't truly have a 0 indexed array in fortran, right? It is essentially always a local "view", and once you pass the array elsewhere it is 1-based. Or am I wrong? If I am not, I would say one should create a local 0-based view if one really wants to but the default layout in the definition inside a type should be 1 based.

His point is that we might write C, C++ or some other code, not running on a device but still call it from Neko. And that CUDA Fortran does exists, which is a 1-index device language.

timofeymukha commented 1 month ago

I see. I guess having all sorts of masks can be useful, yeah.

timfelle commented 1 month ago

It seems a concencus is made:

Masks should be defined as follows, in order to explicitly state their location and purpose.

mask_0, host array representing a 0-indexed mask.
mask_1, host array representing a 1-indexed mask.
mask_0_d, device array representing a 0-indexed mask.
mask_1_d, device array representing a 1-indexed mask.

ExtremeFLOW / neko

Mask inconsistencies #1549