Open mppf opened 5 years ago
@benharsh - not urgent but I'm curious about your reaction to this proposal
I'm still thinking about this, but if we say "c_ptr inside of a c_ptr is narrow", what if those inner pointers are actually remote? Seems like we'd be back in the same situation.
I'm still thinking about this, but if we say "c_ptr inside of a c_ptr is narrow", what if those inner pointers are actually remote? Seems like we'd be back in the same situation.
Well, they wouldn't actually be remote, because we wouldn't allow them to be.
For example, if you had
var ptr: c_ptr(int) = ...;
var ptrPtr:c_ptr(c_ptr(int)) = ...;
ptrPtr[0] = ptr;
Then I'm expecting that in this line:
ptrPtr[0] = ptr;
the compiler would have a possibly wide RHS and a narrow LHS, and so only set the local address portion of the LHS. (It could potentially additionally assert that the RHS is local).
What if we have an array-of-arrays situation and the user wants to access the data on another locale:
config const n = 10, m = 10;
var c_data : c_ptr(c_ptr(int));
// allocate c_data as n*m array
on Locales[numLocales-1] {
for (i,j) in zip(1..n, 1..m) do
c_data[i][j] = i * j;
}
Edit: or are you saying that we would just not support that, either through documentation or a compiler error/warning?
or are you saying that we would just not support that, either through documentation or a compiler error/warning?
Yes. We have Chapel arrays for such things.
I'd like to have something we could use in string
that can be wide, and I want it to be user-facing (so _ddata
isn't great). We could get c_ptr
to do it (which is what I was advocating for here) but the obvious alternative is to add some other thing, like just ptr
. (But then I think I'd be annoyed when explaining why c_ptr
was always local but ptr
isn't, etc). I also don't like c_wide_ptr
because it implies it is always wide, but maybe there's a reasonable path there too.
I guess I lean towards exposing _ddata under another name (not sure about 'ptr'). I feel like users will want that eventually anyways, so having 'ptr' and a possibly-wide c_ptr seems weird.
If we keep c_ptr as always-local, that might allow us to emit better warnings at compile-time.
I guess I lean towards exposing _ddata under another name
Can you suggest any names that we could use?
@benharsh - re. names for a wide c_ptr aka _ddata
, we might use
ptr
global_ptr
global_c_ptr
wide_c_ptr
Any other ideas?
I think the main questions I have about it are:
I think the answers to these are "no" and "no", which makes me prefer global_ptr
. I think ptr
is a little to easy for this low-level thing.
Oh yeah, and then maybe it would be globalPtr
for camelCase reasons.
@LouisJenkinsCS - I take you to mean that you think the wide version of c_ptr
should be called ptr
?
I think Ptr
should be a generalization of a pointer that is not specific to C pointers or wide pointers. As in, you do not need an explicit _ddata
or c_ptr
, but just a Ptr
which internally can handle locality. It would be a way of saying "I don't care about locality, just that I have a pointer to some data. If it needs to be coerced into a c_ptr
and it happens to be remote, then copy the data somewhere locally first. If I access it from another locale, handle performing a remote GET for me. If I need to store into the pointer, perform a remote PUT for me. Etc."
@LouisJenkinsCS - I think you are saying that ptr
would be wide if needed and narrow if needed? Rather than always wide or narrow? Or are you just saying that those operations could work (which could be done if ptr
were always wide)?
Or are you just saying that those operations could work (which could be done if ptr were always wide)?
So long as Ptr
can be coerced into a narrow c_ptr
, then yes this is exactly what I'd want. However I'd like for it to be narrow if the compiler can determine statically that it is never used in a distributed context, or is in a local
block.
If it needs to be coerced into a c_ptr and it happens to be remote, then copy the data somewhere locally first.
Did you mean the compiler would do this automatically? Typically C-ish pointers don't include a size, so how could it know how much to copy? Besides that, making copies can cause confusing behavior if the original is modified...
Then how would you propose the c_ptr
to be used for 'C Interoperability'? If its meant to be used from Chapel alone, I.E to directly chpl_comm_put
/chpl_comm_get
on, then I'm all for it but I think it should be a different kind of pointer, I.E a c_wide_ptr
(as you called it), or just ptr
.
C code could include header files defining a wide void*
and a wide c_ptr could coerce to that type.
Even though it could be used for interoperability with C code meant to work with Chapel, I wouldn't expect it to be used with existing C libraries.
I think you are agreeing with @benharsh that instead of making c_ptr
sometimes wide, we should introduce a new type like c_wide_ptr
or something. We just need to reach some sort of agreement on the name.
As I described in https://github.com/chapel-lang/chapel/issues/12490#issuecomment-480505878 I'm leaning towards calling it a "global pointer". I think ptr
might be too short/easy (we want it to be a low-level feature not something considered essential to Chapel programming). I think using c
in the name runs into problems with the issue that it doesn't necessarily make sense in C interoperability.
This is related to #8680 and #11115 but asks a particular design question.
Consider this code in String.chpl:
https://github.com/chapel-lang/chapel/blob/2831d4bb0893e1e281bec2c8972c340f77d2f341/modules/internal/String.chpl#L651-L657
This code has the unfortunate property of copying the entire string in the event that it is not local, in order to access 1 byte.
Why does it do that? The string data type uses
c_ptr
to implement the buffer. Thec_ptr
type is not currently ever wide and so the string additionally includes a field indicating the locale owning thec_ptr
.This could be resolved if the string type used
_ddata
which is effectively a version ofc_ptr
that can be wide. However there is some desire for the string data type to use user-facing features where possible and that begs the question of if we can create a user-facing form of_ddata
?But why do we have both
c_ptr
and_ddata
?c_ptr
was created for C interoperability (and extern blocks). In that setting, we needed something that could represent nested pointer structures. For example, if we pass ac_ptr(c_ptr(c_int))
to a C function, the C function is expecting anint**
argument. While the compiler can remove wideness for the outer-most argument to the function, it wouldn't be able to do that for the inner argument.Here I propose that we make
c_ptr
be possibly wide. The use of it in thestring
data type in particular would be wide. When code-generating passing a widec_ptr
to an extern function, the compiler would pass only the address part (and possibly assert that it is local).But how to address the
c_ptr(c_ptr(t))
problem? Instead of making allc_ptr
narrow, only make certainc_ptr
used in a pattern such as this narrow. I believe we can adjust insertWideReferences to do this.c_ptr
in an extern record is narrowc_ptr
inside of ac_ptr
is narrow