Open Seelengrab opened 1 year ago
While we're at it, go ahead and try for the supertype of a data type LLVMPtr. Eltype doesn't propagate to supertype
Yes, LLVMPtr
also likely shouldn't be <: Ref{T}
. @gbaraldi, is that type somewhat ready for use? It doesn't seem to have any methods associated with it:
julia> Base.Core.LLVMPtr |> methodswith
0-element Vector{Method}
It's mostly used in the LLVM.jl ecossystem
Fixing this would probably be a good time to address some missing features in our memory system. For example, it might be helpful to have something like ImmutablePtr
so that it's safe to provide a pointer for collections that shouldn't change values once set.
About a month ago I spent a weekend going through Rust's internals and playing around with some concepts and implementations for pointer types in Julia. I've been letting the ideas incubate for a while and refining it down to some basic principles that are more Julia-like.
I think we all are on the same page that we shouldn't add getindex
functionality to Ptr
because there's no guarantee that we have a valid address. However, if we force construction of a specific pointer type to ensure the address is non-null directly be the user then we could have some more reasonable semantics.
abstract type AbstractPtr{T} <: Ref{T} end
if Core.sizeof(Int) == 8
primitive type Ptr{T} <: AbstractPtr{T} 8 end
else
primitive type Ptr{T} <: AbstractPtr{T} 4 end
end
struct NonNull{T} <: AbstractPtr{T}
ptr::Ptr{T}
function unsafe_convert(::Type{NonNull{T}}, ptr) where {T}
new{T}(ptr)
end
end
struct BoxPtr{T} <: AbstractPtr{T}
ptr::NonNull{Ptr{Cvoid}}
function unsafe_convert(::Type{BoxPtr{T}}, ptr) where {T}
new{T}(unsafe_convert(NonNull{Ptr{Cvoid}}, ptr))
end
end
Base.getindex(x::NonNull) = unsafe_load(x.ptr)
function Base.getindex(x::BoxPtr)
rawptr = x.ptr[]
rawptr === C_NULL && throw(UndefRefError())
ccall(:jl_value_ptr, Any, (Ptr{Cvoid},), rawptr)
end
Without doing more complex compiler passes, I think this is about as much safety as rust provides for its pointers. There's no way to guarantee outside of site of construction of NonNull
that you have a valid pointer since any made up address can be bitcast
to Ptr
. This is why this implementation has NonNull
here is a not primitive type, because the user can only create it through an unsafe_load
or unsafe_convert
. This ensures users have some clarity going in that they are responsible for ensuring that they are passing a valid address.
This doesn't ensure that GC doesn't happen, so I don't know if this goes far enough to ensure safety though. If not, then it might still be worth having these just have an unsafe_load
type mechanism.
Guaranteeing that a pointer is non-null is not enough though - you can have invalid pointers of any value. I don't think having a type specific for saying "this is non-null" is enough for that kind of semantic, because the name implies nothing at all about the pointer being valid or not, or who's responsible for that invariant.
I vastly prefer the unsafe_
API to this.
if Core.sizeof(Int) == 8
This is a bad query; it restricts us to only being able to represent pointers on 64 and 32 bit systems, not to mention systems where pointersize and integer word-size don't match. I don't think we should leak that detail here - the pointersize ought to be something the runtime/compiler targetting a platform decides, not the host system julia happens to run on.
Guaranteeing that a pointer is non-null is not enough though - you can have invalid pointers of any value. I don't think having a type specific for saying "this is non-null" is enough for that kind of semantic, because the name implies nothing at all about the pointer being valid or not, or who's responsible for that invariant.
I vastly prefer the unsafe_ API to this.
Yeah, that's why I said we might need to still stick with that syntax. However, it does provide a bit more safety and structure even within the unsafe_load
unsafe_store!
methods. I don't think there's actually any way to make guarantees that the pointer is valid without additional context (or at least that's what I've been getting out of the Rust docs).
This is a bad query; it restricts us to only being able to represent pointers on 64 and 32 bit systems, not to mention systems where pointersize and integer word-size don't match. I don't think we should leak that detail here - the pointersize ought to be something the runtime/compiler targetting a platform decides, not the host system julia happens to run on.
This is just a small blurb for reference, imitating the annotations in "base/boot.jl". This would need to be defined in C.
I don't think there's actually any way to make guarantees that the pointer is valid without additional context (or at least that's what I've been getting out of the Rust docs).
Yes; if you have some form of convert(Ptr{T}, ::Int)
, you either need source-tracking of that Int
to make sure it only comes from a previous convert(Int, ::Ptr{T})
of the same T
, or you have potentially invalid pointers hanging around. You can gather much the same thing from LLVM documentation about ptrtoint
and inttoptr
.
There's some added difficulty in that some platforms need to have some (somewhat) arbitrary pointers defined, because that's how their registers are exposed (common on microcontrollers, but whether we ever want to actually support that officially is another matter). A way out for that would of course be to have the abstraction "register" defined as a per-platform builtin, but that's starting to get a bit off-topic for this.
You can gather much the same thing from LLVM documentation about
ptrtoint
andinttoptr
.
I actually got interested more in the pointer stuff here after looking more into "src/codegen.cpp", but then following some of the conversations in LLVM it sounds like Rust was really pushing that stuff and potentially prototyping some behaviors to move into the compiler.
One thing I'm not sure about is the overhead here. I've gleaned from some ongoing development conversations for LLVM that the large amount of bit casts are a large source of unnecessary overhead in system images. I think we can mitigate some of the compilation overhead but enforcing that specialization on loading these types is only done on NonNull
where there's a known byte type. For boxed types we don't need that.
Also, I don't think we currently support anything but 32 and 64 bit pointers. I'm not sure if that does need to be managed at this point to avoid a poor implementation.
Also, I don't think we currently support anything but 32 and 64 bit pointers.
With GPUCompiler you can generate code for devices different from where Julia runs.
Also, I don't think we currently support anything but 32 and 64 bit pointers.
With GPUCompiler you can generate code for devices different from where Julia runs.
This is super helpful to know. It also makes me wonder if we should have a pointer that is more flexible to that less intervention at the compiler level is necessary per device.
struct RawPtr{T,A<:Union{UInt8, UInt16, UInt32, UInt64, UInt128}}
address::A
end
With GPUCompiler you can generate code for devices different from where Julia runs.
With caveats, asterisks and all that stuff :') Cross compilation is just hard :shrug:
It also makes me wonder if we should have a pointer that is more flexible to that less intervention at the compiler level is necessary per device.
As I understand it, that's what LLVMPtr
is - it's representing a pointer on the LLVM level.
Before you go and abstract a pointer to just an address though, let me link you a few things so you have some context about where "pointers" as a concept in compiler land aare (likely) headed:
There's likely more up-to-date information on provenance than these, but that should get you started. The references in there are good too.
It just keeps going :)
I've stumbled across this unfortunate inconsistency:
In essence, we document that
Ref{T}
is guaranteed to point to valid julia allocated memory, which is not at all true ofPtr{T}
, butPtr{T} <: Ref{T}
still holds. There are other problems, likePtr
not havinggetindex
defined, as theRef
docstring claims:So at minimum,
Ptr
doesn't fulfill the API guaranteed byRef{T}
per the docstring.We could change the docstring of
Ref
toRef(x)
, and mention that theRef
abstract type does not make such a guarantee. Another option would be to change the name of theRef
type to something else (AbstractRef
?) and only keep theRef
function/type around as a shorthand forRefValue
, which is what it does now anyway (see@edit Ref(1)
). To keep the API the same, it would change from this:Ref{T}
Ptr{T}
RefValue{T}
RefArray{T}
to this:
AbstractRef{T}
Ptr{T}
Ref{T}
RefValue{T}
RefArray{T}
The issue with that is that we don't know whether someone relies on
Ptr{T} <: Ref{T}
somewhere (which they shouldn't, I think).