JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.77k stars 5.49k forks source link

`a::Array{T, N>1}` assumes that `length(a) = length(a.ref.mem)` #52048

Closed MasonProtter closed 1 year ago

MasonProtter commented 1 year ago

Compare

julia> a = ones(2, 2)
2×2 Matrix{Float64}:
 1.0  1.0
 1.0  1.0

julia> setfield!(a, :ref, MemoryRef(Memory{Float64}(undef, 9) .= 1));
       a
2×2 Matrix{Float64}:
 1.0  1.0
 1.0  1.0

julia> size(a)
(2, 2)

julia> length(a)
9

versus

julia> v = ones(2)
2-element Vector{Float64}:
 1.0
 1.0

julia> setfield!(v, :ref, MemoryRef(Memory{Float64}(undef, 3) .= 1));
       v
2-element Vector{Float64}:
 1.0
 1.0

julia> length(v)
2

julia> length(v.ref.mem)
3

This is problematic because I want to add a wrap function which creates an Array out of a Memory or MemoryRef, and ideally we'd want to allow the use of oversized Memory to back multidimensional arrays i.e. one could imagine a bump allocator like

let mem = Memory{Int}(undef, 1000000)
    offset = 0
    a1 = wrap(Array, MemoryRef(mem, offset), (5, 5))
    offset += length(a1)
    f(a1)    

    a2 = wrap(Array, MemoryRef(mem, offset), (4, 3))
    offset += length(a2)
    g(a2)

    ...
end

Some options going forward are

  1. do nothing and continue assuming that length(a) = length(a.ref.mem)
  2. calculate length via prod(a.dims) (some benchmarking from @LilithHafner suggests this is fine when ndims < 5, but would cause performance problems when people assume length is free for high dimensional arrays.
  3. add a length field to Array. This could be seen as ugly but it has one advantage that it means we don't need to special case the constness of the size field anymore to be const for ndims > 1 and mutable for ndims = 1.
vtjnash commented 1 year ago

Accessing or mutating the fields of an object is considered to be be non-public API, and does not have a defined semantics, so it does not need to have any defined behavior either

MasonProtter commented 1 year ago

@vtjnash I'm aware that it's not meant to be mutated, that mutation was just for a quick MWE. The reason it's a problem is https://github.com/JuliaLang/julia/pull/52049 where one may want to wrap a Memory that's oversized for the array they're making.

oscardssmith commented 1 year ago

Note that very similar things will happen if you change one of the arrays backing a Dict to have a non-power of 2 size.