JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
44.99k stars 5.42k forks source link

Subtle change in Array behavior w.r.t finalizers after Memory #54985

Open vchuravy opened 1 week ago

vchuravy commented 1 week ago

This probably needs documentation. I just came across someone who wrote the following code (paraphrased):


ptr = Base.reinterpret(Ptr{Float64}, Libc.malloc(64*sizeof(Float64)))
var1 = Base.unsafe_wrap(Array, ptr, 64, own=false)

# I believe in 1.11 one would need to attach the finalizer to `var1.ref.mem`
finalizer(var1) do var
    @async println("Finalizer ran")
    Libc.free(pointer(var))
end

var2 = reshape(var1, 1, 64) # typeof(var2) == Matrix
var1 = nothing
GC.gc() # On 1.10 finalizer didn't run; on 1.11 finalizer did run

# Is access to var2 legal?
var2 = nothing
GC.gc() # finalizer now ran

Before the change to Memory the lifetime of the memory buffer was tight to Array so the finalizer there had the intended effect. Now with Memory we are attaching the finalizer to the "wrong" object. I did not see this documented as a change, but it probably ought to be.

vtjnash commented 1 week ago

I am tempted to add a deprecation for finalizers attached to Array, similar to other immutable objects that disallow finalizers

KristofferC commented 1 week ago

Ref https://github.com/JuliaLang/julia/issues/54128#issuecomment-2069846970 where the map stdlib hit this.

Joroks commented 1 week ago

It seems like using a finalizer on a memory object to free memory causes the GC to perform significantly worse compared to registering the finalizer directly on the Array - both in 1.10 where this would still be the correct approach and in julia 1.11 where this could cause segfaults. I've tested this somewhat crudely by setting up a loop that repeatedly allocates memory through a external C-Library (with 10^8 iterations), attaching the free method of that library to the finalizer, and then watching the memory consumption. (Ref the discussion on cesmix-mit/LAMMPS.jl#51 for further details)

when attaching the finalizer on the array directly, my memory consumption looks like this (both in 1.10 and 1.11):

344492495-678446be-ba2a-485e-baa1-ca99d84ebe2f

When registering the finalizer on the memory obect:

344575806-1301457a-7f91-486c-8434-24d2358d4b9a

and I had to call GC.gc() twice to fully free the allocated memory:

344576353-b0348dba-5925-4632-9d03-1fd9c908fe59

I'm running on Windows 10 with an AMD prozessor