JuliaGeo / LibGEOS.jl

Julia package for manipulation and analysis of planar geometric objects
MIT License
72 stars 24 forks source link

Crashes when using @threads with intersection functions #91

Open nicholasbalasus opened 2 years ago

nicholasbalasus commented 2 years ago

Currently, I have two arrays of type Array{Polygon,1}, called grid_1 and grid_2. I would like to find the intersecting area of every point of grid_1 with every point of grid_2 (i.e., yield a matrix with dimensions length(grid_1) by length(grid_2). I am wondering if there is anyway that may be faster that running a two for loops as follows?

polygon_intersection = Matrix{Float64}(undef,length(grid_1),length(grid_2));
for i = 1:length(grid_1)
    for j = 1:length(grid_2)
        polygon_intersection[i,j] = area(intersection(grid_1[i],grid_2[j]))
    end
end

I was thinking of using something of the form area.(intersection.(grid_1[i],grid_2)) and then only have to loop over one dimension, but that does not appear to be implemented (and I am not sure these array calculations would even be faster than a for loop). Any suggestions are appreciated!

visr commented 2 years ago

In julia there is generally no performance penalty on loops, so I wouldn't worry about getting rid of them.

nicholasbalasus commented 2 years ago

Got it, thanks!

jaakkor2 commented 2 years ago

You could add Threads.@threads in front of the first for-loop.

evetion commented 2 years ago

I'm not sure if libGEOS is thread safe, so be careful with using threads.

jaakkor2 commented 2 years ago

From https://trac.osgeo.org/geos/wiki/RFC3

Function names in the new API will be updated with an _r, as is the familiar C standard for reentrant/thread safe versions.

GEOSIntersection_r is used here https://github.com/JuliaGeo/LibGEOS.jl/blob/v0.6.8/src/geos_functions.jl#L423-L429

EDIT: Be careful with using threads.

nicholasbalasus commented 2 years ago

I am not sure if this is worth opening another issue, but I am having some troubles with these functions being thread safe I believe. When I run the following MWE code (written for this question)

using LibGEOS

grid_1 = Dict();
# c = center, ll = lower left corner, ul = upper left corner
# ur = upper right corner, lr = lower right corner
grid_1["x_c"] = collect(0:0.5:100);
grid_1["y_c"] = collect(0:0.5:100);
grid_1["x_ll"] = grid_1["x_c"] .- (0.5/2);
grid_1["x_ul"] = grid_1["x_c"] .- (0.5/2);
grid_1["x_ur"] = grid_1["x_c"] .+ (0.5/2);
grid_1["x_lr"] = grid_1["x_c"] .+ (0.5/2);
grid_1["y_ll"] = grid_1["y_c"] .- (0.5/2);
grid_1["y_ul"] = grid_1["y_c"] .+ (0.5/2);
grid_1["y_ur"] = grid_1["y_c"] .+ (0.5/2);
grid_1["y_lr"] = grid_1["y_c"] .- (0.5/2);
grid_1["value"] = rand(length(grid_1["x_c"]));
grid_1["uncertainty"] = rand(length(grid_1["x_c"]));

grid_2 = Dict();
grid_2["x_c"] = collect(1:0.5:101);
grid_2["y_c"] = collect(1:0.5:101);
grid_2["x_ll"] = grid_2["x_c"] .- (0.5/2);
grid_2["x_ul"] = grid_2["x_c"] .- (0.5/2);
grid_2["x_ur"] = grid_2["x_c"] .+ (0.5/2);
grid_2["x_lr"] = grid_2["x_c"] .+ (0.5/2);
grid_2["y_ll"] = grid_2["y_c"] .- (0.5/2);
grid_2["y_ul"] = grid_2["y_c"] .+ (0.5/2);
grid_2["y_ur"] = grid_2["y_c"] .+ (0.5/2);
grid_2["y_lr"] = grid_2["y_c"] .- (0.5/2);
grid_2["value"] = rand(length(grid_2["x_c"]));
grid_2["uncertainty"] = rand(length(grid_2["x_c"]));

for grid in (grid_1,grid_2)
    grid["polygon"] = Vector{Polygon}(undef,length(grid["x_c"]));

    # makes polygons from the pixcorners for both grids
    for idx = 1:length(grid["polygon"])
        grid["polygon"][idx] = Polygon([[[grid["x_ll"][idx],grid["y_ll"][idx]],[grid["x_ul"][idx],grid["y_ul"][idx]],[grid["x_ur"][idx],grid["y_ur"][idx]],[grid["x_lr"][idx],grid["y_lr"][idx]],[grid["x_ll"][idx],grid["y_ll"][idx]]]]);
    end

    grid["polygon_area"] = area.(grid["polygon"]);
end

# averages grid_1 value to grid_2 
function F_l2_to_l2_regrid(grid_1,grid_2)
    grid_2["grid_1_value"] = Vector{Float64}(undef,length(grid_2["polygon"]));

    Threads.@threads for j = 1:length(grid_2["polygon"])
        A = Vector{Float64}(undef,length(grid_1["polygon"]));
        B = Vector{Float64}(undef,length(grid_1["polygon"]));

        p = 1;
        polygon_intersection = Vector{Float64}(undef,length(grid_1["polygon"]));
        for i = 1:length(grid_1["polygon"])
            polygon_intersection[i] = area(intersection(grid_1["polygon"][i],grid_2["polygon"][j]));
            A[i] = grid_1["value"][i]*polygon_intersection[i]/(grid_1["uncertainty"][i]^p)/grid_2["polygon_area"][j];
            B[i] = polygon_intersection[i]/(grid_1["uncertainty"][i]^p)/grid_2["polygon_area"][j];
        end

        grid_2["grid_1_value"][j] = sum(A)/sum(B);

    end

end

F_l2_to_l2_regrid(grid_1,grid_2)

I get a very long error that kills my Julia session, specifically:

*** Error in `julia': double free or corruption (fasttop): 0x00002b760c0cc660 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81329)[0x2b75f0fa0329]
/n/home06/nbalasus/.julia/artifacts/514b2d8cdf634f84bc9d53a25f76ec2826d964ce/lib/libgeos-3.9.0.so(_ZNK4geos4geom8Geometry19getEnvelopeInternalEv+0x48)[0x2b763b6cd988]
/n/home06/nbalasus/.julia/artifacts/514b2d8cdf634f84bc9d53a25f76ec2826d964ce/lib/libgeos-3.9.0.so(_ZNK4geos4geom7Polygon23computeEnvelopeInternalEv+0x19)[0x2b763b6ddd59]
/n/home06/nbalasus/.julia/artifacts/514b2d8cdf634f84bc9d53a25f76ec2826d964ce/lib/libgeos-3.9.0.so(_ZNK4geos4geom8Geometry19getEnvelopeInternalEv+0x2a)[0x2b763b6cd96a]
/n/home06/nbalasus/.julia/artifacts/514b2d8cdf634f84bc9d53a25f76ec2826d964ce/lib/libgeos-3.9.0.so(_ZN4geos9operation9overlayng11OverlayUtil13isEnvDisjointEPKNS_4geom8GeometryES6_PKNS3_14PrecisionModelE+0x92)[0x2b763b767892]
/n/home06/nbalasus/.julia/artifacts/514b2d8cdf634f84bc9d53a25f76ec2826d964ce/lib/libgeos-3.9.0.so(_ZN4geos9operation9overlayng9OverlayNG9getResultEv+0x46)[0x2b763b764b26]
/n/home06/nbalasus/.julia/artifacts/514b2d8cdf634f84bc9d53a25f76ec2826d964ce/lib/libgeos-3.9.0.so(_ZN4geos9operation9overlayng9OverlayNG7overlayEPKNS_4geom8GeometryES6_iPKNS3_14PrecisionModelE+0x65)[0x2b763b764cd5]
/n/home06/nbalasus/.julia/artifacts/514b2d8cdf634f84bc9d53a25f76ec2826d964ce/lib/libgeos-3.9.0.so(_ZN4geos9operation9overlayng15OverlayNGRobust7OverlayEPKNS_4geom8GeometryES6_i+0xa1)[0x2b763b765e91]
/n/home06/nbalasus/.julia/artifacts/514b2d8cdf634f84bc9d53a25f76ec2826d964ce/lib/libgeos-3.9.0.so(_ZN4geos4geom16HeuristicOverlayEPKNS0_8GeometryES3_i+0x235)[0x2b763b6d6a05]
/n/home06/nbalasus/.julia/artifacts/514b2d8cdf634f84bc9d53a25f76ec2826d964ce/lib/libgeos-3.9.0.so(_ZNK4geos4geom8Geometry12intersectionEPKS1_+0x64)[0x2b763b6cec24]
/n/home06/nbalasus/.julia/artifacts/514b2d8cdf634f84bc9d53a25f76ec2826d964ce/lib/libgeos_c.so(GEOSIntersection_r+0x27)[0x2b763b3c96f7]
[0x2b762c695582]
[0x2b762c695d0e]
[0x2b762c694767]
[0x2b762c69539f]
[0x2b762c6953bd]
/n/sw/helmod/apps/centos7/Comp/gcc/9.3.0-fasrc01/julia/1.5.1-fasrc01/bin/../lib/libjulia.so.1(+0xccc86)[0x2b75f0211c86]
======= Memory map: ========
00400000-00402000 r-xp 00000000 00:2d 4474665034                         /n/sw/helmod/apps/centos7/Comp/gcc/9.3.0-fasrc01/julia/1.5.1-fasrc01/bin/julia
00602000-00603000 r--p 00002000 00:2d 4474665034                         /n/sw/helmod/apps/centos7/Comp/gcc/9.3.0-fasrc01/julia/1.5.1-fasrc01/bin/julia
00603000-00604000 rw-p 00003000 00:2d 4474665034                         /n/sw/helmod/apps/centos7/Comp/gcc/9.3.0-fasrc01/julia/1.5.1-fasrc01/bin/julia
01d89000-029a5000 rw-p 00000000 00:00 0                                  [heap]
2b75eff21000-2b75eff43000 r-xp 00000000 fd:00 33669052                   /usr/lib64/ld-2.17.so
2b75eff43000-2b75eff45000 rw-p 00000000 00:00 0 
2b75eff45000-2b75eff48000 r--p 00000000 00:00 0 
2b75eff48000-2b75eff58000 rwxp 00000000 00:00 0 
2b75eff58000-2b75eff5b000 rw-p 00000000 00:00 0 
2b75eff60000-2b75f0139000 rw-p 00000000 00:00 0 
2b75f0139000-2b75f0140000 r--s 00000000 fd:00 100816988                  /usr/lib64/gconv/gconv-modules.cache
2b75f0142000-2b75f0143000 r--p 00021000 fd:00 33669052                   /usr/lib64/ld-2.17.so
2b75f0143000-2b75f0144000 rw-p 00022000 fd:00 33669052                   /usr/lib64/ld-2.17.so
2b75f0144000-2b75f0145000 rw-p 00000000 00:00 0 
2b75f0145000-2b75f03c3000 r-xp 00000000 00:2d 4450091073                 /n/sw/helmod/apps/centos7/Comp/gcc/9.3.0-fasrc01/julia/1.5.1-fasrc01/lib/libjulia.so.1.5
2b75f03c3000-2b75f05c2000 ---p 0027e000 00:2d 4450091073                 /n/sw/helmod/apps/centos7/Comp/gcc/9.3.0-fasrc01/julia/1.5.1-fasrc01/lib/libjulia.so.1.5
2b75f05c2000-2b75f05c8000 r--p 0027d000 00:2d 4450091073                 /n/sw/helmod/apps/centos7/Comp/gcc/9.3.0-fasrc01/julia/1.5.1-fasrc01/lib/libjulia.so.1.5
2b75f05c8000-2b75f061a000 rw-p 00283000 00:2d 4450091073                 /n/sw/helmod/apps/centos7/Comp/gcc/9.3.0-fasrc01/julia/1.5.1-fasrc01/lib/libjulia.so.1.5
2b75f061a000-2b75f08f7000 rw-p 00000000 00:00 0 
2b75f08f7000-2b75f08f9000 r-xp 00000000 fd:00 33689620                   /usr/lib64/libdl-2.17.so
2b75f08f9000-2b75f0af9000 ---p 00002000 fd:00 33689620                   /usr/lib64/libdl-2.17.so
2b75f0af9000-2b75f0afa000 r--p 00002000 fd:00 33689620                   /usr/lib64/libdl-2.17.so
2b75f0afa000-2b75f0afb000 rw-p 00003000 fd:00 33689620                   /usr/lib64/libdl-2.17.so
2b75f0afb000-2b75f0b02000 r-xp 00000000 fd:00 33872652                   /usr/lib64/librt-2.17.so
2b75f0b02000-2b75f0d01000 ---p 00007000 fd:00 33872652                   /usr/lib64/librt-2.17.so
2b75f0d01000-2b75f0d02000 r--p 00006000 fd:00 33872652                   /usr/lib64/librt-2.17.so
2b75f0d02000-2b75f0d03000 rw-p 00007000 fd:00 33872652                   /usr/lib64/librt-2.17.so
2b75f0d03000-2b75f0d1a000 r-xp 00000000 fd:00 33669082                   /usr/lib64/libpthread-2.17.so
2b75f0d1a000-2b75f0f19000 ---p 00017000 fd:00 33669082                   /usr/lib64/libpthread-2.17.so
2b75f0f19000-2b75f0f1a000 r--p 00016000 fd:00 33669082                   /usr/lib64/libpthread-2.17.so
2b75f0f1a000-2b75f0f1b000 rw-p 00017000 fd:00 33669082                   /usr/lib64/libpthread-2.17.so
2b75f0f1b000-2b75f0f1f000 rw-p 00000000 00:00 0 
2b75f0f1f000-2b75f10e3000 r-xp 00000000 fd:00 33669060                   /usr/lib64/libc-2.17.so
2b75f10e3000-2b75f12e2000 ---p 001c4000 fd:00 33669060                   /usr/lib64/libc-2.17.so
2b75f12e2000-2b75f12e6000 r--p 001c3000 fd:00 33669060                   /usr/lib64/libc-2.17.so
2b75f12e6000-2b75f12e8000 rw-p 001c7000 fd:00 33669060                   /usr/lib64/libc-2.17.so
2b75f12e8000-2b75f12ed000 rw-p 00000000 00:00 0 
2b75f12ed000-2b75f131b000 r-xp 00000000 00:2d 4474995838                 /n/sw/helmod/apps/centos7/Comp/gcc/9.3.0-fasrc01/julia/1.5.1-fasrc01/lib/julia/libunwind.so.8.0.1
2b75f131b000-2b75f151a000 ---p 0002e000 00:2d 4474995838                 /n/sw/helmod/apps/centos7/Comp/gcc/9.3.0-fasrc01/julia/1.5.1-fasrc01/lib/julia/libunwind.so.8.0.1
2b75f151a000-2b75f151b000 r--p 0002d000 00:2d 4474995838                 /n/sw/helmod/apps/centos7/Comp/gcc/9.3.0-fasrc01/julia/1.5.1-fasrc01/lib/julia/libunwind.so.8.0.1
2b75f151b000-2b75f151c000 rw-p 0002e000 00:2d 4474995838                 /n/sw/helmod/apps/centos7/Comp/gcc/9.3.0-fasrc01/julia/1.5.1-fasrc01/lib/julia/libunwind.so.8.0.1
2b75f151c000-2b75f1526000 rw-p 00000000 00:00 0 
2b75f1526000-2b75f4746000 r-xp 00000000 00:2d 4431219297                 /n/sw/helmod/apps/centos7/Comp/gcc/9.3.0-fasrc01/julia/1.5.1-fasrc01/lib/julia/libLLVM-9jl.so
2b75f4746000-2b75f4946000 ---p 03220000 00:2d 4431219297                 /n/sw/helmod/apps/centos7/Comp/gcc/9.3.0-fasrc01/julia/1.5.1-fasrc01/lib/julia/libLLVM-9jl.so
2b75f4946000-2b75f4c1d000 r--p 03220000 00:2d 4431219297                 /n/sw/helmod/apps/centos7/Comp/gcc/9.3.0-fasrc01/julia/1.5.1-fasrc01/lib/julia/libLLVM-9jl.so
2b75f4c1d000-2b75f4c44000 rw-p 034f7000 00:2d 4431219297                 /n/sw/helmod/apps/centos7/Comp/gcc/9.3.0-fasrc01/julia/1.5.1-fasrc01/lib/julia/libLLVM-9jl.so
2b75f4c44000-2b75f4c89000 rw-p 00000000 00:00 0 
2b75f4c89000-2b75f4e52000 r-xp 00000000 00:2d 4475100338                 /n/sw/helmod/apps/centos7/Core/gcc/9.3.0-fasrc01/lib64/libstdc++.so.6.0.28
2b75f4e52000-2b75f5052000 ---p 001c9000 00:2d 4475100338                 /n/sw/helmod/apps/centos7/Core/gcc/9.3.0-fasrc01/lib64/libstdc++.so.6.0.28
2b75f5052000-2b75f505d000 r--p 001c9000 00:2d 4475100338                 /n/sw/helmod/apps/centos7/Core/gcc/9.3.0-fasrc01/lib64/libstdc++.so.6.0.28
2b75f505d000-2b75f5060000 rw-p 001d4000 00:2d 4475100338                 /n/sw/helmod/apps/centos7/Core/gcc/9.3.0-fasrc01/lib64/libstdc++.so.6.0.28
2b75f5060000-2b75f5063000 rw-p 00000000 00:00 0 
2b75f5063000-2b75f5164000 r-xp 00000000 fd:00 33689622                   /usr/lib64/libm-2.17.so
2b75f5164000-2b75f5363000 ---p 00101000 fd:00 33689622                   /usr/lib64/libm-2.17.so
2b75f5363000-2b75f5364000 r--p 00100000 fd:00 33689622                   /usr/lib64/libm-2.17.so
2b75f5364000-2b75f5365000 rw-p 00101000 fd:00 33689622                   /usr/lib64/libm-2.17.so
2b75f5365000-2b75f537b000 r-xp 00000000 00:2d 4475273610                 /n/sw/helmod/apps/centos7/Core/gcc/9.3.0-fasrc01/lib64/libgcc_s.so.1
2b75f537b000-2b75f557b000 ---p 00016000 00:2d 4475273610                 /n/sw/helmod/apps/centos7/Core/gcc/9.3.0-fasrc01/lib64/libgcc_s.so.1
2b75f557b000-2b75f557c000 r--p 00016000 00:2d 4475273610                 /n/sw/helmod/apps/centos7/Core/gcc/9.3.0-fasrc01/lib64/libgcc_s.so.1
2b75f557c000-2b75f557d000 rw-p 00017000 00:2d 4475273610                 /n/sw/helmod/apps/centos7/Core/gcc/9.3.0-fasrc01/lib64/libgcc_s.so.1
2b75f557d000-2b75fbabf000 r--p 00000000 fd:00 67214576                   /usr/lib/locale/locale-archive
2b75fbabf000-2b75fbac0000 ---p 00000000 00:00 0 
2b75fbac0000-2b75fbcc0000 rw-p 00000000 00:00 0 
2b75fbcc0000-2b75fbcc1000 ---p 00000000 00:00 0 
2b75fbcc1000-2b75fcbcf000 rw-p 00000000 00:00 0 
2b75fcbcf000-2b75fd42f000 r-xp 00000000 00:2d 4474897380                 /n/sw/helmod/apps/centos7/Comp/gcc/9.3.0-fasrc01/julia/1.5.1-fasrc01/lib/julia/sys.so
2b75fd42f000-2b75fd62e000 ---p 00860000 00:2d 4474897380                 /n/sw/helmod/apps/centos7/Comp/gcc/9.3.0-fasrc01/julia/1.5.1-fasrc01/lib/julia/sys.so
2b75fd62e000-2b75fd651000 r--p 0085f000 00:2d 4474897380                 /n/sw/helmod/apps/centos7/Comp/gcc/9.3.0-fasrc01/julia/1.5.1-fasrc01/lib/julia/sys.so
2b75fd651000-2b7604e8a000 rw-p 00882000 00:2d 4474897380                 /n/sw/helmod/apps/centos7/Comp/gcc/9.3.0-fasrc01/julia/1.5.1-fasrc01/lib/julia/sys.so
2b7604e8a000-2b7604ea7000 rw-p 00000000 00:00 0 
2b7604ea7000-2b7608eab000 rw-p 00000000 00:00 0 
2b7608eab000-2b7609217000 rw-p 00000000 00:00 0 
2b7609217000-2b760921f000 ---p 00000000 00:00 0 
2b760921f000-2b760a01a000 rw-p 00000000 00:00 0 
2b760a01a000-2b760a01b000 ---p 00000000 00:00 0 
2b760a01b000-2b760a21b000 rw-p 00000000 00:00 0 
2b760a21b000-2b760a21c000 ---p 00000000 00:00 0 
2b760a21c000-2b760a41c000 rw-p 00000000 00:00 0 
2b760a41c000-2b760a41d000 ---p 00000000 00:00 0 
2b760a41d000-2b760a61d000 rw-p 00000000 00:00 0 
2b760a61d000-2b760a61e000 ---p 00000000 00:00 0 
2b760a61e000-2b760a81e000 rw-p 00000000 00:00 0 
2b760a81e000-2b760a81f000 ---p 00000000 00:00 0 
2b760a81f000-2b760aa1f000 rw-p 00000000 00:00 0 
2b760aa1f000-2b760aa20000 ---p 00000000 00:00 0 
2b760aa20000-2b760ac20000 rw-p 00000000 00:00 0 
2b760ac20000-2b760ac21000 ---p 00000000 00:00 0 
2b760ac21000-2b760ae21000 rw-p 00000000 00:00 0 
2b760ae21000-2b760ae22000 ---p 00000000 00:00 0 
2b760ae22000-2b760b622000 rw-p 00000000 00:00 0 
2b760b622000-2b760b62a000 ---p 00000000 00:00 0 
2b760b62a000-2b760ba22000 rw-p 00000000 00:00 0 
2b760ba22000-2b760ba2a000 ---p 00000000 00:00 0 
2b760ba2a000-2b760bf22000 rw-p 00000000 00:00 0 
2b760c000000-2b760c155000 rw-p 00000000 00:00 0 
2b760c155000-2b7610000000 ---p 00000000 00:00 0 
signal (6): Aborted
in expression starting at REPL[30]:1
gsignal at /lib64/libc.so.6 (unknown line)
abort at /lib64/libc.so.6 (unknown line)
__libc_message at /lib64/libc.so.6 (unknown line)
_int_free at /lib64/libc.so.6 (unknown line)
_ZNK4geos4geom8Geometry19getEnvelopeInternalEv at /n/home06/nbalasus/.julia/artifacts/514b2d8cdf634f84bc9d53a25f76ec2826d964ce/lib/libgeos-3.9.0.so (unknown line)
_ZNK4geos4geom7Polygon23computeEnvelopeInternalEv at /n/home06/nbalasus/.julia/artifacts/514b2d8cdf634f84bc9d53a25f76ec2826d964ce/lib/libgeos-3.9.0.so (unknown line)
_ZNK4geos4geom8Geometry19getEnvelopeInternalEv at /n/home06/nbalasus/.julia/artifacts/514b2d8cdf634f84bc9d53a25f76ec2826d964ce/lib/libgeos-3.9.0.so (unknown line)
_ZN4geos9operation9overlayng11OverlayUtil13isEnvDisjointEPKNS_4geom8GeometryES6_PKNS3_14PrecisionModelE at /n/home06/nbalasus/.julia/artifacts/514b2d8cdf634f84bc9d53a25f76ec2826d964ce/lib/libgeos-3.9.0.so (unknown line)
_ZN4geos9operation9overlayng9OverlayNG9getResultEv at /n/home06/nbalasus/.julia/artifacts/514b2d8cdf634f84bc9d53a25f76ec2826d964ce/lib/libgeos-3.9.0.so (unknown line)
_ZN4geos9operation9overlayng9OverlayNG7overlayEPKNS_4geom8GeometryES6_iPKNS3_14PrecisionModelE at /n/home06/nbalasus/.julia/artifacts/514b2d8cdf634f84bc9d53a25f76ec2826d964ce/lib/libgeos-3.9.0.so (unknown line)
_ZN4geos9operation9overlayng15OverlayNGRobust7OverlayEPKNS_4geom8GeometryES6_i at /n/home06/nbalasus/.julia/artifacts/514b2d8cdf634f84bc9d53a25f76ec2826d964ce/lib/libgeos-3.9.0.so (unknown line)
_ZN4geos4geom16HeuristicOverlayEPKNS0_8GeometryES3_i at /n/home06/nbalasus/.julia/artifacts/514b2d8cdf634f84bc9d53a25f76ec2826d964ce/lib/libgeos-3.9.0.so (unknown line)
_ZNK4geos4geom8Geometry12intersectionEPKS1_ at /n/home06/nbalasus/.julia/artifacts/514b2d8cdf634f84bc9d53a25f76ec2826d964ce/lib/libgeos-3.9.0.so (unknown line)
GEOSIntersection_r at /n/home06/nbalasus/.julia/artifacts/514b2d8cdf634f84bc9d53a25f76ec2826d964ce/lib/libgeos_c.so (unknown line)
GEOSIntersection_r at /n/home06/nbalasus/.julia/packages/LibGEOS/JGKwE/src/libgeos_api.jl:353 [inlined]
intersection at /n/home06/nbalasus/.julia/packages/LibGEOS/JGKwE/src/geos_functions.jl:424 [inlined]
intersection at /n/home06/nbalasus/.julia/packages/LibGEOS/JGKwE/src/geos_functions.jl:424
intersection at /n/home06/nbalasus/.julia/packages/LibGEOS/JGKwE/src/geos_operations.jl:72
macro expansion at ./REPL[29]:12 [inlined]
#4#threadsfor_fun at ./threadingconstructs.jl:81
#4#threadsfor_fun at ./threadingconstructs.jl:48
unknown function (ip: 0x2b762c6953bc)
jl_apply at /scratch/pkrastev/lmod_build/julia-1.5.1/src/julia.h:1690 [inlined]
start_task at /scratch/pkrastev/lmod_build/julia-1.5.1/src/task.c:707
unknown function (ip: (nil))
Allocations: 1456768 (Pool: 1456085; Big: 683); GC: 2
Aborted (core dumped)

This error is avoided by removing the Threads.@threads. Additionally, it oddly works when I reduce the number of loops (e.g., changing it to Threads.@threads for j = 1:10). I have 4 cores available.

jaakkor2 commented 2 years ago

I see this crashing as well. Here is a bit simpler example that crashes every time I run it (Julia 1.7.1, GEOS_jll 3.10.0+0, LibGEOS.jl 0.6.8)

using LibGEOS

function findintersecting(g1, g2)
    n1, n2 = length(g1), length(g2)
    polygon_intersection = fill(0.0,n1,n2)
    Threads.@threads for i = 1:n1
        for j = 1:n2
            if intersects(g1[i], g2[j])
                polygon_intersection[i,j] = 1
            end
        end
    end
    sum(polygon_intersection)
end

pnts = 1.0*[[-1,-1],[+1,-1],[+1,+1],[-1,+1]] .+ [0.00*randn(2) for i=1:4]
geom = [[pnts; [pnts[1]]]]
g1 = [LibGEOS.Polygon(geom) for i=1:500]
findintersecting(g1,g1)

If I add some randomness

pnts = 1.0*[[-1,-1],[+1,-1],[+1,+1],[-1,+1]] .+ [0.01*randn(2) for i=1:4]

the crashing still happens, but much more seldomly.

jaakkor2 commented 2 years ago

Even simpler crasher

import LibGEOS
function cra(n)
    p = [[[-1.,-1],[+1,-1],[+1,+1],[-1,+1],[-1,-1]]]
    g1 = [LibGEOS.Polygon(p) for i=1:n]
    r = fill(false, n, n)
    Threads.@threads for i=1:n
        for j=1:n
            r[i,j] = LibGEOS.intersects(g1[i],g1[j])
        end
    end
    return r
end
cra(1000)
visr commented 2 years ago

@jaakkor2, about https://github.com/JuliaGeo/LibGEOS.jl/issues/91#issuecomment-1003350071

I haven't tried threading yet, but I thought the idea of the _r functions that take a GEOSContext is that you create a separate context per thread to avoid these issues?

jaakkor2 commented 2 years ago

@jaakkor2, about #91 (comment)

I haven't tried threading yet, but I thought the idea of the _r functions that take a GEOSContext is that you create a separate context per thread to avoid these issues?

Sorry for confusion. It seems that using one global context and threading is asking for trouble.

https://lists.osgeo.org/pipermail/geos-devel/2017-November/008157.html discussion here says that the API is thread-safe, but the implementation is not. And likely Julia interface needs modification..

visr commented 2 years ago

Ah good to know. Would still be interesting to try it out with thread local contexts, to see if any remaining thread safety issues cause problems for examples like this.

jaakkor2 commented 2 years ago

This still crashes

import LibGEOS
ctx = [LibGEOS.GEOSContext() for i=1:Threads.nthreads()]

function cra(n, ctx)
    p = [[[-1.,-1],[+1,-1],[+1,+1],[-1,+1],[-1,-1]]]
    g1 = [LibGEOS.Polygon(p) for i=1:n]
    r = fill(false, n, n)
    Threads.@threads for i=1:n
        for j=1:n
            r[i,j] = LibGEOS.intersects(g1[i].ptr, g1[j].ptr, ctx[Threads.threadid()])
        end
    end
    return r
end

cra(500, ctx)
jaakkor2 commented 2 years ago

If Threads.@threads is in front of the inner for loop, then this seems to work 🤷

import LibGEOS
ctx = [LibGEOS.GEOSContext() for i=1:Threads.nthreads()]

function cra(n, ctx)
    p = [[[-1.,-1],[+1,-1],[+1,+1],[-1,+1],[-1,-1]]]
    g1 = [LibGEOS.Polygon(p) for i=1:n]
    r = fill(false, n, n)
    for j=1:n
        Threads.@threads for i=1:n
            r[i,j] = LibGEOS.intersects(g1[i].ptr, g1[j].ptr, ctx[Threads.threadid()])
        end
    end
    return r
end

cra(500, ctx)
nicholasbalasus commented 2 years ago

Even simpler crasher

import LibGEOS
function cra(n)
    p = [[[-1.,-1],[+1,-1],[+1,+1],[-1,+1],[-1,-1]]]
    g1 = [LibGEOS.Polygon(p) for i=1:n]
    r = fill(false, n, n)
    Threads.@threads for i=1:n
        for j=1:n
            r[i,j] = LibGEOS.intersects(g1[i],g1[j])
        end
    end
    return r
end
cra(1000)

@jaakkor2 The following seems to work using @spawn instead of @threads.

import LibGEOS
function cra(n)
    p = [[[-1.,-1],[+1,-1],[+1,+1],[-1,+1],[-1,-1]]]
    g1 = [LibGEOS.Polygon(p) for i=1:n]
    r = fill(false, n, n)
    task = Threads.@spawn for i=1:n
        for j=1:n
            r[i,j] = LibGEOS.intersects(g1[i],g1[j])
        end
    end
    wait(task)
    return r
end
cra(1000)
nicholasbalasus commented 2 years ago

Sorry, the above is not correct and operates on a single core. My apologies, I am quite new to Julia!

goerch commented 2 years ago

If you are testing these please lock access to the shared resource.

goerch commented 2 years ago

From https://trac.osgeo.org/geos/wiki/RFC3

Function names in the new API will be updated with an _r, as is the familiar C standard for reentrant/thread safe versions.

GEOSIntersection_r is used here https://github.com/JuliaGeo/LibGEOS.jl/blob/v0.6.8/src/geos_functions.jl#L423-L429

EDIT: Be careful with using threads.

Unfortunately the contexts don't seem to be published.

visr commented 2 years ago

Hmm good point about locking access to share resources. And you are right that these evaled functions don't include the optional context argument, it would be good to add them. If you want to test the contexts for threading purposes, you can replace intersection(g1, g2) with intersection(g1.ptr, g2.ptr, context) however, as was done in the example above.

yeesian commented 2 years ago

Unfortunately the contexts don't seem to be published.

I've responded in https://discourse.julialang.org/t/improving-performance-on-nested-for-loops-sparsearrays-libgeos/74038/11: I don’t think it was an intentional design for it to remain internal, just that we didn’t implement it all the way through back then.

skygering commented 2 years ago

I was playing around with this and it still crashes even if all of the writing to shared resources is removed. Here is an example that crashes:

function cra(n)
    ctx = [LibGEOS.GEOSContext() for i=1:Threads.nthreads()]
    p = [[[-1.,-1],[+1,-1],[+1,+1],[-1,+1],[-1,-1]]]
    g1 = [LibGEOS.Polygon(p) for i=1:n]
    Threads.@threads for i=1:n
        for j=1:n
            LibGEOS.intersects(g1[i].ptr, g1[j].ptr, ctx[Threads.threadid()])
        end
    end
end

cra(1000)

It is the same as the example above, except that it does not write the results to a shared array and creates the context for the threads within the loop.

When debugging with the VSC debugger I get the following error: Screen Shot 2022-09-14 at 2 52 07 PM

I have also gotten the error when I switched thej and the i: Screen Shot 2022-09-14 at 2 57 02 PM

Do we think that this is caused by the C backend? I tried setting a breakpoint in malloc_error_break in VSC as suggested but it isn't breaking...

yeesian commented 2 years ago

@skygering I haven't been able to reproduce any of the crashes on LibGEOS v1.7.2, see below:

  | | |_| | | | (_| |  |  Version 1.8.0 (2022-08-17)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

...

(@v1.8) pkg> st
Status `~/.julia/environments/v1.8/Project.toml`
...
  [cf35fbd7] GeoInterface v1.0.1
  [a90b1aa1] LibGEOS v0.7.2

Can I doublecheck which version of Julia and LibGEOS are you reproducing the crashes on?


For https://github.com/JuliaGeo/LibGEOS.jl/issues/91#issuecomment-1004870593, I get:

  | | |_| | | | (_| |  |  Version 1.8.0 (2022-08-17)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using LibGEOS

julia> function findintersecting(g1, g2)
           n1, n2 = length(g1), length(g2)
           polygon_intersection = fill(0.0,n1,n2)
           Threads.@threads for i = 1:n1
               for j = 1:n2
                   if intersects(g1[i], g2[j])
                       polygon_intersection[i,j] = 1
                   end
               end
           end
           sum(polygon_intersection)
       end
findintersecting (generic function with 1 method)

julia> pnts = 1.0*[[-1,-1],[+1,-1],[+1,+1],[-1,+1]] .+ [0.00*randn(2) for i=1:4]

4-element Vector{Vector{Float64}}:

[-1.0, -1.0]
 [1.0, -1.0]
 [1.0, 1.0]
 [-1.0, 1.0]

julia> geom = [[pnts; [pnts[1]]]]
1-element Vector{Vector{Vector{Float64}}}:
 [[-1.0, -1.0], [1.0, -1.0], [1.0, 1.0], [-1.0, 1.0], [-1.0, -1.0]]

julia> g1 = [LibGEOS.Polygon(geom) for i=1:500]
500-element Vector{Polygon}:
 Polygon(Ptr{Nothing} @0x00006000028030c0)
 Polygon(Ptr{Nothing} @0x0000600002800230)
 Polygon(Ptr{Nothing} @0x00006000028004b0)
 Polygon(Ptr{Nothing} @0x0000600002802c60)
 Polygon(Ptr{Nothing} @0x0000600002800a50)
 Polygon(Ptr{Nothing} @0x00006000028000a0)
 Polygon(Ptr{Nothing} @0x0000600002803d90)
 Polygon(Ptr{Nothing} @0x0000600002800410)
 Polygon(Ptr{Nothing} @0x0000600002801720)
 Polygon(Ptr{Nothing} @0x0000600002801c70)
 â‹®
 Polygon(Ptr{Nothing} @0x000060000281eb20)
 Polygon(Ptr{Nothing} @0x000060000281d5e0)
 Polygon(Ptr{Nothing} @0x000060000281eda0)
 Polygon(Ptr{Nothing} @0x000060000281ee90)
 Polygon(Ptr{Nothing} @0x000060000281e120)
 Polygon(Ptr{Nothing} @0x000060000281caf0)
 Polygon(Ptr{Nothing} @0x000060000281f250)
 Polygon(Ptr{Nothing} @0x000060000281c870)
 Polygon(Ptr{Nothing} @0x000060000281f610)
 Polygon(Ptr{Nothing} @0x000060000281fe30)

julia> findintersecting(g1,g1)
250000.0

For https://github.com/JuliaGeo/LibGEOS.jl/issues/91#issuecomment-1004910824, I get:

  | | |_| | | | (_| |  |  Version 1.8.0 (2022-08-17)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> import LibGEOS

julia> ctx = [LibGEOS.GEOSContext() for i=1:Threads.nthreads()]
1-element Vector{LibGEOS.GEOSContext}:
 LibGEOS.GEOSContext(Ptr{Nothing} @0x000000012f809e00)

julia> function cra(n, ctx)
           p = [[[-1.,-1],[+1,-1],[+1,+1],[-1,+1],[-1,-1]]]
           g1 = [LibGEOS.Polygon(p) for i=1:n]
           r = fill(false, n, n)
           Threads.@threads for i=1:n
               for j=1:n
                   r[i,j] = LibGEOS.intersects(g1[i].ptr, g1[j].ptr, ctx[Threads.threadid()])
               end
           end
           return r
       end
cra (generic function with 1 method)

julia> cra(500, ctx)
500×500 Matrix{Bool}:
 1  1  1  1  1  1  1  1  1  1  1  1  1  …  1  1  1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1  1  1  1     1  1  1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1  1  1  1     1  1  1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1  1  1  1     1  1  1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1  1  1  1     1  1  1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1  1  1  1  …  1  1  1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1  1  1  1     1  1  1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1  1  1  1     1  1  1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1  1  1  1     1  1  1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1  1  1  1     1  1  1  1  1  1  1  1  1  1  1  1
 ⋮              ⋮              ⋮        ⋱        ⋮              ⋮
 1  1  1  1  1  1  1  1  1  1  1  1  1  …  1  1  1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1  1  1  1     1  1  1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1  1  1  1     1  1  1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1  1  1  1     1  1  1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1  1  1  1     1  1  1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1  1  1  1  …  1  1  1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1  1  1  1     1  1  1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1  1  1  1     1  1  1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1  1  1  1     1  1  1  1  1  1  1  1  1  1  1  1
 1  1  1  1  1  1  1  1  1  1  1  1  1     1  1  1  1  1  1  1  1  1  1  1  1

For https://github.com/JuliaGeo/LibGEOS.jl/issues/91#issuecomment-1247348423, I get:

  | | |_| | | | (_| |  |  Version 1.8.0 (2022-08-17)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> import LibGEOS

julia> function cra(n)
           ctx = [LibGEOS.GEOSContext() for i=1:Threads.nthreads()]
           p = [[[-1.,-1],[+1,-1],[+1,+1],[-1,+1],[-1,-1]]]
           g1 = [LibGEOS.Polygon(p) for i=1:n]
           Threads.@threads for i=1:n
               for j=1:n
                   LibGEOS.intersects(g1[i].ptr, g1[j].ptr, ctx[Threads.threadid()])
               end
           end
       end
cra (generic function with 1 method)

julia> cra(1000)

julia>
jaakkor2 commented 2 years ago

Crashes for me with LibGEOS v0.7.2 and GeoInterface v1.0.1 if i start Julia with --threads 4, does not crash if started without threads.

skygering commented 2 years ago

I also have LibGEOS v0.7.2 and GeoInterface v1.0.1. I am also running with 4 threads like @jaakkor2 mentioned. With the last example that I provided, I am crashing every time.

It doesn't seem like v0.7.2 includes the new Polygon updates, which had memory issues, but I don't think it is that given that I did not have to change the code for making Polygons from vectors.

evetion commented 2 years ago

Looking at the source code of LibGEOS, we need to construct geometries unique to a context.

So this works:

function cra(n)
    @info "Using $(Threads.nthreads()) threads"
    out = zeros(n, n)  # don't use falses, it's not thread safe ?!
    ctx = [LibGEOS.GEOSContext() for i=1:Threads.nthreads()]
    p = [[[-1.,-1],[+1,-1],[+1,+1],[-1,+1],[-1,-1]]]
    g1 = [LibGEOS.Polygon(p) for i=1:n]
    # Generate a unique per thread/context geometry (!)
    g2 = [LibGEOS.cloneGeom.(getfield.(g1, :ptr), Ref(c)) for c in ctx]
    Threads.@threads for i=1:n
        for j=1:n
            ti = Threads.threadid()
            out[i, j] = LibGEOS.intersects(g2[ti][i], g2[ti][j], ctx[ti])
        end
    end
    sum(out)
end

cra(1000)
# [ Info: Using 8 threads
# 1.0e6
skygering commented 2 years ago

This also works for me! Thank you so much @evetion!

However, it is kind of annoying that we have to make a geometry for each thread though... I guess we shouldn't have a huge number of threads but that doesn't seem the best for performance, especially if we are working with a huge number of polygons. I don't really see a way to create less though right off the bat. This almost seems to get rid of the benefit of threads over @distributed in Julia, since we now have to have a copy of every geometry for each thread. Rather than the threads safely sharing memory, we are just creating a new version of every object in memory to prevent data races.

In fact, this seems to work as well, which does not use contexts but rather just makes nthreads copies of every polygon. Thoughts?

function cra2(n)
    nthreads= Threads.nthreads()
    @info "Using $(nthreads) threads"
    out = zeros(n, n)  # don't use falses, it's not thread safe ?!
    p = [[[-1.,-1],[+1,-1],[+1,+1],[-1,+1],[-1,-1]]]
    g1 = [LibGEOS.Polygon(p) for i=1:n]
    g2 = [LibGEOS.cloneGeom.(getfield.(g1, :ptr)) for i in 1:nthreads]
    Threads.@threads for i=1:n
        for j=1:n
            ti = Threads.threadid()
            out[i, j] = LibGEOS.intersects(g2[ti][i], g2[ti][j])
        end
    end
    sum(out)
end

I am also wondering if we should expose the context parameter within constructors like Polygon, since in most cases they calls functions createPolygon (which does accept a context), in addition to functions that take in the geometries like intersects, intersection, etc.

yeesian commented 2 years ago

I am also wondering if we should expose the context parameter within constructors like Polygon, since in most cases they calls functions createPolygon (which does accept a context), in addition to functions that take in the geometries like intersects, intersection, etc.

Yeah the current state wasn't by design, so I think that will be very welcome :)

skygering commented 2 years ago

I have started the PR to expose context within the functions and types :)

However, I am still not convinced on the use of contexts as used above. It does stop the errors, but as I showed above this doesn't actually have to do with contexts, and rather with having a separate copy of each geometry used within each thread. This seems really inefficent. This might be the design of the underlying library, but then it seems like it really isn't thread-safe. They have just eliminated all possible race conditions by creating "separate" memory for each thread. Thoughts?

yeesian commented 2 years ago

Yeah, unfortunately it seems that way.. https://www.mail-archive.com/geos-devel@lists.osgeo.org/msg05011.html

skygering commented 2 years ago

To keep this conversation going, I am also having issues with distributed computing using Julia's @distributed and getting seg faults.

For example:

using Distributed
addprocs(3)
@everywhere using LibGEOS
coord = [[[0.,0.],[+1.,0.],[+1.,+1.],[0.,+1.],[0.,0.]]]
coord_lst = [coord for i=1:5]

@distributed (+) for c in coord_lst
    rect = LibGEOS.Polygon(c)
    LibGEOS.area(rect)
end

gives 5 as expected.

However,

coord = [[[0.,0.],[+1.,0.],[+1.,+1.],[0.,+1.],[0.,0.]]]
coord_lst = [coord for i=1:5]
poly_lst = [LibGEOS.Polygon(c) for c in coord_lst]

@distributed (+) for rect in poly_lst
    LibGEOS.area(rect)
end

gives a seg fault. Here is the beginning of the error message. It is really long.

Screen Shot 2022-09-19 at 3 59 20 PM

The stack trace at the bottom of the error message says that there is an unsafe read: Screen Shot 2022-09-19 at 4 01 34 PM

I am a bit surprised by this error given that area seems like it should be a read only function and each polygon is in its own process.

Luckily I think that making the polygons within each process actually works for what I need to do, but I thought I would bring it up.

jaakkor2 commented 2 years ago

In the example, coord_lst is sharing the memory of coord. With this it seems to work coord_lst = [copy(coord) for i=1:5].

skygering commented 2 years ago

I am a bit confused on where the memory is being freed. I tried your way @jaakkor2 and that does work! However, if I get rid of coord_lst entirely, I still have the same problem. For example,

using Distributed
addprocs(3)
@everywhere using LibGEOS

coord = [[[0.,0.],[+1.,0.],[+1.,+1.],[0.,+1.],[0.,0.]]]
poly_lst = [LibGEOS.Polygon(coord) for i in 1:5]
@distributed for p in poly_lst
    LibGEOS.area(p)
end

also throws a seg fault. I feel like this suggests that the ownership of coord is transferred to each polygon, and then when one of the polygons is cleaned up, this is also getting rid of coord somehow. I don't see how else this would be causing a problem given that coord seems like it should be read-only.

However, if this is the case than @yeesian and my thought process that making Polygons (or any other Geom type) out of a vector of coordinates does not require any copying/cloning might be wrong. This would require an update of those constructors.

However, I also made a little example that "translates" the coordinates. When using the translated coordinates I am still getting a seg fault...

using Distributed
addprocs(3)
@everywhere using LibGEOS

function translate(coords, centroid)
    return coords .+ repeat([centroid], length(coords))
end

coord = [[[0.,0.],[+1.,0.],[+1.,+1.],[0.,+1.],[0.,0.]]]
coord_lst = [[translate(coord[1], [0.0, 2.0*i])] for i=0:5]
poly_lst = [LibGEOS.Polygon(c) for c in coord_lst]
@distributed for p in poly_lst
    LibGEOS.area(p)
end

I checked that all of the elements of coord_lst are in distinct places in memory using the following:

[pointer(c) for c in coord_lst]

I can't for the life of me see how this is different from @jaakkor2's solution above, which does not give me a seg fault.

If anyone sees a flaw in my logic please let me know. I am trying hard to get a hang of Julia and Distributed and these memory problems are tricky!! 😄 I appreciate everyone's help and patience!

nraynaud commented 1 year ago

I just tried a test on my mac from @skygering 's exemple.

@testset "distributed" begin
    @show "in distributed"
    function f(n)
        @show Threads.nthreads()
        ctx = [LibGEOS.GEOSContext() for i=1:Threads.nthreads()]
        p = [[[-1.,-1],[+1,-1],[+1,+1],[-1,+1],[-1,-1]]]
        g1 = [LibGEOS.Polygon(p) for i=1:n]
        Threads.@threads for i=1:n

            @async  println("$(Threads.threadid())")
            for j=1:n
                LibGEOS.intersects(g1[i].ptr, g1[j].ptr, ctx[Threads.threadid()])
            end
        end
        GC.gc(true)
        return nothing
    end
    f(100)
    f(10)
    LibGEOS.Polygon([[[-1.,-1],[+1,-1],[+1,+1],[-1,+1],[-1,-1]]])
    nothing
end

I got a lucky lead once (it never happened again, I mostly had double free or invalid checksums with other runs)

__pthread_kill at /usr/lib/system/libsystem_kernel.dylib (unknown line)
pthread_kill at /usr/lib/system/libsystem_pthread.dylib (unknown line)
abort at /usr/lib/system/libsystem_c.dylib (unknown line)
malloc_vreport at /usr/lib/system/libsystem_malloc.dylib (unknown line)
malloc_zone_error at /usr/lib/system/libsystem_malloc.dylib (unknown line)
_ZNK4geos4geom8Geometry19getEnvelopeInternalEv at /Users/nraynaud/.julia/artifacts/c9282169ba3bbd659f0a2b0d8de61805058bd3b3/lib/libgeos.3.11.0.dylib (unknown line)
_ZNK4geos4geom7Polygon23computeEnvelopeInternalEv at /Users/nraynaud/.julia/artifacts/c9282169ba3bbd659f0a2b0d8de61805058bd3b3/lib/libgeos.3.11.0.dylib (unknown line)
_ZNK4geos4geom8Geometry19getEnvelopeInternalEv at /Users/nraynaud/.julia/artifacts/c9282169ba3bbd659f0a2b0d8de61805058bd3b3/lib/libgeos.3.11.0.dylib (unknown line)
_ZNK4geos4geom8Geometry10intersectsEPKS1_ at /Users/nraynaud/.julia/artifacts/c9282169ba3bbd659f0a2b0d8de61805058bd3b3/lib/libgeos.3.11.0.dylib (unknown line)
GEOSIntersects_r at /Users/nraynaud/.julia/artifacts/c9282169ba3bbd659f0a2b0d8de61805058bd3b3/lib/libgeos_c.1.17.0.dylib (unknown line)
GEOSIntersects_r at /Users/nraynaud/.julia/dev/LibGEOS/src/libgeos_api.jl:985 [inlined]
intersects at /Users/nraynaud/.julia/dev/LibGEOS/src/geos_functions.jl:716 [inlined]
macro expansion at /Users/nraynaud/.julia/dev/LibGEOS/test/test_distributed.jl:15 [inlined]
#72#threadsfor_fun#3 at ./threadingconstructs.jl:84
#72#threadsfor_fun at ./threadingconstructs.jl:51 [inlined]
#1 at ./threadingconstructs.jl:30
unknown function (ip: 0x10764ec8f)
ijl_apply_generic at /Users/nraynaud/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/libjulia-internal.1.8.dylib (unknown line)
start_task at /Users/nraynaud/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/libjulia-internal.1.8.dylib (unknown line)
Allocations: 721899 (Pool: 721221; Big: 678); GC: 1

I went down the rabbit hole for a bit.

The short of it is that reading is not safe, the envelope of the geometry is lazily computed: https://github.com/libgeos/geos/blob/main/src/geom/Geometry.cpp#L234 The initial value is null: https://github.com/libgeos/geos/blob/main/src/geom/Geometry.cpp#L109

That probably means that geometries have to live in a single thread or have a very deliberate handover between threads.

nraynaud commented 1 year ago

@skygering , I tried this code:


using Distributed
@testset "distributed" begin
    addprocs(3)
    @everywhere using LibGEOS

    function translate(coords, centroid)
        return coords .+ repeat([centroid], length(coords))
    end

    coord = [[[0.0, 0.0], [+1.0, 0.0], [+1.0, +1.0], [0.0, +1.0], [0.0, 0.0]]]
    coord_lst = [[translate(coord[1], [0.0, 2.0 * i])] for i = 0:5]
    poly_lst = [LibGEOS.Polygon(c) for c in coord_lst]
    @show [p.ptr for p in poly_lst]
    @distributed for p in poly_lst
        @show p.ptr
        LibGEOS.area(p)
    end
end

I got some signal 11 (a bunch of them).

After going down the rabbit hole (I had never heard of threading or distribution in julia before today), I found an explanation out of left field.

The Polygon struct has been serialized on the TCP socket, and the serialization rule sets the Ptr-typed fields to NULL. https://docs.julialang.org/en/v1/stdlib/Serialization/#Serialization.serialize

The read-back value will be as identical as possible to the original, but note that Ptr values are serialized as all-zero bit patterns (NULL).

Here is worker 2's isolated output:

      From worker 2:
      From worker 2:    signal (11): Segmentation fault: 11
      From worker 2:    in expression starting at none:0
      From worker 2:    GEOSArea_r at /Users/nraynaud/.julia/artifacts/c9282169ba3bbd659f0a2b0d8de61805058bd3b3/lib/libgeos_c.1.17.0.dylib (unknown line)
      From worker 2:    GEOSArea_r at /Users/nraynaud/.julia/dev/LibGEOS/src/libgeos_api.jl:1644 [inlined]
      From worker 2:    geomArea at /Users/nraynaud/.julia/dev/LibGEOS/src/geos_functions.jl:1324 [inlined]
      From worker 2:    area at /Users/nraynaud/.julia/dev/LibGEOS/src/geos_operations.jl:335 [inlined]
      From worker 2:    area at /Users/nraynaud/.julia/dev/LibGEOS/src/geos_operations.jl:335
      From worker 2:    macro expansion at /Users/nraynaud/.julia/dev/LibGEOS/test/test_distributed.jl:15 [inlined]
      From worker 2:    #3 at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-x64-6.0/build/default-macmini-x64-6-0/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/Distributed/src/macros.jl:303
      From worker 2:    #178 at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-x64-6.0/build/default-macmini-x64-6-0/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/Distributed/src/macros.jl:83
      From worker 2:    unknown function (ip: 0x102a4dc0f)
      From worker 2:    ijl_apply_generic at /Users/nraynaud/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/libjulia-internal.1.8.dylib (unknown line)
      From worker 2:    jl_f__call_latest at /Users/nraynaud/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/libjulia-internal.1.8.dylib (unknown line)
      From worker 2:    #invokelatest#2 at ./essentials.jl:729 [inlined]
      From worker 2:    invokelatest at ./essentials.jl:726
      From worker 2:    unknown function (ip: 0x102a426c2)
      From worker 2:    ijl_apply_generic at /Users/nraynaud/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/libjulia-internal.1.8.dylib (unknown line)
      From worker 2:    do_apply at /Users/nraynaud/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/libjulia-internal.1.8.dylib (unknown line)
      From worker 2:    #107 at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-x64-6.0/build/default-macmini-x64-6-0/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:281
      From worker 2:    run_work_thunk at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-x64-6.0/build/default-macmini-x64-6-0/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:70
      From worker 2:    unknown function (ip: 0x102a4d580)
      From worker 2:    ijl_apply_generic at /Users/nraynaud/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/libjulia-internal.1.8.dylib (unknown line)
      From worker 2:    run_work_thunk at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-x64-6.0/build/default-macmini-x64-6-0/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:79
      From worker 2:    #100 at ./task.jl:484
      From worker 2:    unknown function (ip: 0x102a4d07f)
      From worker 2:    ijl_apply_generic at /Users/nraynaud/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/libjulia-internal.1.8.dylib (unknown line)
      From worker 2:    start_task at /Users/nraynaud/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/libjulia-internal.1.8.dylib (unknown line)
      From worker 2:    Allocations: 6744733 (Pool: 6741750; Big: 2983); GC: 8

My suspicions were arounsed when I saw that my C stacktrace was stopping at GEOSArea_r and not getting into the C++.

Later I spotted the pointers were all 0x0000000000000000 in the loop:

[p.ptr for p = poly_lst] = Ptr{Nothing}[Ptr{Nothing} @0x00007f8377fa4b10, Ptr{Nothing} @0x00007f8330029fe0, Ptr{Nothing} @0x00007f83988672c0, Ptr{Nothing} @0x00007f8377ff44c0, Ptr{Nothing} @0x00007f8377f43570, Ptr{Nothing} @0x00007f833000c830]
Test Summary: |Time
LibGEOS       | None  5.3s
      From worker 3:    p.ptr = Ptr{Nothing} @0x0000000000000000
      From worker 2:    p.ptr = Ptr{Nothing} @0x0000000000000000
      From worker 4:    p.ptr = Ptr{Nothing} @0x0000000000000000
skygering commented 1 year ago

@nraynaud Thank you for looking into this! So what I am getting from this is that LibGEOS Geometry objects are not made to exist between threads or even between processes.

Confirming what you said above, I did find that Julia's distributed processing uses TCP/IP sockets be default: https://gensoft.pasteur.fr/docs/julia/1.4.1/stdlib/Distributed.html#Cluster-Manager-Interface-1

TCP/IP sockets are used to connect and transport messages between processes. It is possible for Cluster Managers to provide a different transport.

So maybe if you want to be able to do this there might be a way to set up a cluster manager that sends data in a way where the pointers aren't being set to Null.

Disappointing that it isn't the default behavior, but it doesn't seem to be a problem with LibGEOS, but rather that explicit pointers are needed when working with GEOS C API and those don't work well with default Distributed processing behavior. Would you agree with that assessment?

Here is a discourse question of someone else having the same problem: https://discourse.julialang.org/t/passing-pointers-to-workers-process-results-in-null-pointers/73418

...when a pointer Ptr{} is sent to a remote process, it is always set to NULL - 0x000000000000000, which makes the programs crash. Does anybody have a workaround for this issue?

I don't understand the work around suggested here, but maybe someone in the community has found a solution. I will ask on the slack.

skygering commented 1 year ago

After asking on Slack, I realized that it isn't possible to share pointers between processes because they don't share the same memory space. This is why they are set to NULL. Since the LibGEOS objects are just pointers, I don't see a way to fix this problem. Maybe others have ideas though?

When using @distributed, it would seem that we need to make the polygons within each process. Since we concluded above that we need a copy of each geometry per each thread, or we need to make the geometries within threads, it seems that there is no way to parallelize LibGEOS operations on a pre-existing set of geometries that is shared or passed between threads/processes.

@yeesian Do you know if ArchGDAL.jl has these problems? It seems similarly setup so I would assume it has the same distributed problems due to the pointers, but do you know if it is thread-safe? ArchGDAL and LibGEOS seem to have very different uses though.

nraynaud commented 1 year ago

@skygering to confirm what people told you, you can also share slabs of memory between processes, but the objects have to be allocated in the slab (which certainly doesn't happen by a random malloc in the middle of an algorithm, allocators would have to be passed around), and you need some kind of policy as to what the processes will do with the memory.

The path to distributed computing with libGEOS seems to be quite narrow. I guess the ideal application with those constraints would have a serialization/deserialization time significantly smaller than its computiing time. Which would probably work as well with arrays of coordinates sent over the socket and a completely thread oblivious GEOS.

I'm sorry. You're not alone, tho, I looked into the issue because my work wanted to use multithreading with LibGEOS. We'll have to add some significant checks to ensure that objects don't cross the wrong context (including probably storing the context next to the pointer and checking it in the finalizer).

yeesian commented 1 year ago

Do you know if ArchGDAL.jl has these problems? It seems similarly setup so I would assume it has the same distributed problems due to the pointers, but do you know if it is thread-safe? ArchGDAL and LibGEOS seem to have very different uses though.

Yeah ArchGDAL (and GDAL more generally) is most often used for reading / writing geospatial data files. It does come compiled with GEOS support, but it won't be superseding what is achievable with GEOS. If the question is about the thread safety of I/O (and not the geometry operations), it has caveats based on GDAL limitations.

(I guess the more directly useful answer is: "As most C/C++ libraries, an object is not thread-safe unless it is explicitly mentioned it is".)