JuliaGeometry / Triangulate.jl

Julia Wrapper for the Triangle Mesh Generator
MIT License
38 stars 11 forks source link

Read-only memory error on Windows #15

Closed simonp0420 closed 2 years ago

simonp0420 commented 2 years ago

I am frequently encountering this error when I run Triangulate on Windows. I've seen it on both Julia 1.7.x and 1.8.x. The same cases on Linux work fine. I notice that this occurred during one of your CI tests (see here) and I wanted you to know that this isn't just an isolated example.

j-fu commented 2 years ago

Very strange, I've never seen this before (that is why I didn't pay attention after merging the PR which was unrelated to code). Re-ran the CI test and it does not occur...

May be related to memory size issues ? I found https://github.com/JuliaLang/julia/issues/38556 ...

But the example is really small, and the Triangle code is quite lean...

What is your Windows version ? Can you say more about the context ?

j-fu commented 2 years ago

again...

simonp0420 commented 2 years ago

It usually occurs when I run Triangulate more than once. My use case is embedded in code in PSSFSS. I will try to come up with a MWE later today.

j-fu commented 2 years ago

Ok, on windows/nightly, test failure is stable and after activating my Windows VM I can see this on my computer as well now. Tests work on 1.8.2, but fail on 1.9-dev.

While I would like to shift the blame onto some new stuff in Julia, it seems it is still too early for that - TetGen tests work well, and they also use ccall...

Curious about your MWE and behavior on 1.8.2.

Debugging this may take some time - I need to make an instrumented version of Triangle_jll for testing.

j-fu commented 2 years ago

There is at least one culprit: https://github.com/libigl/triangle/pull/1

simonp0420 commented 2 years ago

I'm still working on generating a MWE. It might end up just being a WE, not so M, if you get my drift ;-). Re your culprit: hasn't it already been merged?

simonp0420 commented 2 years ago

I can't seem to create a small example. In fact, I can't seem to create an example that is reliably reproducible on Julia 1.8.2. But for what it's worth, here is the output generated by including two of the files in this gist earlier today (I can't get it to error again, though):

julia> versioninfo()
Julia Version 1.8.2
Commit 36034abf26 (2022-09-29 15:21 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 8 × Intel(R) Core(TM) i7-9700 CPU @ 3.00GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
  Threads: 8 on 8 virtual cores
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 8

julia> include("joyal2012_hybrid.jl")
Beginning PSSFSS Analysis
Progress: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| Time: 0:00:27
8 knots. Added 20.0 GHz, maxerrdB = -Inf

julia> include("li2009.jl")
L2 = 2.0
L2 = 4.9
L2 = 7.800000000000001
L2 = 10.700000000000001
L2 = 13.600000000000001
L2 = 16.5

julia> include("joyal2012_hybrid.jl")
Beginning PSSFSS Analysis
Progress: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| Time: 0:00:21
8 knots. Added 20.0 GHz, maxerrdB = -Inf

julia> include("li2009.jl")
L2 = 2.0
L2 = 4.9
L2 = 7.800000000000001
L2 = 10.700000000000001
L2 = 13.600000000000001
ERROR: ReadOnlyMemoryError()
Stacktrace:
 [1] triangulate(triangle_switches::String, ctio_in::Triangulate.CTriangulateIO, ctio_out::Triangulate.CTriangulateIO, vor_out::Triangulate.CTriangulateIO)
   @ Triangulate C:\Users\peter\.julia\packages\Triangulate\vFrLw\src\ctriangulateio.jl:125
 [2] triangulate(switches::String, tri_in::Triangulate.TriangulateIO)
   @ Triangulate C:\Users\peter\.julia\packages\Triangulate\vFrLw\src\triangulateio.jl:438
 [3] meshsub(; points::Matrix{Float64}, seglist::Matrix{Int32}, segmarkers::Vector{Int32}, holes::Matrix{Float64}, area::Float64, ntri::Int64, switches::String)
   @ PSSFSS.Meshsub d:\peter\Documents\julia\packages\PSSFSS\src\Meshsub.jl:75
 [4] polyring(; s1::Vector{Int64}, s2::Vector{Int64}, a::Vector{Float64}, b::Vector{Float64}, sides::Int64, ntri::Int64, units::Unitful.FreeUnits{(mm,), 𝐋, nothing} , orient::Int64, kwarg::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ PSSFSS.Elements d:\peter\Documents\julia\packages\PSSFSS\src\Elements.jl:1051
 [5] top-level scope
   @ D:\peter\Documents\julia\packages\PSSFSS\sandbox\memoryerror_demo\li2009.jl:20

Note that I had to include each file twice to manifest the error. I thought that the file in the gist named showerror.jl would reliably exhibit the error, but it doesn't. Perhaps I should edit my original post above and change "frequently" to "occasionally". I do note that my CI run for PSSFSS is consistently showing this error on Julia nightly (1.9).

j-fu commented 2 years ago

Thanks for the WE - surely I understand the drift, no problem with that.

Ok this is consistent with what I see in the moment. There is one critical hack in the Triangle code (acknowledged by Shewchuk in the comments) which may cease to work in the case where a pointer does not fit into an unsigned long.

The libigl PR is one way to solve this in the case where unsigned long is 4 bytes. If that were still the case on Windows IMHO there would have been no way that Triangle_jll.jll worked before.

In the moment, I use the original Triangle with some patches for the Julia wrapper, notably replacing exit() statements with error returns and providing a way to pass a sizing callback written in Julia.

Between 1.8 and 1.9 LLVM was updated from 13.x to 14.x. So there may be a different way now to handle bad memory accesses. This is my current hypothesis.

As much as I would like to go on with this immediately, I am bound to other commitments, so it will take a couple of days to solve this.

Can you remember the error message when you had crashes with Julia prior to 1.9 ? Was it ReadOnlyMemoryError ?

simonp0420 commented 2 years ago

Yes, it has consistently been ReadOnlyMemoryError.

j-fu commented 2 years ago

See #16 for attempts fo figure and fix.

simonp0420 commented 2 years ago

I very much appreciate your efforts to fix this. I wish I could help, but I'm completely ignorant of the C language.

j-fu commented 2 years ago

... well, I depend on Triangle working on windows, too - some of my students are on windows.

In fact it seems that the problem is indeed the one I mentioned as first possible culprit. It is puzzling now the it ever worked on windows, and it is surprising that there were only occasional crashes with earlier versions of Julia...

Will need some time for proper testing and making a PR to Yggdrasil (the repo for the binarybuilder recipes).

j-fu commented 2 years ago

I think it has been solved by #16 . Started registration of version 2.2.0 which shall show up in approx. 30min.

j-fu commented 2 years ago

Ok should be there now.

simonp0420 commented 2 years ago

Indeed! CI succeeds now on nightly and I'm unable to manifest the error on Julia 1.8. Thanks!!