visr / LasIO.jl

Julia package for reading and writing the LAS lidar format.

Core dump inside LasIO #30

Closed greghislop closed 4 years ago

greghislop commented 4 years ago

First of all, thank you for all your work on what is a useful package! I'm using it fairly intensively. Unfortunately I'm getting an intermittent core dump that occurs within the package during garbage collection. An example stack trace is below. I'm running Julia 1.2.0 on an Ubuntu 14.04 machine. In pseudocode, I'm doing the following within a Julia program:

  1. create an empty point cloud pc
  2. Loop through the below for up to 9 point cloud files
    • download from cloud a laz version 1.4 file
    • use lastools to downgrade to version 1.3
    • load las file using LasIO
    • extract a small section of the points and add to pc
  3. calculate some features from pc

Repeat 1-3 many times for different laz files.
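
The loop in steps 1-2 could be sketched in Julia roughly as follows. This is only an illustration under assumptions: `collect_points`, `load_points`, and `keep` are hypothetical names that are not part of LasIO, and the download and lastools downgrade steps are left out.

```julia
# Hypothetical sketch of steps 1-2 above; none of these names come from LasIO.
# `load_points` is any function that maps a file path to a vector of points,
# and `keep` selects the small section of points to retain.
function collect_points(load_points, paths; keep = p -> true, max_files = 9)
    pc = Any[]                                   # step 1: empty point cloud
    for path in Iterators.take(paths, max_files) # step 2: up to 9 files
        points = load_points(path)               # load one file
        append!(pc, filter(keep, points))        # keep a small section
    end
    return pc
end
```

With LasIO, `load_points` could be something like `path -> last(load(path))` after `using FileIO, LasIO`, since `load` returns a header together with a vector of points.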

The code may run for thousands of iterations without failure and then suddenly fail when loading a laz file, or it may fail on the first iteration. If it fails once and I restart the program, the chance of failure increases. The longer it runs, the less likely a failure becomes.

Any help you might be able to give would be appreciated. Unfortunately to date I've been unable to reproduce this error in a reliable fashion.

GC error (probable corruption) :
Allocations: 256134561 (Pool: 256090359; Big: 44202); GC: 387

!!! ERROR in jl_ -- ABORTING !!!
0x7f43ceb97010: r-- Stack frame 0x7f43ce994568 -- 16 of 159 (direct)
0x7f43ceb97038: `- Array in object 0x7f43eceed960 :: 0x7f43ef36b553 -- [0x75682a8, 0x756a0a0) of type Array{String, 2}

signal (6): Aborted
in expression starting at REPL[3]:1
__libc_signal_restore_set at /build/glibc-OTsEL5/glibc-2.27/signal/../sysdeps/unix/sysv/linux/nptl-signals.h:80 [inlined]
raise at /build/glibc-OTsEL5/glibc-2.27/signal/../sysdeps/unix/sysv/linux/raise.c:48
abort at /build/glibc-OTsEL5/glibc-2.27/stdlib/abort.c:79
gc_assert_datatype_fail at /buildworker/worker/package_linux64/build/src/gc.c:1531
gc_mark_loop at /buildworker/worker/package_linux64/build/src/gc.c:2419
_jl_gc_collect at /buildworker/worker/package_linux64/build/src/gc.c:2729
jl_gc_collect at /buildworker/worker/package_linux64/build/src/gc.c:2912
jl_gc_pool_alloc at /buildworker/worker/package_linux64/build/src/gc.c:1111
read at ./refvalue.jl:8 [inlined]
read at ./none:0
unknown function (ip: 0x7f43c6e40b90)
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2191
load at /home/greg/.julia/packages/LasIO/TCFYi/src/fileio.jl:36
9 at /home/greg/.julia/packages/LasIO/TCFYi/src/fileio.jl:65 [inlined]
open at ./process.jl:681
load at /home/greg/.julia/packages/LasIO/TCFYi/src/fileio.jl:64
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2191
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1614 [inlined]
jl_fapply at /buildworker/worker/package_linux64/build/src/builtins.c:563
jl_fapply_latest at /buildworker/worker/package_linux64/build/src/builtins.c:601
invokelatest#1 at ./essentials.jl:790 [inlined]
invokelatest at ./essentials.jl:789 [inlined]
load#27 at /home/greg/.julia/packages/FileIO/I1ONY/src/loadsave.jl:184
load at /home/greg/.julia/packages/FileIO/I1ONY/src/loadsave.jl:169
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2191
load#13 at /home/greg/.julia/packages/FileIO/I1ONY/src/loadsave.jl:118
load at /home/greg/.julia/packages/FileIO/I1ONY/src/loadsave.jl:118 [inlined]
load_las#91 at /home/greg/.julia/packages/RoamesGeometry/V2mGP/src/pointcloud_io.jl:261
load_las at ./none:0 [inlined]
load_pointcloud#66 at /home/greg/.julia/packages/RoamesGeometry/V2mGP/src/pointcloud_io.jl:7
load_pointcloud at ./none:0 [inlined]
macro expansion at /home/greg/CatenaryStatistics/src/Utils.jl:183 [inlined]
macro expansion at /home/greg/.julia/packages/Retry/0jMye/src/repeat_try.jl:192 [inlined]
get_pointcloud at /home/greg/CatenaryStatistics/src/Utils.jl:181
processTile at /home/greg/CatenaryStatistics/src/points_on_catenaries.jl:280
unknown function (ip: 0x7f43c6e1a2a3)
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2191
processCampaign at /home/greg/CatenaryStatistics/src/points_on_catenaries.jl:406
processCampaign at /home/greg/CatenaryStatistics/src/points_on_catenaries.jl:328 [inlined]
processCampaign at /home/greg/CatenaryStatistics/src/points_on_catenaries.jl:328
unknown function (ip: 0x7f43c6de2fac)
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2197
do_call at /buildworker/worker/package_linux64/build/src/interpreter.c:323
eval_value at /buildworker/worker/package_linux64/build/src/interpreter.c:411
eval_stmt_value at /buildworker/worker/package_linux64/build/src/interpreter.c:362 [inlined]
eval_body at /buildworker/worker/package_linux64/build/src/interpreter.c:772
jl_interpret_toplevel_thunk_callback at /buildworker/worker/package_linux64/build/src/interpreter.c:884
unknown function (ip: 0xfffffffffffffffe)
unknown function (ip: 0x7f43ecad6b8f)
unknown function (ip: (nil))
jl_interpret_toplevel_thunk at /buildworker/worker/package_linux64/build/src/interpreter.c:893
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:815
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:764
jl_toplevel_eval_in at /buildworker/worker/package_linux64/build/src/toplevel.c:844
eval at ./boot.jl:330
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2191
eval_user_input at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.2/REPL/src/REPL.jl:86
run_backend at /home/greg/.julia/packages/Revise/0KQ7U/src/Revise.jl:1033
85 at ./task.jl:268
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2197
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1614 [inlined]
start_task at /buildworker/worker/package_linux64/build/src/task.c:596
unknown function (ip: 0xffffffffffffffff)
Allocations: 256134561 (Pool: 256090359; Big: 44202); GC: 387
Aborted (core dumped)

c42f commented 4 years ago

Hi Greg. The message here and the stack trace indicate memory corruption. It looks like the GC is trying to collect a value whose type tag is corrupted and points to invalid memory. But this doesn't give a lot to go on, and it could have nothing to do with LasIO.

If you could provide a complete program and associated data someone might be able to help. Especially if you can restrict the program to only using LasIO. Another option is to try with a more recent julia version (1.3 or 1.4 RC1) and see whether the problem persists. You never know, it could be a julia bug which has been fixed since 1.2.
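
As one possible starting point for such a stripped-down program, here is a minimal stress-loop sketch. It rests on assumptions: `stress_load` is a hypothetical helper, not part of LasIO or Julia, and forcing a collection with `GC.gc()` after every call is just a common way to try to make heap corruption show up closer to the call that caused it, with no guarantee that it will.

```julia
# Hypothetical helper (not from LasIO): call `f` on every path for `rounds`
# passes and force a full garbage collection after each call.
function stress_load(f, paths; rounds = 100)
    calls = 0
    for _ in 1:rounds, path in paths
        f(path)
        GC.gc()      # force a full collection right after each load
        calls += 1
    end
    return calls     # number of calls that completed
end
```

For a LasIO-only test, `f` could be `load` from FileIO applied to a handful of local files, with everything else in the original program removed.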

greghislop commented 4 years ago

Thanks Chris. That's basically the answer I expected. I'm already trying to narrow this down to a code base I can share (it takes a while, as errors can be really rare at times), and I agree there is a good chance it's not LasIO. I didn't think of updating the Julia version; to be honest, I thought I was on the latest, so I'll try that too.

c42f commented 4 years ago

Cool, maybe we close this then. If you have a really minimal reproduction you could open an issue in julialang/julia or for something larger maybe try discourse.

Of course if it turns out to involve LasIO please do report back here.

greghislop commented 4 years ago

Thanks Chris

greghislop commented 4 years ago

Problem solved, and yes, it was unrelated to LasIO. Sorry for the distraction, and thanks Chris. https://discourse.julialang.org/t/is-there-a-way-to-find-out-free-disk-space-equivalent-to-psutil-disk-usage-in-python/16918/2