apache / arrow

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
https://arrow.apache.org/
Apache License 2.0

[C++][Gandiva] Re-enable Gandiva JNI tests and fix Travis CI failure #22877

Closed: asfimport closed this issue 5 years ago

asfimport commented 5 years ago

This seems to happen more or less frequently on the Python - Java build (with jpype enabled). See the warnings and errors starting at https://travis-ci.org/apache/arrow/jobs/583069089#L6662

Additional info:

The JVM crash happens on Ubuntu 16.04 when the C++ library is built with the mimalloc allocator instead of jemalloc. Below is the stack trace from the core dump:

(gdb) bt
#0 0x00007fbb13ed3428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1 0x00007fbb13ed502a in __GI_abort () at abort.c:89
#2 0x00007fbb131d7149 in ?? () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#3 0x00007fbb1338ad27 in ?? () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#4 0x00007fbb131e0e4f in JVM_handle_linux_signal () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#5 0x00007fbb131d3e48 in ?? () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#6 <signal handler called>
#7 mi_page_free_list_extend (heap=0x0, page=0x7fbb133de221, extend=140440661634032, stats=0x7fbae3bfac00) at /home/prudhvi/arrow/cpp-build/mimalloc_ep-prefix/src/mimalloc_ep/src/page.c:449
#8 0x00007fbaaedff652 in _mi_segment_page_of (segment=0x7fbaaedff652 <_mi_segment_page_of+18>, p=0x7fbae3bfab30) at /home/prudhvi/arrow/cpp-build/mimalloc_ep-prefix/src/mimalloc_ep/include/mimalloc-internal.h:232
#9 0x00007fbaaedff7bb in mi_heap_malloc_zero_aligned_at (heap=0x7fbaaedff652 <_mi_segment_page_of+18>, size=140440661633840, alignment=140439800379296, offset=139646092684112, zero=187) at /home/prudhvi/arrow/cpp-build/mimalloc_ep-prefix/src/mimalloc_ep/src/alloc-aligned.c:31
#10 0x00007fbaaedff7e0 in mi_heap_malloc_zero_aligned_at (heap=0x7fbab069f7a0 <_mi_heap_empty>, size=139642473343568, alignment=140439774558139, offset=140440661633872, zero=186) at /home/prudhvi/arrow/cpp-build/mimalloc_ep-prefix/src/mimalloc_ep/src/alloc-aligned.c:33
#11 0x00007fbaaee00941 in mi_option_init (desc=0x7fbaaedff652 <_mi_segment_page_of+18>) at /home/prudhvi/arrow/cpp-build/mimalloc_ep-prefix/src/mimalloc_ep/src/options.c:204
#12 0x00007fbb13ed7ff8 in run_exit_handlers (status=1, listp=0x7fbb142625f8 <exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:82
#13 0x00007fbb13ed8045 in __GI_exit (status=) at exit.c:104
#14 0x00007fbb12f76a7c in ?? () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#15 0x00007fbb13391587 in ?? () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#16 0x00007fbb1338ede7 in ?? () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#17 0x00007fbb133900cf in ?? () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#18 0x00007fbb133905f2 in ?? () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#19 0x00007fbb131d6102 in ?? () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#20 0x00007fbb1386a6ba in start_thread (arg=0x7fbae3bfb700) at pthread_create.c:333
#21 0x00007fbb13fa541d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Reporter: Antoine Pitrou / @pitrou Assignee: Prudhvi Porandla / @pprudhvi


Note: This issue was originally created as ARROW-6509. Please see the migration documentation for further details.

asfimport commented 5 years ago

Antoine Pitrou / @pitrou: @xhochy

asfimport commented 5 years ago

Antoine Pitrou / @pitrou: [~fan_li_ya]

asfimport commented 5 years ago

Uwe Korn / @xhochy: We should simply skip the tests in the Java build that is done as part of the Python job. They consume precious runtime for something that is already tested in another job.

asfimport commented 5 years ago

Antoine Pitrou / @pitrou: It seems like that is testing precisely the Gandiva Java bridge? Is it tested in another job too?

asfimport commented 5 years ago

Uwe Korn / @xhochy: Oh, I wasn't aware of that :( I only thought of pyarrow.jvm

asfimport commented 5 years ago

Wes McKinney / @wesm: Issue resolved by pull request 5370 https://github.com/apache/arrow/pull/5370

asfimport commented 5 years ago

Jacques Nadeau / @jacques-n: @pprudhvi, per discussion offline, can you look to solve this? 

asfimport commented 5 years ago

Prudhvi Porandla / @pprudhvi: @jacques-n Yes, I'm working on it

asfimport commented 5 years ago

Wes McKinney / @wesm: Issue resolved by pull request 5417 https://github.com/apache/arrow/pull/5417