opendistro-for-elasticsearch / k-NN

🆕 A machine learning plugin which supports an approximate k-NN search algorithm for Open Distro.
https://opendistro.github.io/
Apache License 2.0

Main PID: 22358 (code=killed, signal=ABRT) #172

Closed pawanm09 closed 4 years ago

pawanm09 commented 4 years ago

I tried the latest Amazon AMI version: Open Distro for Elasticsearch-1.9.0-07/09/20-20.35.18 (ami-005fc30a1794cc9f4).

I created a new index, and everything was fine until I started indexing documents.

After indexing the first 15 docs, I got this error:

elasticsearch.service - Elasticsearch
   Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; disabled; vendor preset: disabled)
   Active: failed (Result: signal) since Mon 2020-07-20 20:54:23 UTC; 32min ago
     Docs: https://www.elastic.co
  Process: 22358 ExecStart=/usr/share/elasticsearch/bin/systemd-entrypoint -p ${PID_DIR}/elasticsearch.pid --quiet (code=killed, signal=ABRT)
 Main PID: 22358 (code=killed, signal=ABRT)

Logs of /var/log/elasticsearch/hs_err_pid22358.log:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGILL (0x4) at pc=0x00007f3f841b11f1, pid=22358, tid=23335
#
# JRE version: OpenJDK Runtime Environment AdoptOpenJDK (14.0.1+7) (build 14.0.1+7)
# Java VM: OpenJDK 64-Bit Server VM AdoptOpenJDK (14.0.1+7, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# C  [libKNNIndexV1_7_3_6.so+0x1381f1]  similarity::Hnsw<float>::CreateIndex(similarity::AnyParams const&)+0x5f1
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
#   https://github.com/AdoptOpenJDK/openjdk-support/issues
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.

---------------  S U M M A R Y ------------

Command Line: -Xshare:auto -Des.networkaddress.cache.ttl=60 -Des.networkaddress.cache.negative.ttl=10 -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -XX:+ShowCodeDetailsInExceptionMessages -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dio.netty.allocator.numDirectArenas=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Djava.locale.providers=SPI,COMPAT -Xms1g -Xmx1g -XX:+UseG1GC -XX:G1ReservePercent=25 -XX:InitiatingHeapOccupancyPercent=30 -Djava.io.tmpdir=/tmp/elasticsearch-10239003110006866815 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/lib/elasticsearch -XX:ErrorFile=/var/log/elasticsearch/hs_err_pid%p.log -Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m -Dclk.tck=100 -Djdk.attach.allowAttachSelf=true -Djava.security.policy=file:///usr/share/elasticsearch/plugins/opendistro_performance_analyzer/pa_config/es_security.policy -XX:MaxDirectMemorySize=536870912 -Des.path.home=/usr/share/elasticsearch -Des.path.conf=/etc/elasticsearch -Des.distribution.flavor=oss -Des.distribution.type=rpm -Des.bundled_jdk=true org.elasticsearch.bootstrap.Elasticsearch -p /var/run/elasticsearch/elasticsearch.pid --quiet

Host: Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz, 1 cores, 1G, Amazon Linux release 2 (Karoo)
Time: Mon Jul 20 20:54:23 2020 UTC elapsed time: 22 seconds (0d 0h 0m 22s)

---------------  T H R E A D  ---------------

Current thread (0x00007f3fa0464800):  JavaThread "elasticsearch[node-1][generic][T#4]" daemon [_thread_in_native, id=23335, stack(0x00007f3f88626000,0x00007f3f88727000)]

Stack: [0x00007f3f88626000,0x00007f3f88727000],  sp=0x00007f3f88720f80,  free space=1003k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libKNNIndexV1_7_3_6.so+0x1381f1]  similarity::Hnsw<float>::CreateIndex(similarity::AnyParams const&)+0x5f1

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  com.amazon.opendistroforelasticsearch.knn.index.v1736.KNNIndex.saveIndex([I[[FLjava/lang/String;[Ljava/lang/String;Ljava/lang/String;)V+0
j  com.amazon.opendistroforelasticsearch.knn.index.codec.KNN80Codec.KNN80DocValuesConsumer$1.run()Ljava/lang/Void;+26
j  com.amazon.opendistroforelasticsearch.knn.index.codec.KNN80Codec.KNN80DocValuesConsumer$1.run()Ljava/lang/Object;+1
J 530 c1 java.security.AccessController.doPrivileged(Ljava/security/PrivilegedAction;)Ljava/lang/Object; java.base@14.0.1 (9 bytes) @ 0x00007f3fb824aacc [0x00007f3fb824a960+0x000000000000016c]
j  com.amazon.opendistroforelasticsearch.knn.index.codec.KNN80Codec.KNN80DocValuesConsumer.addKNNBinaryField(Lorg/apache/lucene/index/FieldInfo;Lorg/apache/lucene/codecs/DocValuesProducer;)V+269
j  com.amazon.opendistroforelasticsearch.knn.index.codec.KNN80Codec.KNN80DocValuesConsumer.addBinaryField(Lorg/apache/lucene/index/FieldInfo;Lorg/apache/lucene/codecs/DocValuesProducer;)V+12
j  org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.addBinaryField(Lorg/apache/lucene/index/FieldInfo;Lorg/apache/lucene/codecs/DocValuesProducer;)V+7
j  org.apache.lucene.index.BinaryDocValuesWriter.flush(Lorg/apache/lucene/index/SegmentWriteState;Lorg/apache/lucene/index/Sorter$DocMap;Lorg/apache/lucene/codecs/DocValuesConsumer;)V+86
j  org.apache.lucene.index.DefaultIndexingChain.writeDocValues(Lorg/apache/lucene/index/SegmentWriteState;Lorg/apache/lucene/index/Sorter$DocMap;)V+177
j  org.apache.lucene.index.DefaultIndexingChain.flush(Lorg/apache/lucene/index/SegmentWriteState;)Lorg/apache/lucene/index/Sorter$DocMap;+120
j  org.apache.lucene.index.DocumentsWriterPerThread.flush(Lorg/apache/lucene/index/DocumentsWriter$FlushNotifications;)Lorg/apache/lucene/index/DocumentsWriterPerThread$FlushedSegment;+373
j  org.apache.lucene.index.DocumentsWriter.doFlush(Lorg/apache/lucene/index/DocumentsWriterPerThread;)Z+143
j  org.apache.lucene.index.DocumentsWriter.flushAllThreads()J+152
j  org.apache.lucene.index.IndexWriter.prepareCommitInternal()J+157
j  org.apache.lucene.index.IndexWriter.commitInternal(Lorg/apache/lucene/index/MergePolicy;)J+93
j  org.apache.lucene.index.IndexWriter.commit()J+12
j  org.elasticsearch.index.engine.InternalEngine.commitIndexWriter(Lorg/apache/lucene/index/IndexWriter;Lorg/elasticsearch/index/translog/Translog;Ljava/lang/String;)V+36
j  org.elasticsearch.index.engine.InternalEngine.recoverFromTranslogInternal(Lorg/elasticsearch/index/engine/Engine$TranslogRecoveryRunner;J)V+188
j  org.elasticsearch.index.engine.InternalEngine.recoverFromTranslog(Lorg/elasticsearch/index/engine/Engine$TranslogRecoveryRunner;J)Lorg/elasticsearch/index/engine/InternalEngine;+46
j  org.elasticsearch.index.engine.InternalEngine.recoverFromTranslog(Lorg/elasticsearch/index/engine/Engine$TranslogRecoveryRunner;J)Lorg/elasticsearch/index/engine/Engine;+3
j  org.elasticsearch.index.shard.IndexShard.openEngineAndRecoverFromTranslog()V+110
j  org.elasticsearch.index.shard.StoreRecovery.internalRecoverFromStore(Lorg/elasticsearch/index/shard/IndexShard;)V+384
j  org.elasticsearch.index.shard.StoreRecovery.lambda$recoverFromStore$0(Lorg/elasticsearch/index/shard/IndexShard;)Ljava/lang/Boolean;+14
j  org.elasticsearch.index.shard.StoreRecovery$$Lambda$2961.get()Ljava/lang/Object;+8
j  org.elasticsearch.action.ActionListener.completeWith(Lorg/elasticsearch/action/ActionListener;Lorg/elasticsearch/common/CheckedSupplier;)V+1
j  org.elasticsearch.index.shard.StoreRecovery.recoverFromStore(Lorg/elasticsearch/index/shard/IndexShard;Lorg/elasticsearch/action/ActionListener;)V+79
j  org.elasticsearch.index.shard.IndexShard.recoverFromStore(Lorg/elasticsearch/action/ActionListener;)V+73
j  org.elasticsearch.index.shard.IndexShard$$Lambda$2956.accept(Ljava/lang/Object;)V+8
j  org.elasticsearch.action.ActionRunnable$2.doRun()V+8
j  org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun()V+24
j  org.elasticsearch.common.util.concurrent.AbstractRunnable.run()V+1
j  java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+92 java.base@14.0.1
j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5 java.base@14.0.1
j  java.lang.Thread.run()V+11 java.base@14.0.1
v  ~StubRoutines::call_stub

siginfo: si_signo: 4 (SIGILL), si_code: 2 (ILL_ILLOPN), si_addr: 0x00007f3f841b11f1

---------------  S Y S T E M  ---------------

OS:Amazon Linux release 2 (Karoo)
uname:Linux 4.14.181-140.257.amzn2.x86_64 #1 SMP Wed May 27 02:17:36 UTC 2020 x86_64
OS uptime: 0 days 1:59 hours
libc:glibc 2.26 NPTL 2.26
rlimit: STACK 8192k, CORE 0k, NPROC 4096, NOFILE 65535, AS infinity, DATA infinity, FSIZE infinity
load average:0.70 0.24 0.09

/proc/meminfo:
MemTotal: 2039140 kB
MemFree: 120304 kB
MemAvailable: 364180 kB
Buffers: 1060 kB
Cached: 349796 kB
SwapCached: 0 kB
Active: 1612988 kB
Inactive: 176056 kB
Active(anon): 1438952 kB
Inactive(anon): 316 kB
Active(file): 174036 kB
Inactive(file): 175740 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 1152 kB
Writeback: 0 kB
AnonPages: 1438272 kB
Mapped: 71320 kB
Shmem: 1032 kB
Slab: 69908 kB
SReclaimable: 51408 kB
SUnreclaim: 18500 kB
KernelStack: 3552 kB
PageTables: 8644 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 1019568 kB
Committed_AS: 1690152 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 81920 kB
DirectMap2M: 2015232 kB

/proc/sys/kernel/threads-max (system-wide limit on the number of threads):
15647

/proc/sys/vm/max_map_count (maximum number of memory map areas a process may have):
262144

/proc/sys/kernel/pid_max (system-wide limit on number of process identifiers):
32768

container (cgroup) information:
container_type: cgroupv1
cpu_cpuset_cpus: 0
cpu_memory_nodes: 0
active_processor_count: 1
cpu_quota: no quota
cpu_period: 100000
cpu_shares: no shares
memory_limit_in_bytes: unlimited
memory_and_swap_limit_in_bytes: unlimited
memory_soft_limit_in_bytes: unlimited
memory_usage_in_bytes: 1832062976
memory_max_usage_in_bytes: unlimited

Xen hardware-assisted virtualization detected
Steal ticks since vm start: 2
Steal ticks percentage since vm start:  0.001

CPU:total 1 (initial active 1) (1 cores per cpu, 1 threads per core) family 6 model 63 stepping 2, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, avx, avx2, aes, clmul, erms, lzcnt, tsc, bmi1, bmi2, fma
CPU Model and flags from /proc/cpuinfo:
model name      : Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz
flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm cpuid_fault invpcid_single pti fsgsbase bmi1 avx2 smep bmi2 erms invpcid xsaveopt

Memory: 4k page, physical 2039140k(120304k free), swap 0k(0k free)

vm_info: OpenJDK 64-Bit Server VM (14.0.1+7) for linux-amd64 JRE (14.0.1+7), built on Apr 15 2020 15:16:40 by "jenkins" with gcc 7.5.0

jmazanec15 commented 4 years ago

Hi @pawanm09, sorry about this. What instance type are you using? It could be related to #163.
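
For anyone triaging a similar crash, the two most telling lines in an hs_err file are usually the signal line and the entry under "Problematic frame". A minimal sketch for pulling them out with grep follows; the helper name `hs_err_summary` is made up for illustration, and the path in the usage comment is the one from the report above.

```shell
#!/bin/sh
# hs_err_summary FILE: print the signal line and the "Problematic frame" entry
# from a JVM fatal error log (hs_err_pid*.log).
# Hypothetical helper; it only wraps grep and tail.
hs_err_summary() {
    log="$1"
    # Signal line, e.g. "#  SIGILL (0x4) at pc=..., pid=..., tid=..."
    grep -E '^# +SIG[A-Z]+ ' "$log"
    # The line right after "# Problematic frame:" names the failing library.
    grep -A1 '^# Problematic frame:' "$log" | tail -n 1
}

# Usage (path taken from the report above):
# hs_err_summary /var/log/elasticsearch/hs_err_pid22358.log
```

A frame marked `C` that points into a `.so` file, as in this report, means the crash happened in native code rather than in the JVM itself.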

pawanm09 commented 4 years ago

Hello @jmazanec15

EC2 t2.small

jmazanec15 commented 4 years ago

Yes, that type is impacted by #163
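
A SIGILL like the one in the log above is raised when the CPU is asked to execute an instruction it does not recognize or support, so comparing CPU flags between hosts can help when triaging this kind of crash. The sketch below is only an illustration: `cpu_has_flag` and `report_simd_flags` are made-up helper names, and the flag list is an arbitrary sample, not a statement of what the k-NN library requires.

```shell
#!/bin/sh
# cpu_has_flag FLAG [CPUINFO]: succeed if FLAG is listed on the first "flags"
# line of CPUINFO (default /proc/cpuinfo, Linux only). Hypothetical helper.
cpu_has_flag() {
    flag="$1"
    cpuinfo="${2:-/proc/cpuinfo}"
    # -w matches whole words only, so "avx" does not match "avx2".
    grep -m1 '^flags' "$cpuinfo" | grep -qw -- "$flag"
}

# report_simd_flags [CPUINFO]: print yes/no for a sample of SIMD flags.
# The flag list is an arbitrary example, not a requirement list.
report_simd_flags() {
    for f in sse4_2 avx avx2 avx512f; do
        if cpu_has_flag "$f" "${1:-/proc/cpuinfo}"; then
            echo "$f: yes"
        else
            echo "$f: no"
        fi
    done
}

# Usage on the affected host:
# report_simd_flags
```

Running the same report on the build machine and on the failing host shows at a glance whether the two differ.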

As a workaround on a t2 instance, you can build and install the library from source with these instructions:

# Install the build prerequisites
sudo yum install gcc-c++ cmake git rpm-build -y
# Build against the JDK bundled with Elasticsearch
export JAVA_HOME=/usr/share/elasticsearch/jdk
git clone https://github.com/opendistro-for-elasticsearch/k-NN.git
cd k-NN
git fetch
git checkout opendistro-1.9
# Build the native (JNI) library package on this host
cd jni
cmake .
make package

# Swap the packaged library for the locally built one
sudo yum remove opendistro-knnlib -y
sudo yum localinstall packages/opendistro-knnlib-1.9.0.0-0.1_linux.x86_64.rpm -y
sudo yum install opendistro-knn -y

pawanm09 commented 4 years ago

Hello @jmazanec15

Thanks for the instructions.

Can you also share how to solve this on an Ubuntu machine?

jmazanec15 commented 4 years ago

Sure, try this for a DEB installation on Ubuntu:

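# Install the build prerequisites
sudo apt install cmake g++ git rpm -y
# Build against the JDK bundled with Elasticsearch
export JAVA_HOME=/usr/share/elasticsearch/jdk
git clone https://github.com/opendistro-for-elasticsearch/k-NN.git
cd k-NN
git fetch
git checkout opendistro-1.9
# Build the native (JNI) library package on this host
cd jni
cmake .
make package

# Swap the packaged library for the locally built one
sudo apt remove opendistro-knnlib -y
sudo dpkg -i packages/opendistro-knnlib-1.9.0.0-0.1_linux.x86_64.deb
sudo apt install opendistro-knn -y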

Note: this assumes you have already followed the DEB installation instructions.

Please let us know if it does not work for you.
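
The package filename in the commands above is specific to the 1.9.0.0 build. If a different branch or version is built, a small helper can locate whatever `make package` left in `packages/` instead of hardcoding the name. This is a sketch under the assumption that the output still lands in `packages/` and still starts with `opendistro-knnlib-`; `find_knnlib_pkg` is a made-up name.

```shell
#!/bin/sh
# find_knnlib_pkg DIR EXT: print the path of the single opendistro-knnlib
# package with extension EXT (rpm or deb) in DIR; fail otherwise.
# Hypothetical helper; the filename prefix is assumed from the commands above.
find_knnlib_pkg() {
    dir="$1"
    ext="$2"
    # Expand the glob into the function's positional parameters.
    set -- "$dir"/opendistro-knnlib-*."$ext"
    # An unmatched glob stays literal in sh, so also check the file exists.
    if [ "$#" -ne 1 ] || [ ! -f "$1" ]; then
        echo "expected exactly one .$ext package in $dir" >&2
        return 1
    fi
    printf '%s\n' "$1"
}

# Usage, run from the jni directory after "make package":
# sudo dpkg -i "$(find_knnlib_pkg packages deb)"
# sudo yum localinstall "$(find_knnlib_pkg packages rpm)" -y
```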

pawanm09 commented 4 years ago

Hello @jmazanec15

Both EC2 Linux and Ubuntu are working fine.

Thank you so much!

jmazanec15 commented 4 years ago

No problem!