
Fatal error when bulk inserting data #1466

Closed · tkovis closed this issue 3 years ago

tkovis commented 3 years ago

I've been trying to find a smart way to bulk insert a dataset of 200k entries, where each entry contains information about 4-8 entities. I'm checking unique constraints both before the transaction (crux/q + a filter if the constraint fails) and in a transaction function.
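For context, the transaction-function half of that check looks roughly like the sketch below (simplified, not my actual code; the :ensure-unique-string id and the :string attribute query are just illustrative):

;; Simplified sketch (illustrative only): a transaction function that aborts
;; the transaction when another entity already has the same :string value.
(crux/submit-tx node
  [[:crux.tx/put
    {:crux.db/id :ensure-unique-string
     :crux.db/fn '(fn [ctx doc]
                    (let [db      (crux.api/db ctx)
                          clashes (crux.api/q db
                                              '{:find  [e]
                                                :in    [s]
                                                :where [[e :string s]]}
                                              (:string doc))]
                      (if (empty? clashes)
                        [[:crux.tx/put doc]]
                        false)))}]])  ; returning false aborts the transaction

;; each document then goes through the function instead of a plain put:
;; [:crux.tx/fn :ensure-unique-string {:crux.db/id (java.util.UUID/randomUUID) :string "..."}]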

; resources/config.edn
{:crux/index-store {:kv-store {:crux/module crux.rocksdb/->kv-store
                               :db-dir "/tmp/rocksdb/index"}}
 :crux/document-store {:kv-store {:crux/module crux.rocksdb/->kv-store
                                  :db-dir "/tmp/rocksdb/document"}}
 :crux/tx-log {:kv-store {:crux/module crux.rocksdb/->kv-store
                          :db-dir "/tmp/rocksdb/tx"}}
 :crux.lucene/lucene-store {:db-dir "/tmp/lucene"}}

; deps.edn
{:paths ["src" "resources"]
 :deps {juxt/crux-core {:mvn/version "21.02-1.15.0-beta"}
        juxt/crux-rocksdb {:mvn/version "21.02-1.15.0-beta"}
        juxt/crux-lucene {:mvn/version "21.02-1.15.0-alpha"}
        org.clojure/data.csv {:mvn/version "1.0.0"}
        io.randomseed/phone-number {:mvn/version "8.12.16-1"}
        ch.qos.logback/logback-classic {:mvn/version "1.2.3"}}
 :aliases
 {:socket
  {:jvm-opts ["-Dclojure.server.repl={:address,\"0.0.0.0\",:port,5555,:accept,clojure.core.server/repl}"]}}}

(ns tutorial.repro
  (:require
   [crux.api :as crux]
   [clojure.java.io :as io]))

(with-open [node (crux/start-node (io/file "resources/config.edn"))]
  (doall
   (->> (range 200000)
        (map (fn [_] [:crux.tx/put {:crux.db/id (java.util.UUID/randomUUID)
                                    :string (.toString (java.util.UUID/randomUUID))}]))
        (partition-all 10000)
        (map #(crux/submit-tx node %)))))
=> (#:crux.tx{:tx-id 141, :tx-time #inst "2021-03-19T08:40:42.390-00:00"} #:crux.tx{:tx-id 142, :tx-time #inst "2021-03-19T08:40:42.813-00:00"} #:crux.tx{:tx-id 143, :tx-time #inst "2021-03-19T08:40:43.216-00:00"} #:crux.tx{:tx-id 144, :tx-time #inst "2021-03-19T08:40:43.557-00:00"} #:crux.tx{:tx-id 145, :tx-time #inst "2021-03-19T08:40:43.861-00:00"} #:crux.tx{:tx-id 146, :tx-time #inst "2021-03-19T08:40:44.083-00:00"} #:crux.tx{:tx-id 147, :tx-time #inst "2021-03-19T08:40:44.271-00:00"} #:crux.tx{:tx-id 148, :tx-time #inst "2021-03-19T08:40:44.427-00:00"} #:crux.tx{:tx-id 149, :tx-time #inst "2021-03-19T08:40:44.613-00:00"} #:crux.tx{:tx-id 150, :tx-time #inst "2021-03-19T08:40:44.838-00:00"} #:crux.tx{:tx-id 151, :tx-time #inst "2021-03-19T08:40:45.058-00:00"} #:crux.tx{:tx-id 152, :tx-time #inst "2021-03-19T08:40:45.254-00:00"} #:crux.tx{:tx-id 153, :tx-time #inst "2021-03-19T08:40:45.430-00:00"} #:crux.tx{:tx-id 154, :tx-time #inst "2021-03-19T08:40:45.617-00:00"} #:crux.tx{:tx-id 155, :tx-time #inst "2021-03-19T08:40:45.793-00:00"} #:crux.tx{:tx-id 156, :tx-time #inst "2021-03-19T08:40:46.001-00:00"} #:crux.tx{:tx-id 157, :tx-time #inst "2021-03-19T08:40:46.137-00:00"} #:crux.tx{:tx-id 158, :tx-time #inst "2021-03-19T08:40:46.272-00:00"} #:crux.tx{:tx-id 159, :tx-time #inst "2021-03-19T08:40:46.407-00:00"} #:crux.tx{:tx-id 160, :tx-time #inst "2021-03-19T08:40:46.542-00:00"})
10:36:32.041 [nREPL-session-5dd7dec5-f159-47d0-b3a6-b89083473d23] DEBUG crux.memory - :pool-allocation-stats {:allocated 101842944, :deallocated 3145728, :in-use 98697216}
10:36:32.044 [nREPL-session-5dd7dec5-f159-47d0-b3a6-b89083473d23] DEBUG crux.memory - :pool-allocation-stats {:allocated 101974016, :deallocated 3145728, :in-use 98828288}
10:36:32.047 [nREPL-session-5dd7dec5-f159-47d0-b3a6-b89083473d23] DEBUG crux.memory - :pool-allocation-stats {:allocated 102105088, :deallocated 3145728, :in-use 98959360}
10:36:37.066 [nREPL-session-5dd7dec5-f159-47d0-b3a6-b89083473d23] WARN crux.kv.tx-log - waited 5s for tx-ingest-executor to exit, no dice.
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000000138cc6e3d, pid=30370, tid=86039
#
# JRE version: OpenJDK Runtime Environment AdoptOpenJDK (11.0.10+9) (build 11.0.10+9)
# Java VM: OpenJDK 64-Bit Server VM AdoptOpenJDK (11.0.10+9, mixed mode, tiered, compressed oops, g1 gc, bsd-amd64)
# Problematic frame:
# C  [librocksdbjni-osx.jnilib+0xa4e3d]  rocksdb::DBImpl::GetImpl(rocksdb::ReadOptions const&, rocksdb::Slice const&, rocksdb::DBImpl::GetImplOptions&)+0x16d
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /Users/tuomaskoivistoinen/git/crux/hs_err_pid30370.log
[thread 34819 also had an error]
#
# If you would like to submit a bug report, please visit:
#   https://github.com/AdoptOpenJDK/openjdk-support/issues
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
Jack-in process exited. Status: null

Attached crash reports: hs_err_pid27361.log, hs_err_pid30052.log, hs_err_pid30102.log, hs_err_pid30194.log, hs_err_pid30309.log, hs_err_pid30370.log

I also sometimes got:

10:51:40.331 [crux-standalone-tx-ingest-1] DEBUG crux.memory - :pool-allocation-stats {:allocated 125698048, :deallocated 917504, :in-use 124780544}
pthread lock: Invalid argument
Jack-in process exited. Status: null

I get the same errors without partitioning and with an empty db:

(ns tutorial.repro
  (:require
   [crux.api :as crux]
   [clojure.java.io :as io]))

(with-open [node (crux/start-node (io/file "resources/config.edn"))]
  (doall
   (->> (range 200000)
        (map (fn [_] [:crux.tx/put {:crux.db/id (java.util.UUID/randomUUID)
                                    :string (.toString (java.util.UUID/randomUUID))}]))
        (crux/submit-tx node))))
=> #:crux.tx{:tx-id 0, :tx-time #inst "2021-03-19T08:43:36.184-00:00"}

Playing around with different ranges (10k, 20k, 30k, ...), at 50k I could also produce the following without a fatal exception:

10:53:48.733 [nREPL-session-83693770-d2e8-4ca8-af57-4f34050a2f58] WARN crux.kv.tx-log - waited 5s for tx-ingest-executor to exit, no dice.
Exception in thread "Lucene Merge Thread #0" org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.store.AlreadyClosedException: this Directory is closed
        at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:703)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:694)
Caused by: org.apache.lucene.store.AlreadyClosedException: this Directory is closed
        at org.apache.lucene.store.BaseDirectory.ensureOpen(BaseDirectory.java:51)
        at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:247)
        at org.apache.lucene.store.LockValidatingDirectoryWrapper.createOutput(LockValidatingDirectoryWrapper.java:44)
        at org.apache.lucene.index.ConcurrentMergeScheduler$1.createOutput(ConcurrentMergeScheduler.java:289)
        at org.apache.lucene.store.TrackingDirectoryWrapper.createOutput(TrackingDirectoryWrapper.java:43)
        at org.apache.lucene.codecs.lucene60.Lucene60FieldInfosFormat.write(Lucene60FieldInfosFormat.java:272)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:184)
        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4586)
        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4190)
        at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:5741)
        at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:624)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:682)

but after the previous error, inserting just 10 documents resulted in:

10:54:50.067 [nREPL-session-83693770-d2e8-4ca8-af57-4f34050a2f58] WARN crux.kv.tx-log - waited 5s for tx-ingest-executor to exit, no dice.
Exception in thread "Lucene Merge Thread #0" org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.store.AlreadyClosedException: this Directory is closed
        at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:703)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:694)
Caused by: org.apache.lucene.store.AlreadyClosedException: this Directory is closed
        at org.apache.lucene.store.BaseDirectory.ensureOpen(BaseDirectory.java:51)
        at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:247)
        at org.apache.lucene.store.LockValidatingDirectoryWrapper.createOutput(LockValidatingDirectoryWrapper.java:44)
        at org.apache.lucene.index.ConcurrentMergeScheduler$1.createOutput(ConcurrentMergeScheduler.java:289)
        at org.apache.lucene.store.TrackingDirectoryWrapper.createOutput(TrackingDirectoryWrapper.java:43)
        at org.apache.lucene.codecs.lucene60.Lucene60FieldInfosFormat.write(Lucene60FieldInfosFormat.java:272)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:184)
        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4586)
        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4190)
        at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:5741)
        at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:624)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:682)
10:54:51.192 [crux-standalone-tx-ingest-1] DEBUG crux.tx - Indexing tx-id: 6
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x000000013bccf416, pid=30802, tid=48203
#
# JRE version: OpenJDK Runtime Environment AdoptOpenJDK (11.0.10+9) (build 11.0.10+9)
# Java VM: OpenJDK 64-Bit Server VM AdoptOpenJDK (11.0.10+9, mixed mode, tiered, compressed oops, g1 gc, bsd-amd64)
# Problematic frame:
# C  [librocksdbjni-osx.jnilib+0x1d416]  Java_org_rocksdb_RocksDB_getSnapshot+0x6
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /Users/tuomaskoivistoinen/git/crux/hs_err_pid30802.log
#
# If you would like to submit a bug report, please visit:
#   https://github.com/AdoptOpenJDK/openjdk-support/issues
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
Jack-in process exited. Status: null

With further testing I found that everything is OK if you use await-tx:

(ns tutorial.repro
  (:require
   [crux.api :as crux]
   [clojure.java.io :as io]))

(time ; indexes everything without errors in 30s
 (with-open [node (crux/start-node (io/file "resources/config.edn"))]
   (doall
    (->> (range 200000)
         (map (fn [_] [:crux.tx/put {:crux.db/id (java.util.UUID/randomUUID)
                                     :string (.toString (java.util.UUID/randomUUID))}]))
         (crux/submit-tx node)
         (crux/await-tx node)))))

Not sure if this is a real issue anymore.

jarohen commented 3 years ago

Hey @tkovis, thanks for the report :pray: This looks like a duplicate of #1316, which was fixed in #1445 - the fix will be included in the next release. In the meantime, in your case (because you have a fixed set of transactions), an await-tx at the end is indeed a reasonable workaround.

Unrelated - generally speaking, we tend to find that transactions of around ~1k documents perform best in Crux. It obviously depends hugely on the content of the documents, but 1k seems a good rule of thumb - above that there appear to be diminishing returns. You then only need to await the final tx, so that you're not waiting for Crux to catch up after every transaction.
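Something along these lines (just a rough sketch reusing the node and requires from your repro above; the exact batch size isn't critical):

;; Rough sketch: submit ~1k-document transactions, await only the last one.
(with-open [node (crux/start-node (io/file "resources/config.edn"))]
  (let [submitted (->> (range 200000)
                       (map (fn [_] [:crux.tx/put {:crux.db/id (java.util.UUID/randomUUID)
                                                   :string (str (java.util.UUID/randomUUID))}]))
                       (partition-all 1000)
                       (mapv #(crux/submit-tx node %)))]  ; mapv forces submission eagerly
    (crux/await-tx node (last submitted))))               ; block until the final tx is indexed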

HTH!

James

refset commented 3 years ago

From experience I suspect the "problematic frame" Java_org_rocksdb_RocksDB_getSnapshot is indicative of a native out-of-memory event. Adding await-tx will certainly reduce the overall memory pressure and avoid that possibility.

Edit: I may well be mistaken :)

jarohen commented 3 years ago

@refset it's not an OOM event - the segfault happens because getSnapshot was being called on a RocksDB instance that had already been closed (the node is shut down while the tx-ingester is still indexing in the background).