Morningstar / kafka-offset-monitor

A small web app to monitor the progress of kafka consumers and their lag wrt the log.
Apache License 2.0

Out of Memory exception #1

Closed Rohlik closed 7 years ago

Rohlik commented 7 years ago

Hi, first, thank you for your time and this great app.

Second, I have this problem:

2017-03-14 07:20:42 ERROR OffsetGetterWeb$:103 - Failed to run scheduled task
java.sql.SQLException: [SQLITE_CANTOPEN]  Unable to open the database file (out of memory)
        at org.sqlite.DB.newSQLException(DB.java:383)
        at org.sqlite.DB.newSQLException(DB.java:387)
        at org.sqlite.DB.throwex(DB.java:374)
        at org.sqlite.NativeDB._open(Native Method)
        at org.sqlite.DB.open(DB.java:86)
        at org.sqlite.Conn.open(Conn.java:140)
        at org.sqlite.Conn.<init>(Conn.java:57)
        at org.sqlite.JDBC.createConnection(JDBC.java:77)
        at org.sqlite.JDBC.connect(JDBC.java:64)
        at java.sql.DriverManager.getConnection(DriverManager.java:664)
        at java.sql.DriverManager.getConnection(DriverManager.java:208)
        at scala.slick.jdbc.JdbcBackend$DatabaseFactoryDef$$anon$4.createConnection(JdbcBackend.scala:70)
        at scala.slick.jdbc.JdbcBackend$BaseSession.conn$lzycompute(JdbcBackend.scala:397)
        at scala.slick.jdbc.JdbcBackend$BaseSession.conn(JdbcBackend.scala:397)
        at scala.slick.jdbc.JdbcBackend$BaseSession.withTransaction(JdbcBackend.scala:420)
        at scala.slick.backend.DatabaseComponent$DatabaseDef$$anonfun$withTransaction$1.apply(DatabaseComponent.scala:54)
        at scala.slick.backend.DatabaseComponent$DatabaseDef$$anonfun$withTransaction$1.apply(DatabaseComponent.scala:54)
        at scala.slick.backend.DatabaseComponent$DatabaseDef$class.withSession(DatabaseComponent.scala:34)
        at scala.slick.jdbc.JdbcBackend$DatabaseFactoryDef$$anon$4.withSession(JdbcBackend.scala:61)
        at scala.slick.backend.DatabaseComponent$DatabaseDef$class.withTransaction(DatabaseComponent.scala:54)
        at scala.slick.jdbc.JdbcBackend$DatabaseFactoryDef$$anon$4.withTransaction(JdbcBackend.scala:61)
        at com.quantifind.kafka.offsetapp.OffsetDB.insertAll(OffsetDB.scala:66)
        at com.quantifind.kafka.offsetapp.sqlite.SQLiteOffsetInfoReporter.report(SqliteOffsetInfoReporter.scala:9)
        at com.quantifind.kafka.offsetapp.OffsetGetterWeb$$anonfun$reportOffsets$1$$anonfun$apply$6$$anonfun$apply$1.apply(OffsetGetterWeb.scala:73)
        at com.quantifind.kafka.offsetapp.OffsetGetterWeb$$anonfun$reportOffsets$1$$anonfun$apply$6$$anonfun$apply$1.apply(OffsetGetterWeb.scala:73)
        at scala.util.Try$.apply(Try.scala:192)
        at com.quantifind.kafka.offsetapp.OffsetGetterWeb$.retryTask(OffsetGetterWeb.scala:58)
        at com.quantifind.kafka.offsetapp.OffsetGetterWeb$$anonfun$reportOffsets$1$$anonfun$apply$6.apply(OffsetGetterWeb.scala:73)
        at com.quantifind.kafka.offsetapp.OffsetGetterWeb$$anonfun$reportOffsets$1$$anonfun$apply$6.apply(OffsetGetterWeb.scala:73)
        at scala.collection.mutable.HashSet.foreach(HashSet.scala:78)
        at com.quantifind.kafka.offsetapp.OffsetGetterWeb$$anonfun$reportOffsets$1.apply(OffsetGetterWeb.scala:73)
        at com.quantifind.kafka.offsetapp.OffsetGetterWeb$$anonfun$reportOffsets$1.apply(OffsetGetterWeb.scala:70)
        at scala.collection.Iterator$class.foreach(Iterator.scala:893)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
        at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
        at com.quantifind.kafka.offsetapp.OffsetGetterWeb$.reportOffsets(OffsetGetterWeb.scala:69)
        at com.quantifind.kafka.offsetapp.OffsetGetterWeb$$anonfun$schedule$1.apply$mcV$sp(OffsetGetterWeb.scala:79)
        at com.quantifind.kafka.offsetapp.OffsetGetterWeb$$anon$2.run(OffsetGetterWeb.scala:48)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

I am using a 2-day retention and ZooKeeper as storage. My version is 0.4.1.

rcasey212 commented 7 years ago

Hi @Rohlik, I'm guessing it is one of the following, both permissions-related:

Please check these and let me know if that solves your problem.

Rohlik commented 7 years ago

We run this app as root in our environment:

#ps aux
root     18241  0.4  7.3 2401064 139000 ?      Sl   bře14   7:39 java -cp KafkaOffsetMonitor-assembly-0.4.1-SNAPSHOT.jar com.quantifind.kafka.offsetapp.OffsetGetterWeb --zk x.x.x.x,x.x.x.x,x.x.x.x --port 9001 --refresh 10.seconds --retain 2.days

Permissions for the directory and files:

drwxr-xr-x  2 root      root           4096 14. bře 19.16 kafka-offset-monitor
-rw-r--r-- 1 root root  4391936 14. bře 19.16 offsetapp.db
-rw-r--r-- 1 root root        0 14. bře 19.16 offsetapp.db-journal

Restarting the app solves my problem for a while, then the problem happens again. PS: After a restart, it works for 10 to 11.5 hours and then the error comes up.

rcasey212 commented 7 years ago

Ahh, then you are likely legitimately running out of memory.

Rohlik commented 7 years ago

As you can see, I have some memory available, and swap is almost empty.

#free -m
              total        used        free      shared  buff/cache   available
Mem:            992         640          65         103         287          81
Swap:          1022          16        1006

How much memory do I need to run the offset monitoring tool without problems?
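
To put numbers on the `free -m` output above, here is a minimal Python sketch that parses the pasted table and reports how much memory the kernel estimates is still available. The function name `parse_free` is made up for illustration; the figures are copied from the output in this comment, and `free -m` reports values in mebibytes.

```python
# Output of `free -m` as pasted in this thread (values in MiB).
FREE_OUTPUT = """\
              total        used        free      shared  buff/cache   available
Mem:            992         640          65         103         287          81
Swap:          1022          16        1006
"""


def parse_free(text):
    """Parse `free -m` output into {row label: {column name: value}}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    headers = lines[0].split()
    result = {}
    for line in lines[1:]:
        label, *values = line.split()
        # The Swap row has fewer columns; zip stops at the shorter sequence.
        result[label.rstrip(":")] = dict(zip(headers, map(int, values)))
    return result


stats = parse_free(FREE_OUTPUT)
mem = stats["Mem"]
share = 100 * mem["available"] / mem["total"]
print(f"available: {mem['available']} MiB of {mem['total']} MiB ({share:.1f}%)")
print(f"swap used: {stats['Swap']['used']} MiB of {stats['Swap']['total']} MiB")
```

The `available` column is the more useful figure here than `free`, because it estimates how much memory new allocations can get without swapping. On this host that is 81 MiB out of 992 MiB.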

rcasey212 commented 7 years ago

Hi @Rohlik, I recommend you run this as a user other than root, as you may also run into issues with the DB trying to run its native code from the /tmp directory as root, which most Linux distributions do not allow by default.
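
One way to check the /tmp concern is to look at how that directory is mounted. The sketch below is a rough Python check, assuming a Linux host with `/proc/mounts` and assuming the SQLite JDBC driver unpacks its native library into the JVM temp directory (typically `/tmp`); neither assumption is confirmed in this thread. The helper name `mount_options` is hypothetical.

```python
import os


def mount_options(path, mounts_text):
    """Return the mount options of the deepest mount point containing `path`.

    `mounts_text` is the content of /proc/mounts, where each line reads:
    device mount_point fs_type options dump pass
    """
    path = os.path.normpath(path)
    best, best_opts = "", []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        mount_point, opts = fields[1], fields[3].split(",")
        prefix = mount_point.rstrip("/") + "/"
        # Prefer the longest matching mount point; on ties the later entry wins.
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best):
            best, best_opts = mount_point, opts
    return best_opts


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    tmp_dir = "/tmp"  # assumed JVM temp directory (java.io.tmpdir)
    with open("/proc/mounts") as fh:
        opts = mount_options(tmp_dir, fh.read())
    print(f"{tmp_dir}: writable={os.access(tmp_dir, os.W_OK)}, "
          f"noexec={'noexec' in opts}")
```

If `noexec` shows up for the temp directory, native code cannot be executed from it, which would be consistent with the concern raised above. A writable, executable temp directory would point the investigation elsewhere.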