ClickHouse / graphouse

Graphouse allows you to use ClickHouse as a Graphite storage.
Apache License 2.0
259 stars 55 forks source link

Graphouse drops incoming metrics during a warming-up #112

Open Felixoid opened 5 years ago

Felixoid commented 5 years ago

Hello.

I've recognized, that graphouse doesn't save all incoming metrics until his (?) metrics tree won't fill. It looks like MetricCacher does a lot of side logic synchronously and couldn't accept everything. Increasing memory consumption and deep of in-memory tree does decrease the time of warming-up, but the main problem is data losing.

Here's my custom properties:

graphouse.cacher.read-batch-size=1000
graphouse.cacher.min-batch-size=200000
graphouse.tree.in-memory-levels=7

And vmoptions:

-Xms7g
-Xmx7g
-Xss2m
-XX:StringTableSize=10000000
-XX:+UseG1GC
-XX:MaxGCPauseMillis=1000

Metrics statistic: 2019-03-16 13:16:56,786 INFO [MetricSearch MetricSearch thread] Actual metrics count = 2491274, dir count: 653674, cache stats: CacheStats{hitCount=34678346, missCount=5402061, loadSuccessCount=175284, loadExceptionCount=24, totalLoadTime=1749975810283, evictionCount=0}

This is how the number of saved metrics looks during the warming: * Until 12:10: normal work. * 12:10: restart graphouse with `graphouse.tree.in-memory-levels=6` * 12:29: warming-up was ended. *19 minutes of losing data* * 12:38: restart graphouse with `graphouse.tree.in-memory-levels=7` * 12:51: warming-up was ended. *13 minutes of losing data* ``` SELECT count(), toStartOfMinute(toDateTime(timestamp)) AS minute FROM graphite.data WHERE (timestamp >= toDateTime('2019-03-16 12:03:00')) AND (timestamp <= now()) GROUP BY minute ORDER BY minute ASC ┌─count()─┬──────────────minute─┐ │ 2148456 │ 2019-03-16 12:03:00 │ │ 2214717 │ 2019-03-16 12:04:00 │ │ 2241569 │ 2019-03-16 12:05:00 │ │ 2206572 │ 2019-03-16 12:06:00 │ │ 2220109 │ 2019-03-16 12:07:00 │ │ 2154098 │ 2019-03-16 12:08:00 │ │ 2195761 │ 2019-03-16 12:09:00 │ │ 228084 │ 2019-03-16 12:10:00 │ │ 280771 │ 2019-03-16 12:11:00 │ │ 344394 │ 2019-03-16 12:12:00 │ │ 441295 │ 2019-03-16 12:13:00 │ │ 390031 │ 2019-03-16 12:14:00 │ │ 510775 │ 2019-03-16 12:15:00 │ │ 612913 │ 2019-03-16 12:16:00 │ │ 703469 │ 2019-03-16 12:17:00 │ │ 1009436 │ 2019-03-16 12:18:00 │ │ 1436698 │ 2019-03-16 12:19:00 │ │ 1325010 │ 2019-03-16 12:20:00 │ │ 1976466 │ 2019-03-16 12:21:00 │ │ 1862134 │ 2019-03-16 12:22:00 │ │ 1914763 │ 2019-03-16 12:23:00 │ │ 1686410 │ 2019-03-16 12:24:00 │ │ 1643832 │ 2019-03-16 12:25:00 │ │ 1623769 │ 2019-03-16 12:26:00 │ │ 1492997 │ 2019-03-16 12:27:00 │ │ 1816918 │ 2019-03-16 12:28:00 │ │ 2060844 │ 2019-03-16 12:29:00 │ │ 2289562 │ 2019-03-16 12:30:00 │ │ 2191460 │ 2019-03-16 12:31:00 │ │ 2202839 │ 2019-03-16 12:32:00 │ │ 2151755 │ 2019-03-16 12:33:00 │ │ 2187679 │ 2019-03-16 12:34:00 │ │ 2222659 │ 2019-03-16 12:35:00 │ │ 2167668 │ 2019-03-16 12:36:00 │ │ 1052516 │ 2019-03-16 12:37:00 │ │ 298195 │ 2019-03-16 12:38:00 │ │ 396183 │ 2019-03-16 12:39:00 │ │ 442054 │ 2019-03-16 12:40:00 │ │ 532048 │ 2019-03-16 12:41:00 │ │ 811981 │ 2019-03-16 12:42:00 │ │ 822505 │ 2019-03-16 12:43:00 │ │ 847533 │ 2019-03-16 12:44:00 │ │ 889592 │ 2019-03-16 12:45:00 │ │ 953051 │ 2019-03-16 12:46:00 │ │ 868839 │ 2019-03-16 12:47:00 │ │ 1192575 │ 2019-03-16 12:48:00 │ │ 935722 │ 2019-03-16 12:49:00 │ │ 1322319 │ 2019-03-16 12:50:00 │ │ 2155971 │ 2019-03-16 12:51:00 │ │ 2188373 │ 2019-03-16 12:52:00 │ │ 2200898 │ 2019-03-16 12:53:00 │ │ 2202605 │ 2019-03-16 12:54:00 │ │ 2267009 │ 2019-03-16 12:55:00 │ │ 2199843 │ 2019-03-16 12:56:00 │ │ 2159995 │ 2019-03-16 12:57:00 │ │ 2195384 │ 2019-03-16 12:58:00 │ │ 2179685 │ 2019-03-16 12:59:00 │ │ 2429822 │ 2019-03-16 13:00:00 │ │ 2196273 │ 2019-03-16 13:01:00 │ │ 2204160 │ 2019-03-16 13:02:00 │ │ 2249884 │ 2019-03-16 13:03:00 │ │ 2210148 │ 2019-03-16 13:04:00 │ │ 2014119 │ 2019-03-16 13:05:00 │ └─────────┴─────────────────────┘ ```
During warming, I also see a lot of next stack-traces: ``` 2019-03-16 12:44:41,566 WARN [MetricServer pool-4-thread-71] Failed to read data com.google.common.util.concurrent.UncheckedExecutionException: java.lang.RuntimeException: ru.yandex.clickhouse.except.ClickHouseUnknownException: ClickHouse exception, message: Connect to aw-graphite-ch.admin.ig.local:8123 [aw-graphite-ch.admin.ig.local/2a00:1f78:fffd:402a:0:0:0:30d, aw-graphite-ch.admin.ig.local/10.42.3.13] failed: connect timed out, host: aw-graphite-ch.admin.ig.local, port: 8123; Connect to aw-graphite-ch.admin.ig.local:8123 [aw-graphite-ch.admin.ig.local/2a00:1f78:fffd:402a:0:0:0:30d, aw-graphite-ch.admin.ig.local/10.42.3.13] failed: connect timed out at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2217) ~[guava-21.0.jar:?] at com.google.common.cache.LocalCache.get(LocalCache.java:4154) ~[guava-21.0.jar:?] at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4158) ~[guava-21.0.jar:?] at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:5147) ~[guava-21.0.jar:?] at com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:5153) ~[guava-21.0.jar:?] at ru.yandex.market.graphouse.search.tree.LoadableMetricDir.getContent(LoadableMetricDir.java:22) ~[graphouse-1.1-SNAPSHOT.jar:?] at ru.yandex.market.graphouse.search.tree.LoadableMetricDir.getMetrics(LoadableMetricDir.java:37) ~[graphouse-1.1-SNAPSHOT.jar:?] at ru.yandex.market.graphouse.search.tree.MetricDir.getOrCreateMetric(MetricDir.java:74) ~[graphouse-1.1-SNAPSHOT.jar:?] at ru.yandex.market.graphouse.search.tree.MetricTree.modify(MetricTree.java:237) ~[graphouse-1.1-SNAPSHOT.jar:?] at ru.yandex.market.graphouse.search.tree.MetricTree.add(MetricTree.java:184) ~[graphouse-1.1-SNAPSHOT.jar:?] at ru.yandex.market.graphouse.search.MetricSearch.add(MetricSearch.java:458) ~[graphouse-1.1-SNAPSHOT.jar:?] at ru.yandex.market.graphouse.server.MetricFactory.createMetric(MetricFactory.java:66) ~[graphouse-1.1-SNAPSHOT.jar:?] at ru.yandex.market.graphouse.server.MetricServer$MetricServerWorker.read(MetricServer.java:117) ~[graphouse-1.1-SNAPSHOT.jar:?] at ru.yandex.market.graphouse.server.MetricServer$MetricServerWorker.run(MetricServer.java:99) [graphouse-1.1-SNAPSHOT.jar:?] at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?] at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?] at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?] at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?] at java.base/java.lang.Thread.run(Thread.java:834) [?:?] Caused by: java.lang.RuntimeException: ru.yandex.clickhouse.except.ClickHouseUnknownException: ClickHouse exception, message: Connect to aw-graphite-ch.admin.ig.local:8123 [aw-graphite-ch.admin.ig.local/2a00:1f78:fffd:402a:0:0:0:30d, aw-graphite-ch.admin.ig.local/10.42.3.13] failed: connect timed out, host: aw-graphite-ch.admin.ig.local, port: 8123; Connect to aw-graphite-ch.admin.ig.local:8123 [aw-graphite-ch.admin.ig.local/2a00:1f78:fffd:402a:0:0:0:30d, aw-graphite-ch.admin.ig.local/10.42.3.13] failed: connect timed out at ru.yandex.clickhouse.ClickHouseConnectionImpl.initTimeZone(ClickHouseConnectionImpl.java:94) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseConnectionImpl.(ClickHouseConnectionImpl.java:78) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:55) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseDataSource.getConnection(ClickHouseDataSource.java:44) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseDataSource.getConnection(ClickHouseDataSource.java:15) ~[clickhouse-jdbc-0.1.50.jar:?] at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:111) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:77) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:630) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:695) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:727) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:752) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:767) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at ru.yandex.market.graphouse.search.MetricSearch.loadDirsContent(MetricSearch.java:220) ~[graphouse-1.1-SNAPSHOT.jar:?] at ru.yandex.market.graphouse.search.tree.DirContentBatcher.loadDirContent(DirContentBatcher.java:56) ~[graphouse-1.1-SNAPSHOT.jar:?] at ru.yandex.market.graphouse.search.MetricSearch$1.load(MetricSearch.java:159) ~[graphouse-1.1-SNAPSHOT.jar:?] at ru.yandex.market.graphouse.search.MetricSearch$1.load(MetricSearch.java:156) ~[graphouse-1.1-SNAPSHOT.jar:?] at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3716) ~[guava-21.0.jar:?] at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2424) ~[guava-21.0.jar:?] at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2298) ~[guava-21.0.jar:?] at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2211) ~[guava-21.0.jar:?] ... 18 more Caused by: ru.yandex.clickhouse.except.ClickHouseUnknownException: ClickHouse exception, message: Connect to aw-graphite-ch.admin.ig.local:8123 [aw-graphite-ch.admin.ig.local/2a00:1f78:fffd:402a:0:0:0:30d, aw-graphite-ch.admin.ig.local/10.42.3.13] failed: connect timed out, host: aw-graphite-ch.admin.ig.local, port: 8123; Connect to aw-graphite-ch.admin.ig.local:8123 [aw-graphite-ch.admin.ig.local/2a00:1f78:fffd:402a:0:0:0:30d, aw-graphite-ch.admin.ig.local/10.42.3.13] failed: connect timed out at ru.yandex.clickhouse.except.ClickHouseExceptionSpecifier.specify(ClickHouseExceptionSpecifier.java:62) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.except.ClickHouseExceptionSpecifier.specify(ClickHouseExceptionSpecifier.java:24) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseStatementImpl.getInputStream(ClickHouseStatementImpl.java:633) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:127) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:110) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:105) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:100) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseConnectionImpl.initTimeZone(ClickHouseConnectionImpl.java:89) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseConnectionImpl.(ClickHouseConnectionImpl.java:78) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:55) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseDataSource.getConnection(ClickHouseDataSource.java:44) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseDataSource.getConnection(ClickHouseDataSource.java:15) ~[clickhouse-jdbc-0.1.50.jar:?] at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:111) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:77) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:630) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:695) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:727) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:752) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:767) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at ru.yandex.market.graphouse.search.MetricSearch.loadDirsContent(MetricSearch.java:220) ~[graphouse-1.1-SNAPSHOT.jar:?] at ru.yandex.market.graphouse.search.tree.DirContentBatcher.loadDirContent(DirContentBatcher.java:56) ~[graphouse-1.1-SNAPSHOT.jar:?] at ru.yandex.market.graphouse.search.MetricSearch$1.load(MetricSearch.java:159) ~[graphouse-1.1-SNAPSHOT.jar:?] at ru.yandex.market.graphouse.search.MetricSearch$1.load(MetricSearch.java:156) ~[graphouse-1.1-SNAPSHOT.jar:?] at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3716) ~[guava-21.0.jar:?] at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2424) ~[guava-21.0.jar:?] at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2298) ~[guava-21.0.jar:?] at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2211) ~[guava-21.0.jar:?] ... 18 more Caused by: org.apache.http.conn.ConnectTimeoutException: Connect to aw-graphite-ch.admin.ig.local:8123 [aw-graphite-ch.admin.ig.local/2a00:1f78:fffd:402a:0:0:0:30d, aw-graphite-ch.admin.ig.local/10.42.3.13] failed: connect timed out at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:150) ~[httpclient-4.5.2.jar:4.5.2] at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:353) ~[httpclient-4.5.2.jar:4.5.2] at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:380) ~[httpclient-4.5.2.jar:4.5.2] at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236) ~[httpclient-4.5.2.jar:4.5.2] at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184) ~[httpclient-4.5.2.jar:4.5.2] at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88) ~[httpclient-4.5.2.jar:4.5.2] at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110) ~[httpclient-4.5.2.jar:4.5.2] at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184) ~[httpclient-4.5.2.jar:4.5.2] at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82) ~[httpclient-4.5.2.jar:4.5.2] at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107) ~[httpclient-4.5.2.jar:4.5.2] at ru.yandex.clickhouse.ClickHouseStatementImpl.getInputStream(ClickHouseStatementImpl.java:614) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:127) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:110) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:105) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:100) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseConnectionImpl.initTimeZone(ClickHouseConnectionImpl.java:89) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseConnectionImpl.(ClickHouseConnectionImpl.java:78) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:55) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseDataSource.getConnection(ClickHouseDataSource.java:44) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseDataSource.getConnection(ClickHouseDataSource.java:15) ~[clickhouse-jdbc-0.1.50.jar:?] at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:111) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:77) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:630) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:695) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:727) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:752) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:767) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at ru.yandex.market.graphouse.search.MetricSearch.loadDirsContent(MetricSearch.java:220) ~[graphouse-1.1-SNAPSHOT.jar:?] at ru.yandex.market.graphouse.search.tree.DirContentBatcher.loadDirContent(DirContentBatcher.java:56) ~[graphouse-1.1-SNAPSHOT.jar:?] at ru.yandex.market.graphouse.search.MetricSearch$1.load(MetricSearch.java:159) ~[graphouse-1.1-SNAPSHOT.jar:?] at ru.yandex.market.graphouse.search.MetricSearch$1.load(MetricSearch.java:156) ~[graphouse-1.1-SNAPSHOT.jar:?] at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3716) ~[guava-21.0.jar:?] at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2424) ~[guava-21.0.jar:?] at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2298) ~[guava-21.0.jar:?] at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2211) ~[guava-21.0.jar:?] ... 18 more Caused by: java.net.SocketTimeoutException: connect timed out at java.base/java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:?] at java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:399) ~[?:?] at java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:242) ~[?:?] at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:224) ~[?:?] at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:403) ~[?:?] at java.base/java.net.Socket.connect(Socket.java:591) ~[?:?] at org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:74) ~[httpclient-4.5.2.jar:4.5.2] at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:141) ~[httpclient-4.5.2.jar:4.5.2] at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:353) ~[httpclient-4.5.2.jar:4.5.2] at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:380) ~[httpclient-4.5.2.jar:4.5.2] at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236) ~[httpclient-4.5.2.jar:4.5.2] at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184) ~[httpclient-4.5.2.jar:4.5.2] at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88) ~[httpclient-4.5.2.jar:4.5.2] at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110) ~[httpclient-4.5.2.jar:4.5.2] at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184) ~[httpclient-4.5.2.jar:4.5.2] at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82) ~[httpclient-4.5.2.jar:4.5.2] at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107) ~[httpclient-4.5.2.jar:4.5.2] at ru.yandex.clickhouse.ClickHouseStatementImpl.getInputStream(ClickHouseStatementImpl.java:614) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:127) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:110) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:105) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:100) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseConnectionImpl.initTimeZone(ClickHouseConnectionImpl.java:89) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseConnectionImpl.(ClickHouseConnectionImpl.java:78) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:55) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseDataSource.getConnection(ClickHouseDataSource.java:44) ~[clickhouse-jdbc-0.1.50.jar:?] at ru.yandex.clickhouse.ClickHouseDataSource.getConnection(ClickHouseDataSource.java:15) ~[clickhouse-jdbc-0.1.50.jar:?] at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:111) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:77) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:630) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:695) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:727) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:752) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:767) ~[spring-jdbc-4.1.6.RELEASE.jar:4.1.6.RELEASE] at ru.yandex.market.graphouse.search.MetricSearch.loadDirsContent(MetricSearch.java:220) ~[graphouse-1.1-SNAPSHOT.jar:?] at ru.yandex.market.graphouse.search.tree.DirContentBatcher.loadDirContent(DirContentBatcher.java:56) ~[graphouse-1.1-SNAPSHOT.jar:?] at ru.yandex.market.graphouse.search.MetricSearch$1.load(MetricSearch.java:159) ~[graphouse-1.1-SNAPSHOT.jar:?] at ru.yandex.market.graphouse.search.MetricSearch$1.load(MetricSearch.java:156) ~[graphouse-1.1-SNAPSHOT.jar:?] at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3716) ~[guava-21.0.jar:?] at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2424) ~[guava-21.0.jar:?] at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2298) ~[guava-21.0.jar:?] at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2211) ~[guava-21.0.jar:?] ... 18 more ``` This is side problem: ru.yandex.market.graphouse.search.tree.DirContentBatcher.loadDirContent always executes [this code](https://github.com/yandex/graphouse/blob/master/src/main/java/ru/yandex/market/graphouse/search/tree/DirContentBatcher.java#L56) and doesn't create batches. This overload ClickHouse a lot and cause DB side lags.
Felixoid commented 5 years ago

Proof of concept: https://github.com/yandex/graphouse/pull/113

Felixoid commented 5 years ago

As was investigated, graphous doesn't drop metrics, they would be received after warming is over. But the slow start is an issue when metrics are forwarded with carbon-c-relay

alkedr commented 5 years ago

This issue might have been fixed by https://github.com/ClickHouse/graphouse/pull/134.

Felixoid commented 5 years ago

Still in place after #135