gbif / checklistbank

GBIF Checklist Bank
Apache License 2.0
29 stars 14 forks source link

Checklistbank-ws lealks DB connections? #45

Closed timrobertson100 closed 6 years ago

timrobertson100 commented 6 years ago

I found a lot of these in ELK

java.lang.Exception: Apparent connection leak detected
at org.apache.ibatis.transaction.jdbc.JdbcTransaction.openConnection(JdbcTransaction.java:138)
at org.apache.ibatis.transaction.jdbc.JdbcTransaction.getConnection(JdbcTransaction.java:60)
at org.apache.ibatis.executor.BaseExecutor.getConnection(BaseExecutor.java:336)
at org.apache.ibatis.executor.SimpleExecutor.prepareStatement(SimpleExecutor.java:84)
at org.apache.ibatis.executor.SimpleExecutor.doQuery(SimpleExecutor.java:62)
at org.apache.ibatis.executor.BaseExecutor.queryFromDatabase(BaseExecutor.java:324)
at org.apache.ibatis.executor.BaseExecutor.query(BaseExecutor.java:156)
at org.apache.ibatis.executor.CachingExecutor.query(CachingExecutor.java:109)
at org.apache.ibatis.executor.CachingExecutor.query(CachingExecutor.java:83)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:148)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:141)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectOne(DefaultSqlSession.java:77)
at sun.reflect.GeneratedMethodAccessor128.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.ibatis.session.SqlSessionManager$SqlSessionInterceptor.invoke(SqlSessionManager.java:357)
at com.sun.proxy.$Proxy38.selectOne(Unknown Source)
at org.apache.ibatis.session.SqlSessionManager.selectOne(SqlSessionManager.java:166)
at org.apache.ibatis.binding.MapperMethod.execute(MapperMethod.java:82)
at org.apache.ibatis.binding.MapperProxy.invoke(MapperProxy.java:59)
at com.sun.proxy.$Proxy85.count(Unknown Source)
at org.gbif.checklistbank.ws.resources.SpeciesSitemapResource.sitemapIndex(SpeciesSitemapResource.java:75)
at sun.reflect.GeneratedMethodAccessor573.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:185)
at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302)
at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542)
at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473)
at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419)
at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409)
at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409)
at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558)
at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287)
at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277)
at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182)
at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85)
at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119)
at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133)
at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130)
at com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203)
at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:518)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
at java.lang.Thread.run(Thread.java:748)

This seemingly resulted in a lot of PG connections acquired by CLB and brought down the registry services on the same DB which all reported timeouts getting connections.

I think we need to find the cause of this leak, and also consider dropping pool sizes to not starve other apps.

mdoering commented 6 years ago

A poolsize of 12 is not really excessive, so decreasing these will not help: https://github.com/gbif/gbif-configuration/blob/master/checklistbank-ws/prod/application.properties#L6

mdoering commented 6 years ago

I think the leak detection is simply triggered because the sitemap query takes a long time and we have configured the ws (hikari) to assume its a leak after 60s https://github.com/gbif/gbif-configuration/blob/master/checklistbank-ws/prod/application.properties#L10

From Hikari:

This property controls the amount of time that a connection can be out of the pool before a message is logged indicating a possible connection leak. A value of 0 means leak detection is disabled. Lowest acceptable value for enabling leak detection is 2000 (2 seconds). Default: 0

We could increase the limit or deactivate it alltogether (which is the default we use probably everywhere else)

mdoering commented 6 years ago

There are as many INFOs reporting that the connection in fact did not leak as there are WARNINGs about a potential leak, so the theory above sounds likely:

Previously reported leaked connection org.postgresql.jdbc.PgConnection@67837d28 was returned to the pool (unleaked)

So it seems we just have some slow postgres queries. Maybe some indices are bad. It is caused by various methods, not just the sitemap index.

mdoering commented 6 years ago

postgres logs are full with slow queries selecting by the canonical name (see 60% cause above). For some reason postgres uses an uneffective query plan when the canonical name exists many times in the dataset. E.g. when querying for "Abies" it takes 12ms to yield no records:

prod_checklistbank=> EXPLAIN analyze SELECT * FROM name_usage u join name n on u.name_fk=n.id WHERE u.deleted IS NULL AND lower(n.canonical_name) = lower('Abies') and u.dataset_key in ('4cec8fef-f129-4966-89b7-4f8439aba058') ORDER BY u.id  LIMIT 20 OFFSET 0;
                                                                        QUERY PLAN                                                                         
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=4053.08..4053.08 rows=3 width=508) (actual time=11.098..11.098 rows=0 loops=1)
   ->  Sort  (cost=4053.08..4053.08 rows=3 width=508) (actual time=11.097..11.097 rows=0 loops=1)
         Sort Key: u.id
         Sort Method: quicksort  Memory: 25kB
         ->  Nested Loop  (cost=1.12..4053.05 rows=3 width=508) (actual time=11.085..11.085 rows=0 loops=1)
               ->  Index Scan using name_lower_idx on name n  (cost=0.56..157.10 rows=80 width=205) (actual time=10.230..10.337 rows=19 loops=1)
                     Index Cond: (lower(canonical_name) = 'abies'::text)
               ->  Index Scan using name_usage_name_fk_idx on name_usage u  (cost=0.56..48.69 rows=1 width=303) (actual time=0.038..0.038 rows=0 loops=19)
                     Index Cond: (name_fk = n.id)
                     Filter: ((deleted IS NULL) AND (dataset_key = '4cec8fef-f129-4966-89b7-4f8439aba058'::uuid))
                     Rows Removed by Filter: 2
 Total runtime: 11.293 ms
(12 rows)

But when looking for "spec." the plan does not use the lower canonical name filter and it takes over 20s:

prod_checklistbank=> EXPLAIN analyze SELECT * FROM name_usage u join name n on u.name_fk=n.id WHERE u.deleted IS NULL AND lower(n.canonical_name) = lower('spec.') and u.dataset_key in ('4cec8fef-f129-4966-89b7-4f8439aba058') ORDER BY u.id  LIMIT 20 OFFSET 0;
                                                                           QUERY PLAN                                                                           
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1.00..8204.53 rows=20 width=508) (actual time=13639.520..13639.693 rows=20 loops=1)
   ->  Nested Loop  (cost=1.00..6169880.07 rows=15042 width=508) (actual time=13639.518..13639.684 rows=20 loops=1)
         ->  Index Scan using name_usage_pkey on name_usage u  (cost=0.56..4597352.52 rows=587565 width=303) (actual time=13639.432..13639.456 rows=20 loops=1)
               Filter: ((deleted IS NULL) AND (dataset_key = '4cec8fef-f129-4966-89b7-4f8439aba058'::uuid))
               Rows Removed by Filter: 19104809
         ->  Index Scan using name_pkey on name n  (cost=0.43..2.67 rows=1 width=205) (actual time=0.009..0.009 rows=1 loops=20)
               Index Cond: (id = u.name_fk)
               Filter: (lower(canonical_name) = 'spec.'::text)
 Total runtime: 13639.887 ms
mdoering commented 6 years ago

Pff, we have 693.344 name records with a canonical name = "spec.". That explains the problem, but makes it hard to find a simple solution

mdoering commented 6 years ago

Many of these names are badly parsed and the canonical name is very wrong:

prod_checklistbank=> SELECT * FROM name n WHERE lower(n.canonical_name) = lower('spec.') limit 50;
   id    |                  scientific_name                  | canonical_name |   type   | genus_or_above | infra_generic | specific_epithet | infra_specific_epithet | cultivar_epithet | notho_type | authors_parsed | autho
---------+---------------------------------------------------+----------------+----------+----------------+---------------+------------------+------------------------+------------------+------------+----------------+------
 4912135 | Gyrocarpus sp. Chase 317                          | spec.          | INFORMAL | Gyrocarpus     | Chase         |                  |                        |                  |            | f              |      
 4912136 | Gyrocarpus sp. DES-2011                           | spec.          | INFORMAL | Gyrocarpus     | Des-          |                  |                        |                  |            | t              |      
 4912338 | Gyrodactylus sp. AGV-2009a                        | spec.          | INFORMAL | Gyrodactylus   | Agv-          |                  |                        |                  |            | t              |      
 4912339 | Gyrodactylus sp. AGV-2009b                        | spec.          | INFORMAL | Gyrodactylus   | Agv-          |                  |                        |                  |            | t              |      
 4912343 | Gyrodactylus sp. Chile                            | spec.          | INFORMAL | Gyrodactylus   | Chile         |                  |                        |                  |            | t              |      
 4912363 | Gyrodactylus sp. HSS-2009                         | spec.          | INFORMAL | Gyrodactylus   | Hss-          |                  |                        |                  |            | t              |      
 4912385 | Gyrodactylus sp. Ladoga                           | spec.          | INFORMAL | Gyrodactylus   | Ladoga        |                  |                        |                  |            | t              |      
 4912386 | Gyrodactylus sp. Ladoga x Gyrodactylus pannonicus | spec.          | HYBRID   |                |               |                  |                        |                  |            | f              |      
 4912388 | Gyrodactylus sp. MBS-2014                         | spec.          | INFORMAL | Gyrodactylus   | Mbs-          |                  |                        |                  |            | t              |      
 4912390 | Gyrodactylus sp. MPV-2015                         | spec.          | INFORMAL | Gyrodactylus   | Mpv-          |                  |                        |                  |            | t              |      
 4912394 | Gyrodactylus sp. NKA-2015                         | spec.          | INFORMAL | Gyrodactylus   | Nka-          |                  |                        |                  |            | t              |      
 4912396 | Gyrodactylus sp. North Sea                        | spec.          | INFORMAL | Gyrodactylus   | North         |                  |                        |                  |            | t              | Sea  
 4912398 | Gyrodactylus sp. Norway-HH-2003                   | spec.          | INFORMAL | Gyrodactylus   | Norway-       |                  |                        |                  |            | f              |      
 4912407 | Gyrodactylus sp. Poland-MZ-2003                   | spec.          | INFORMAL | Gyrodactylus   | Poland-       |                  |                        |                  |            | f              |      
 4912413 | Gyrodactylus sp. Zimbabwe                         | spec.          | INFORMAL | Gyrodactylus   | Zimbabwe      |                  |                        |                  |            | t              |      
 4912428 | Gyrodactylus spec. Nordmann, 1832                 | spec.          | INFORMAL | Gyrodactylus   | Nordmann      |                  |                        |                  |            | t              |      
 4912565 | Gyrocotyle sp. Tasmania                           | spec.          | INFORMAL | Gyrocotyle     | Tasmania      |                  |                        |                  |            | t              |      
 4913863 | Gyrodinium sp. GeoB 231                           | spec.          | INFORMAL | Gyrodinium     | Geo           |                  |                        |                  |            | f              |      
 4913996 | Gyromitra sp. Gyr3                                | spec.          | INFORMAL | Gyromitra      | Gyr           |                  |                        |                  |            | f              |      
 4914171 | Gyrodactylus pannonicus X Gyrodactylus sp. Ladoga | spec.          | HYBRID   |                |               |                  |                        |                  |            | f              |      
 4914296 | Gyrodactylus pomeraniae x Gyrodactylus lavareti   | spec.          | HYBRID   |                |               |                  |                        |                  |            | f              |      
 4914360 | Gyroneuron sp. BOLD:AAI1989                       | spec.          | INFORMAL | Gyroneuron     | Bold          |                  |                        |                  |            | t              | Aai  
 4914380 | Gyroneuronella sp. AZR-2008                       | spec.          | INFORMAL | Gyroneuronella | Azr-          |                  |                        |                  |            | t              |      
 4914558 | Gyrodontium sp. BAB-5180                          | spec.          | INFORMAL | Gyrodontium    | Bab-          |                  |                        |                  |            | f              |      
 4915887 | Gyrostemon sp. Cranfield 02068672                 | spec.          | INFORMAL | Gyrostemon     | Cranfield     |                  |                        |                  |            | f              |      
 4916067 | Gyroporus sp. AWW-2009a                           | spec.          | INFORMAL | Gyroporus      | Aww-          |                  |                        |                  |            | t              |      
 4916070 | Gyroporus sp. Arora 00-429                        | spec.          | INFORMAL | Gyroporus      | Arora         |                  |                        |                  |            | f              |      
 4916072 | Gyroporus sp. Arora00-429                         | spec.          | INFORMAL | Gyroporus      | Arora         |                  |                        |                  |            | f              |      
 4916945 | Gyrovirus 4                                       | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |      
 4916951 | Gyrovirus GyV3                                    | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |      
 4916954 | Gyrovirus GyV7-SF                                 | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |      
 4916956 | Gyrovirus GyV8                                    | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |      
 4916959 | Gyrovirus GyV9                                    | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |      
 4916962 | Gyrovirus Tu243                                   | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |      
 4916965 | Gyrovirus Tu789                                   | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |      
 4916977 | Gyrovirus: Chicken anemia virus ICTV              | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |      
 4916980 | Gyrovirus: chicken anemia virus Ictv              | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |      
 4917282 | Gyrophyllum sp. NTM-C014392                       | spec.          | INFORMAL | Gyrophyllum    | Ntm-          |                  |                        |                  |            | f              |      
 4917368 | Gyrtona sp. BOLD:AAI6410                          | spec.          | INFORMAL | Gyrtona        | Bold          |                  |                        |                  |            | f              |      
 4917371 | Gyrtona sp. Gyrt                                  | spec.          | INFORMAL | Gyrtona        | Gyrt          |                  |                        |                  |            | t              |      
 4917446 | Gyrtothripa sp. BOLD:AAH4619                      | spec.          | INFORMAL | Gyrtothripa    | Bold          |                  |                        |                  |            | f              |      
 4917650 | H-1 parvovirus                                    | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |      
 4917660 | H-Pelican lacZ transformation vector              | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |      
 4917662 | H-Stinger GFP transformation vector               | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |      
 4918192 | HCBI8.215 virus                                   | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |      
mdoering commented 6 years ago

I have increased caching of sitemaps from 30 minutes to 1 week to avoid postgres load (The backbone rarely changes): https://github.com/gbif/c-deploy/commit/c2dc6f3081e9d509bd1c16fa045449bf9b8b3c87

mdoering commented 6 years ago

Nothing else to be done for now, closing. Improved name parsing is dealt with in the new above issue