wso2 / product-is

Welcome to the WSO2 Identity Server source code! For info on working with the WSO2 Identity Server repository and contributing code, click the link below.
http://wso2.github.io/
Apache License 2.0
742 stars 720 forks source link

WSO2 IS Kubernetes - tenant creation error and PoolExhaustedException #10540

Open angeloimm opened 3 years ago

angeloimm commented 3 years ago

Hi We are using WSO2 IS 5.10 deployed on private kubernetes cluster.

We installed it and it was working. Then we tried to add a tenant and we got an error. In the log file we saw the following: Caused by: org.wso2.carbon.registry.core.exceptions.RegistryException: Unable to put resource Failed to add properties to the resource /_system/governance/repository/security/key-stores/tenant-demo-it.jks. ERRORE: il valore è troppo lungo per il tipo character varying(1000) at org.wso2.carbon.registry.core.jdbc.handlers.builtin.MountHandler.put(MountHandler.java:299) at org.wso2.carbon.registry.core.jdbc.handlers.HandlerManager.put(HandlerManager.java:2505) at org.wso2.carbon.registry.core.jdbc.handlers.UserDefinedHandlerManager.put(UserDefinedHandlerManager.java:204) at org.wso2.carbon.registry.core.jdbc.handlers.HandlerLifecycleManager.put(HandlerLifecycleManager.java:950) at org.wso2.carbon.registry.core.jdbc.EmbeddedRegistry.put(EmbeddedRegistry.java:705) at org.wso2.carbon.registry.core.caching.CacheBackedRegistry.put(CacheBackedRegistry.java:591) at org.wso2.carbon.registry.core.session.UserRegistry.putInternal(UserRegistry.java:840) at org.wso2.carbon.registry.core.session.UserRegistry.access$1000(UserRegistry.java:73) at org.wso2.carbon.registry.core.session.UserRegistry$11.run(UserRegistry.java:816) at org.wso2.carbon.registry.core.session.UserRegistry$11.run(UserRegistry.java:813) at java.base/java.security.AccessController.doPrivileged(Native Method) at org.wso2.carbon.registry.core.session.UserRegistry.put(UserRegistry.java:813) at org.wso2.carbon.security.keystore.KeyStoreAdmin.addKeyStore(KeyStoreAdmin.java:260) ... 86 more Caused by: org.wso2.carbon.registry.core.exceptions.RegistryException: Failed to add properties to the resource /_system/governance/repository/security/key-stores/tenant-demo-it.jks. ERRORE: il valore è troppo lungo per il tipo character varying(1000) at org.wso2.carbon.registry.core.jdbc.dao.JDBCResourceDAO.addProperties(JDBCResourceDAO.java:1653) at org.wso2.carbon.registry.core.jdbc.dao.JDBCResourceDAO.add(JDBCResourceDAO.java:457)

We solved it by modifying the 2 columns in 2 tables. Basically we modifed the column description of table reg_resource and the column reg_value of table reg_property by increasing their dimension from 1000 to 4000.

By doing this modification we successfully created the tenant. But then we noticed this error:

Caused by: org.wso2.carbon.identity.base.IdentityRuntimeException: org.apache.tomcat.jdbc.pool.PoolExhaustedException: [https-jsse-nio-9443-exec-15] Timeout: Pool empty. Unable to fetch a connection in 60 seconds, none available[size:5; busy:5; idle:0; lastwait:60000].

I attach the full container log

This is a fragment of my deployment.toml configuration: `[user_store] type = "database_unique_id" properties.CaseInsensitiveUsername = false

[database.user] url = "jdbc:postgresql://host:port/WSO2IS_SHARED_DB" username = "username" password = "password" driver = "org.postgresql.Driver"

[database.user.pool_options] maxActive = "5" maxWait = "60000" minIdle = "1" maxIdle = "2" testOnBorrow = true validationQuery = "SELECT 1; COMMIT" validationInterval = "30000" defaultAutoCommit = false

[database.identity_db] url = "jdbc:postgresql://host:port/WSO2IS_IDENTITY_DB" username = "username" password = "password" driver = "org.postgresql.Driver"

[database.identity_db.pool_options] maxActive = "5" maxWait = "60000" minIdle = "1" maxIdle = "2" testOnBorrow = true validationQuery = "SELECT 1; COMMIT" validationInterval = "30000" defaultAutoCommit = false

[database.shared_db] url = "jdbc:postgresql://host:port/WSO2IS_SHARED_DB" username = "username" password = "password" driver = "org.postgresql.Driver"

[database.shared_db.pool_options] maxActive = "5" maxWait = "60000" minIdle = "1" maxIdle = "2" testOnBorrow = true validationQuery = "SELECT 1; COMMIT" validationInterval = "30000" defaultAutoCommit = false

[database.bps_database] url = "jdbc:postgresql://host:port/WSO2IS_BPS_DB" username = "username" password = "password" driver = "org.postgresql.Driver"

[database.bps_database.pool_options] maxActive = "5" maxWait = "60000" minIdle = "1" maxIdle = "2" testOnBorrow = true validationQuery = "SELECT 1; COMMIT" validationInterval = "30000" defaultAutoCommit = true

[[datasource]] id = "WSO2ConsentDS" url = "jdbc:postgresql://host:port/WSO2IS_CONSENT_DB" username = "username" password = "password" driver = "org.postgresql.Driver" pool_options.validationQuery = "SELECT 1; COMMIT" pool_options.maxActive = 50 pool_options.maxWait = 60000 # wait in milliseconds pool_options.testOnBorrow = true pool_options.jmxEnabled = false [authentication.consent] data_source = "jdbc/WSO2ConsentDS"`

Now we can't use the IS anymore. How can we solve this issue? ws02_is.log

ruwanta commented 3 years ago

There can be few reasons for this case. One or more of them might affect on your situation.

  1. Identity database max-active is 5. So you are not able to handle more than 5 requests at the same time. Suggest to increase max pool size fits to your load.
  2. Your DB may be slow in responding with queries for the load. You can check DB time take by enabling the Observability logs in IS.
  3. Increasing the column size would have caused the Index to be gone wild. Please check the indexes on the changed tables.
angeloimm commented 3 years ago

Actually we are 2 person in using the environment, this is the reason we put the max-active to 5; but I'll try to upgrade the max-active

I'll check the other 2 suggestion... but I saw also this bug https://github.com/wso2/product-is/issues/6823 and I don't know if it can be related to this bug

ruwanta commented 3 years ago

Issue #6823 relate to large concurrency, more than 500 simultaneous requests. In here you may be sending more than 5 simultaneous requests, which is larger than configured value.

ruwanta commented 3 years ago

BTW, better to join the slack channel bit.ly/wso2isslack

angeloimm commented 3 years ago

We investigated the issue. By adding a new tenant or activating/de-activating a tenant, the connection pool is totally consumed. On the postgreSQL server (in the pg_stat_activity table) we saw: SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL READ COMMITTED once arrived to the max pool size (in our case 5) the pool is exhausted. Now I can also increment the pool connection to 50 (it's always a demo environment) but I don't know if this is a bug related to the fact that connection is never released.

ruwanta commented 3 years ago

Postgre and IS has some odd behaviour. Can you try changing defaultAutoCommit value (true | false) and see of the connection get released properly.

angeloimm commented 3 years ago

I changed the defaultAutoCommit in true in my identity pool so now it's

[database.identity_db.pool_options]
maxActive = "5"
maxWait = "60000"
minIdle = "1"
maxIdle = "2"
testOnBorrow = true
validationQuery = "SELECT 1;"
validationInterval = "30000"

After 3 clicks on activate/deactivate tentant, connection pool is exhausted once again. And note i'm the only one visiting application

senthalan commented 3 years ago

Hi @angeloimm ,

Please add the following configuration to identity_db.poop_options so that we can narrow down which connection is not released by the server,

[database.identity_db.pool_options]
logAbandoned=true
removeAbandoned=true
removeAbandonedTimeout="60"

By adding these properties, the stack trace of open connections more than 60 seconds will be logged to the wso2carbon.log file and those connections will be closed. This is the expected log,

TID: [-1] [] [2019-10-24 18:28:37,478] WARN {org.apache.tomcat.jdbc.pool.ConnectionPool} - Connection has been abandoned PooledConnection[com.mysql.cj.jdbc.ConnectionImpl@18a3df3d]:java.lang.Exception

Reproduce the issue and collect the logs. Then remove the applied configuration after retrieving the logs, because enabling these properties will add overhead during the connection creation time and this will affect the performance of the system.

senthalan commented 3 years ago

Hi @angeloimm ,

Before trying out the above, can you change the defaultAutoCommit to false and tryout first, because the default value for defaultAutoCommit is true.

[database.identity_db.pool_options]
defaultAutoCommit = false
angeloimm commented 3 years ago

Hi @senthalan I tried both with

[database.identity_db.pool_options]
defaultAutoCommit = false

and

[database.identity_db.pool_options]
defaultAutoCommit = true

I had the same result in both tests

angeloimm commented 3 years ago

hi @senthalan I added the suggested configuration.

Now in my deployment.toml I have:

            [database.identity_db.pool_options]
            maxActive = "5"
            maxWait = "60000"
            minIdle = "1"
            maxIdle = "2"
            testOnBorrow = true
            validationQuery = "SELECT 1; COMMIT"
            validationInterval = "30000"
            defaultAutoCommit = false
            logAbandoned=true
            removeAbandoned=true
            removeAbandonedTimeout="60"

Effectively connection pool is freed after 60 seconds and the system is usable.

This is a log fragment

[2020-11-18 18:32:03,495] [4d5aacdf-61ee-40ec-a4d7-08463398aed1]  INFO {org.wso2.carbon.tenant.mgt.services.TenantMgtAdminService} - Activated the tenant 'tenant3.demo.it [5]' by 'admin'
[2020-11-18 18:32:06,107] [ef5bc8c0-e01c-4226-a73e-56841236eb8c]  INFO {org.wso2.carbon.tenant.mgt.services.TenantMgtAdminService} - Activated the tenant 'tenant4.demo.it [6]' by 'admin'
[2020-11-18 18:32:08,916] [c4a7a19d-5a8f-4b48-875a-6e974b1993a8]  INFO {org.wso2.carbon.tenant.mgt.services.TenantMgtAdminService} - Deactivated the tenant 'tenant4.demo.it [6]' by 'admin'
[2020-11-18 18:32:11,521] [8dcc3d36-9e2e-4b42-a3fe-9bbfed4bf384]  INFO {org.wso2.carbon.tenant.mgt.services.TenantMgtAdminService} - Deactivated the tenant 'tenant3.demo.it [5]' by 'admin'
[2020-11-18 18:32:14,345] [45f67e48-46f3-4ef0-bc4c-2089056dbcd7]  INFO {org.wso2.carbon.tenant.mgt.services.TenantMgtAdminService} - Activated the tenant 'tenant5.demo.it [7]' by 'admin'
[2020-11-18 18:33:09,319] []  WARN {org.apache.tomcat.jdbc.pool.ConnectionPool} - Connection has been abandoned PooledConnection[org.postgresql.jdbc.PgConnection@965ac6a]:java.lang.Exception
    at org.apache.tomcat.jdbc.pool.ConnectionPool.getThreadDump(ConnectionPool.java:1107)
    at org.apache.tomcat.jdbc.pool.ConnectionPool.borrowConnection(ConnectionPool.java:811)
    at org.apache.tomcat.jdbc.pool.ConnectionPool.borrowConnection(ConnectionPool.java:655)
    at org.apache.tomcat.jdbc.pool.ConnectionPool.getConnection(ConnectionPool.java:198)
    at org.apache.tomcat.jdbc.pool.DataSourceProxy.getConnection(DataSourceProxy.java:132)
    at org.wso2.carbon.identity.core.persistence.JDBCPersistenceManager.getDBConnection(JDBCPersistenceManager.java:147)
    at org.wso2.carbon.identity.core.util.IdentityDatabaseUtil.getDBConnection(IdentityDatabaseUtil.java:54)
    at org.wso2.carbon.identity.core.util.IdentityDatabaseUtil.getDBConnection(IdentityDatabaseUtil.java:43)
    at org.wso2.carbon.identity.oauth2.dao.AccessTokenDAOImpl.revokeAccessTokensInBatch(AccessTokenDAOImpl.java:1073)

After that tomcat connection pool frees the connection, we can use the system

It seems to me some kind of bug in connection closing.

I attach the full log downloaded from my kubernetes cluster

Angelo wso2carbon.log

angeloimm commented 3 years ago

@senthalan we made a step further We build the kubernetes POD by using this WSO2 image:FROM wso2/wso2is:5.10.0-centos7

It seems that this image is affected by this issue. We downloaded the last ZIP from the wso2 is site and this bug is not present Is there any update official WSO2 IS image present on docker hub?

angeloimm commented 3 years ago

We guess we found the bug. In the following class AccessTokenDAOImpl.java we found this method:

public void revokeAccessTokensInBatch(String[] tokens, boolean isHashedToken)
    throws IdentityOAuth2Exception
  {
    if (log.isDebugEnabled()) {
      if (IdentityUtil.isTokenLoggable("AccessToken"))
      {
        StringBuilder stringBuilder = new StringBuilder();
        for (String token : tokens) {
          stringBuilder.append(DigestUtils.sha256Hex(token)).append(" ");
        }
        log.debug("Revoking access tokens(hashed): " + stringBuilder.toString());
      }
      else
      {
        log.debug("Revoking access tokens in batch mode");
      }
    }
    String accessTokenStoreTable = "IDN_OAUTH2_ACCESS_TOKEN";
    Connection connection = IdentityDatabaseUtil.getDBConnection();
    PreparedStatement ps = null;
    if (tokens.length > 1) {
      try
      {
        Object oldTokens = new ArrayList();
        String sqlQuery = "UPDATE IDN_OAUTH2_ACCESS_TOKEN SET TOKEN_STATE=?, TOKEN_STATE_ID=? WHERE ACCESS_TOKEN_HASH=?".replace("IDN_OAUTH2_ACCESS_TOKEN", accessTokenStoreTable);

        ps = connection.prepareStatement(sqlQuery);
        for (String token : tokens)
        {
          ps.setString(1, "REVOKED");
          ps.setString(2, UUID.randomUUID().toString());
          if (isHashedToken) {
            ps.setString(3, token);
          } else {
            ps.setString(3, getHashingPersistenceProcessor().getProcessedAccessTokenIdentifier(token));
          }
          ps.addBatch();
          ((List)oldTokens).add(getHashingPersistenceProcessor().getProcessedAccessTokenIdentifier(token));
        }
        ps.executeBatch();
        IdentityDatabaseUtil.commitTransaction(connection);

        OAuth2TokenUtil.postUpdateAccessTokens(Arrays.asList(tokens), "REVOKED");
        if (this.isTokenCleanupFeatureEnabled) {
          this.oldTokenCleanupObject.cleanupTokensInBatch((List)oldTokens, connection);
        }
      }
      catch (SQLException e)
      {
        IdentityDatabaseUtil.rollbackTransaction(connection);

        throw new IdentityOAuth2Exception("Error occurred while revoking Access Tokens : " + Arrays.toString(tokens), e);
      }
      finally
      {
        IdentityDatabaseUtil.closeAllConnections(connection, null, ps);
      }
    }
    if (tokens.length == 1) {
      try
      {
        connection.setAutoCommit(true);
        String sqlQuery = "UPDATE IDN_OAUTH2_ACCESS_TOKEN SET TOKEN_STATE=?, TOKEN_STATE_ID=? WHERE ACCESS_TOKEN_HASH=?".replace("IDN_OAUTH2_ACCESS_TOKEN", accessTokenStoreTable);

        ps = connection.prepareStatement(sqlQuery);
        ps.setString(1, "REVOKED");
        ps.setString(2, UUID.randomUUID().toString());
        if (isHashedToken) {
          ps.setString(3, tokens[0]);
        } else {
          ps.setString(3, getHashingPersistenceProcessor().getProcessedAccessTokenIdentifier(tokens[0]));
        }
        ps.executeUpdate();

        OAuth2TokenUtil.postUpdateAccessTokens(Arrays.asList(tokens), "REVOKED");
        if (this.isTokenCleanupFeatureEnabled) {
          this.oldTokenCleanupObject.cleanupTokenByTokenValue(
            getHashingPersistenceProcessor().getProcessedAccessTokenIdentifier(tokens[0]), connection);
        }
      }
      catch (SQLException e)
      {
        throw new IdentityOAuth2Exception("Error occurred while revoking Access Token : " + Arrays.toString(tokens), e);
      }
      finally
      {
        IdentityDatabaseUtil.closeAllConnections(connection, null, ps);
      }
    }
  }

By reading this code, it seems to open the DB connection ( Connection connection = IdentityDatabaseUtil.getDBConnection(); ) but if tokens.length == 0 the opened connection is never closed.

I hope it can be useful.

Angelo

angeloimm commented 3 years ago

I confirm that the bug was the one we found. We modified the JAR and deployed the new WSO2 IS image in our K8S cluster with the new org.wso2.carbon.identity.oauth_6.4.2.jar All works pretty good.

ruwanta commented 3 years ago

Great work Angelo. Much appreciated finding the root cause. Can you also send a pull request to the source code. It will help for many others.

angeloimm commented 3 years ago

@ruwanta How can I send a pull request? I should compare 2 branches. If I clone the repository and I create my own branch, I can't pull it

ruwanta commented 3 years ago

Hi Angelo, I believe you pulled the source code from the github repository.

  1. Fork it
  2. Clone it to your local system
  3. Make a new branch from the branch you took
  4. Do the same change on this branch. You can paste the files
  5. Push it back to your repo
  6. Click the Compare & pull request button on your Github account
  7. Click Create pull request to open a new pull request

https://opensource.com/article/19/7/create-pull-request-github

Or else, you can attach a diff file on this issue too. However applying a change with a diff is an old technique.

angeloimm commented 3 years ago

I noticed that in master the check has been added:

        if (ArrayUtils.isEmpty(tokens)) {
            if (log.isDebugEnabled()) {
                log.debug("No tokens to revoke in batch mode. Therefore not continuing further in revocation.");
            }
            return;
        }

I guess it's useless I add it now :)

angeloimm commented 3 years ago

hi there.

We are using the docker image of WSO2 IS 5.10 available on docker hub to this link https://hub.docker.com/r/wso2/wso2is We are facing some issues in multi-tenant management as you can check on this github issue https://github.com/wso2/product-is/issues/10540

We checked the new WSO2 is version downloadable on the site. It seems to us that the issues we found in our docker image are no more presents

What we would like to know is:

Thank you Angelo

ruwanta commented 3 years ago
  1. Yes, multi-tenant solution stable the last WSO2 IS version. But container based multi-tenant solution may not a best fit, as that image is optimised for single tenant case. You need to use vanilla pack and make your own own container images which suites your multi-tenant requirement
  2. You will not need WSO2 subscription to download and run the free docker images. But you may not receive the updates on them
  3. You can create your own docker image using vanilla pack, without subscription.

Better to join Slack channel and get community help for this kind of questions. https://wso2.com/community/slack/

rksk commented 2 years ago

I noticed that in master the check has been added:

        if (ArrayUtils.isEmpty(tokens)) {
            if (log.isDebugEnabled()) {
                log.debug("No tokens to revoke in batch mode. Therefore not continuing further in revocation.");
            }
            return;
        }

I guess it's useless I add it now :)

Just for the record, this is fixed with https://github.com/wso2/product-is/issues/9082