nextcloud / helm

A community maintained helm chart for deploying Nextcloud on Kubernetes.
GNU Affero General Public License v3.0

Only table oc_migrations is created when installing Nextcloud on k8s using the helm chart with postgresql-ha and pgpool enabled #436

Open mak241265 opened 10 months ago

mak241265 commented 10 months ago

Describe your Issue

Hello everyone. We are using the Nextcloud helm chart to install Nextcloud on k8s. An external PostgreSQL is installed via the Bitnami postgresql-ha helm chart, and a user called nextcloud has been given full access to a database called nextcloud. We have run several checks to make sure everything is okay with the database. When we install Nextcloud, it reaches the "Initializing nextcloud" step and then reports errors about missing tables in PostgreSQL. Interestingly, only one table is created, oc_migrations, which is why we are sure there is no connectivity problem between Nextcloud and PostgreSQL. Another very weird thing is that sometimes the installation completes without any problem and many tables with the oc_ prefix are created. It is worth mentioning that we disabled the liveness and readiness probes to give the service sufficient time to do its initialization. We have deleted the whole service and installed it again many times, and many times we faced the same problem.

Logs and Errors

Starting nextcloud installation
Error while trying to initialise the database: An exception occurred while executing a query: SQLSTATE[42P01]: Undefined table: 7 ERROR:  relation "oc_migrations" does not exist
LINE 1: SELECT "version" FROM "oc_migrations" WHERE "app" = $1 ORDER...
                              ^
Trace: #0 /var/www/html/3rdparty/doctrine/dbal/src/Connection.php(1814): Doctrine\DBAL\Driver\API\PostgreSQL\ExceptionConverter->convert(Object(Doctrine\DBAL\Driver\PDO\Exception), Object(Doctrine\DBAL\Query))
#1 /var/www/html/3rdparty/doctrine/dbal/src/Connection.php(1749): Doctrine\DBAL\Connection->handleDriverException(Object(Doctrine\DBAL\Driver\PDO\Exception), Object(Doctrine\DBAL\Query))
#2 /var/www/html/3rdparty/doctrine/dbal/src/Connection.php(1055): Doctrine\DBAL\Connection->convertExceptionDuringQuery(Object(Doctrine\DBAL\Driver\PDO\Exception), 'SELECT "version...', Array, Array)
#3 /var/www/html/lib/private/DB/Connection.php(262): Doctrine\DBAL\Connection->executeQuery('SELECT "version...', Array, Array, NULL)
#4 /var/www/html/3rdparty/doctrine/dbal/src/Query/QueryBuilder.php(345): OC\DB\Connection->executeQuery('SELECT "version...', Array, Array)
#5 /var/www/html/lib/private/DB/QueryBuilder/QueryBuilder.php(280): Doctrine\DBAL\Query\QueryBuilder->execute()
#6 /var/www/html/lib/private/DB/MigrationService.php(191): OC\DB\QueryBuilder\QueryBuilder->execute()
#7 /var/www/html/lib/private/DB/MigrationService.php(256): OC\DB\MigrationService->getMigratedVersions()
#8 /var/www/html/lib/private/DB/MigrationService.php(434): OC\DB\MigrationService->getMigrationsToExecute('latest')
#9 /var/www/html/lib/private/DB/MigrationService.php(409): OC\DB\MigrationService->migrateSchemaOnly('latest')
#10 /var/www/html/lib/private/Setup/AbstractDatabase.php(158): OC\DB\MigrationService->migrate('latest', true)
#11 /var/www/html/lib/private/Setup.php(371): OC\Setup\AbstractDatabase->runMigrations()
#12 /var/www/html/core/Command/Maintenance/Install.php(104): OC\Setup->install(Array)
#13 /var/www/html/3rdparty/symfony/console/Command/Command.php(298): OC\Core\Command\Maintenance\Install->execute(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#14 /var/www/html/3rdparty/symfony/console/Application.php(1040): Symfony\Component\Console\Command\Command->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#15 /var/www/html/3rdparty/symfony/console/Application.php(301): Symfony\Component\Console\Application->doRunCommand(Object(OC\Core\Command\Maintenance\Install), Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#16 /var/www/html/3rdparty/symfony/console/Application.php(171): Symfony\Component\Console\Application->doRun(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#17 /var/www/html/lib/private/Console/Application.php(211): Symfony\Component\Console\Application->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#18 /var/www/html/console.php(100): OC\Console\Application->run()
#19 /var/www/html/occ(11): require_once('/var/www/html/c...')
#20 {main}
Error while trying to initialise the database: An exception occurred while executing a query: SQLSTATE[42P01]: Undefined table: 7 ERROR:  relation "oc_appconfig" does not exist
LINE 1: SELECT * FROM "oc_appconfig"
                      ^
Trace: #0 /var/www/html/3rdparty/doctrine/dbal/src/Connection.php(1814): Doctrine\DBAL\Driver\API\PostgreSQL\ExceptionConverter->convert(Object(Doctrine\DBAL\Driver\PDO\Exception), Object(Doctrine\DBAL\Query))
#1 /var/www/html/3rdparty/doctrine/dbal/src/Connection.php(1749): Doctrine\DBAL\Connection->handleDriverException(Object(Doctrine\DBAL\Driver\PDO\Exception), Object(Doctrine\DBAL\Query))
#2 /var/www/html/3rdparty/doctrine/dbal/src/Connection.php(1055): Doctrine\DBAL\Connection->convertExceptionDuringQuery(Object(Doctrine\DBAL\Driver\PDO\Exception), 'SELECT * FROM "...', Array, Array)
#3 /var/www/html/lib/private/DB/Connection.php(262): Doctrine\DBAL\Connection->executeQuery('SELECT * FROM "...', Array, Array, NULL)
#4 /var/www/html/3rdparty/doctrine/dbal/src/Query/QueryBuilder.php(345): OC\DB\Connection->executeQuery('SELECT * FROM "...', Array, Array)
#5 /var/www/html/lib/private/DB/QueryBuilder/QueryBuilder.php(280): Doctrine\DBAL\Query\QueryBuilder->execute()
#6 /var/www/html/lib/private/AppConfig.php(418): OC\DB\QueryBuilder\QueryBuilder->execute()
#7 /var/www/html/lib/private/AppConfig.php(226): OC\AppConfig->loadConfigValues()
#8 /var/www/html/lib/private/AllConfig.php(217): OC\AppConfig->getValue('core', 'vendor', '')
#9 /var/www/html/lib/private/DB/MigrationService.php(119): OC\AllConfig->getAppValue('core', 'vendor', '')
#10 /var/www/html/lib/private/DB/MigrationService.php(183): OC\DB\MigrationService->createMigrationTable()
#11 /var/www/html/lib/private/DB/MigrationService.php(256): OC\DB\MigrationService->getMigratedVersions()
#12 /var/www/html/lib/private/DB/MigrationService.php(434): OC\DB\MigrationService->getMigrationsToExecute('latest')
#13 /var/www/html/lib/private/DB/MigrationService.php(409): OC\DB\MigrationService->migrateSchemaOnly('latest')
#14 /var/www/html/lib/private/Setup/AbstractDatabase.php(158): OC\DB\MigrationService->migrate('latest', true)
#15 /var/www/html/lib/private/Setup.php(371): OC\Setup\AbstractDatabase->runMigrations()
#16 /var/www/html/core/Command/Maintenance/Install.php(104): OC\Setup->install(Array)
#17 /var/www/html/3rdparty/symfony/console/Command/Command.php(298): OC\Core\Command\Maintenance\Install->execute(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#18 /var/www/html/3rdparty/symfony/console/Application.php(1040): Symfony\Component\Console\Command\Command->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#19 /var/www/html/3rdparty/symfony/console/Application.php(301): Symfony\Component\Console\Application->doRunCommand(Object(OC\Core\Command\Maintenance\Install), Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#20 /var/www/html/3rdparty/symfony/console/Application.php(171): Symfony\Component\Console\Application->doRun(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#21 /var/www/html/lib/private/Console/Application.php(211): Symfony\Component\Console\Application->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#22 /var/www/html/console.php(100): OC\Console\Application->run()

Describe your Environment

jessebot commented 10 months ago

Thanks for submitting an issue :)

  1. Could you please post your values.yaml that you're using for this installation? Make sure to anonymize any sensitive data before doing so. I want to check all of your database related settings (both internalDatabase and externalDatabase please) as well as your persistence settings.

  2. The external postgres database you are setting up, does it have persistence enabled?

  3. Instead of disabling the readiness and liveness probes, could you instead set them to be delayed and slower to check for liveness/readiness like this for example:

# this sets the liveness and readiness probe to be delayed by 2 minutes 
# and then it waits 20 seconds before checking again, 10 seconds before declaring a try failed, 
# and it will try 6 times before declaring the setup failed
livenessProbe:
  enabled: true
  initialDelaySeconds: 120
  periodSeconds: 20
  timeoutSeconds: 10
  failureThreshold: 6
  successThreshold: 1

readinessProbe:
  enabled: true
  initialDelaySeconds: 120
  periodSeconds: 20
  timeoutSeconds: 10
  failureThreshold: 6
  successThreshold: 1

Depending on your environment, you may have to set these to be longer delays or more retries, but this is an ok starting point.
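
To get a feel for how much time the settings above give the pod, the numbers can be added up. This is a minimal sketch, assuming the common approximation that a container which never passes its probe is acted on after about initialDelaySeconds + failureThreshold × periodSeconds. The function name is illustrative, and the real timing also depends on timeoutSeconds and on the kubelet.

```python
def probe_budget_seconds(initial_delay: int, period: int, failure_threshold: int) -> int:
    """Approximate seconds before Kubernetes gives up on a container
    that never passes the probe (an estimate, not an exact figure)."""
    return initial_delay + failure_threshold * period


# Values from the probe example above: 120s delay, 20s period, 6 failures.
print(probe_budget_seconds(120, 20, 6))  # 240

# The same settings with a longer initial delay of 300 seconds.
print(probe_budget_seconds(300, 20, 6))  # 420
```

If the installation regularly needs longer than this estimate, raising initialDelaySeconds or failureThreshold is the usual lever.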

mak241265 commented 10 months ago

Thanks for the quick response @jessebot. I did what you said and tested twice: the first time was successful, the second time unsuccessful -> Previous: PDOException: SQLSTATE[42P01]: Undefined table: 7 ERROR: relation "oc_appconfig" does not exist LINE 1: SELECT * FROM "oc_appconfig"

# internalDatabase config in values.yaml:
internalDatabase:
  enabled: false
  name: nextcloud

# externalDatabase config in values.yaml:

externalDatabase:
  enabled: true

  type: postgresql

  host: pg-postgresql-ha-pgpool.pg.svc.cluster.local

  user: nextcloud

  password: mypass

  database: nextcloud

  existingSecret:
    enabled: false
    # secretName: key1
    # usernameKey: key2
    # passwordKey: key3
    # hostKey: key4
    # databaseKey: key5

##pg-postgresql-ha-pgpool.pg.svc.cluster.local
## MariaDB chart configuration
## ref: https://github.com/bitnami/charts/tree/main/bitnami/mariadb
##
mariadb:  
  # to true  
  enabled: false

  auth:
    database: nextcloud
    username: nextcloud
    password: changeme
    existingSecret: ""

  architecture: standalone

  primary:
    persistence:
      enabled: false
      # Use an existing Persistent Volume Claim (must be created ahead of time)
      # existingClaim: ""
      # storageClass: ""
      accessMode: ReadWriteOnce
      size: 8Gi

postgresql:
  enabled: false
  global:
    postgresql:
      # global.postgresql.auth overrides postgresql.auth
      auth:
        username: nextcloud
        password: changeme
        database: nextcloud    
        existingSecret: ""
        secretKeys:
          adminPasswordKey: ""
          userPasswordKey: ""
          replicationPasswordKey: ""
  primary:
    persistence:
      enabled: false
      # Use an existing Persistent Volume Claim (must be created ahead of time)
      #existingClaim: "nextcloud-postgresql-pvc"
      # storageClass: ""

-- the way we create our nextcloud user and database:

CREATE USER nextcloud WITH PASSWORD 'mypass' SUPERUSER;
CREATE DATABASE nextcloud;
GRANT CONNECT ON DATABASE nextcloud TO nextcloud;
GRANT ALL PRIVILEGES ON DATABASE nextcloud to nextcloud;
GRANT ALL PRIVILEGES ON SCHEMA public TO nextcloud;
ALTER DATABASE nextcloud OWNER TO nextcloud;
jessebot commented 10 months ago

No problem, trying to take a look now.

Could I please ask that you edit your comment to use a code block with syntax highlighting for the values.yaml? You can do so by putting this on the line before the first line of your values.yaml: ```yaml

and then putting this on the line after your last line of your values.yaml: ```

This will render it a bit easier to read on Github like this:

  nextcloud:
    host: example.com
jessebot commented 10 months ago

not sure you should keep this, but for testing, can you try running this for your nextcloud database server?

GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO nextcloud; 

It's one of the things I set when I'm debugging database issues.

mak241265 commented 10 months ago

@jessebot same problem still... The really annoying thing is that sometimes the installation works and all tables are created, and sometimes not.

jessebot commented 10 months ago

:thinking: Typically that's a liveness/readiness probe issue or a persistence issue. Can you answer the following:

  1. Are you using persistent volumes with the bitnami postgres-ha chart? If you are, are you deleting them before each reinstall? And what storage class are you using?

  2. Are you using persistent volumes for your nextcloud data? If you are, are you deleting them before reinstall? And what storage class are you using?

  3. Are you setting any resource limits for memory/cpu? If so, can you verify the pod is not running up against those limits?

  4. Can you try with an even higher liveness/readiness probe delay? The reason for this is that, depending on your server's specs (memory/cpu) as well as your internet speed, the amount of time until the pod is ready/live can differ. Users who run low-spec hardware and/or whose internet speed is patchy or slower than average may need to increase their readiness/liveness probes to higher amounts of time. For instance, an initial delay may need to be 5 minutes for some users or, in some extreme cases, 10 minutes. Perhaps try 5 minutes (which would be initialDelaySeconds: 300 on both probes)?

mak241265 commented 10 months ago

To make sure persistent volumes are not the problem, I disabled persistence on both Nextcloud and PostgreSQL. Otherwise we use NFS right now for testing purposes. Resource limits are disabled. I increased initialDelaySeconds but nothing changed, and the problem still exists (sometimes the installation is successful, sometimes not). I do not think this is related to liveness/readiness, because the problem below occurs so fast:

Initializing nextcloud 27.0.0.8 ...
New nextcloud instance
Installing with PostgreSQL database
=> Searching for scripts (*.sh) to run, located in the folder: /docker-entrypoint-hooks.d/
==> but the hook folder "pre-installation" is empty, so nothing to do
Starting nextcloud installation
Error while trying to initialise the database: An exception occurred while executing a que
LINE 1: SELECT "version" FROM "oc_migrations" WHERE "app" = $1 ORDER... ...........
Trace: #0 /var/www/html/3rdparty/doctrine/dbal/src/Connection.php(1814): Doctrine\DBAL\Dri
#1 /var/www/html/3rdparty/doctrine/dbal/src/Connection.php(1749): Doctrine\DBAL\Connection
#2 /var/www/html/3rdparty/doctrine/dbal/src/Connection.php(1055): Doctrine\DBAL\Connection
#3 /var/www/html/lib/private/DB/Connection.php(262): Doctrine\DBAL\Connection->executeQuer
#4 /var/www/html/3rdparty/doctrine/dbal/src/Query/QueryBuilder.php(345): OC\DB\Connection- ...............

The only table that is created without a problem:

 Schema |     Name      | Type  |   Owner   
--------+---------------+-------+-----------
 public | oc_migrations | table | nextcloud
(1 row)
mak241265 commented 10 months ago

I'm not sure, but I think the problem is a delay in the creation of "oc_migrations". It is just a guess.

mak241265 commented 10 months ago

@jessebot I made some progress. I think the problem is the connection to PostgreSQL via the pgpool service. Below are the services the Bitnami postgresql-ha chart created:

pg-postgresql-ha-pgpool                ClusterIP   10.43.136.118   <none>        5432/TCP   16m
pg-postgresql-ha-postgresql            ClusterIP   10.43.7.80      <none>        5432/TCP   16m
pg-postgresql-ha-postgresql-headless   ClusterIP   None            <none>        5432/TCP   16m
pg-postgresql-ha-postgresql-metrics    ClusterIP   10.43.2.7       <none>        9187/TCP   16m

Instead of using the pg-postgresql-ha-pgpool svc, I used the headless svc (pg-postgresql-ha-postgresql-0.pg-postgresql-ha-postgresql-headless.pg.svc.cluster.local), and we ran many tests without any problem. But we definitely need to use the pgpool service, as it provides more functionality. As I said in my previous comment, I think there is a small delay when using the pgpool service compared to using the postgresql svc directly, and Nextcloud does not tolerate this delay. Any solution?
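
For readers unfamiliar with the long hostname above: it follows the standard Kubernetes DNS pattern for a StatefulSet pod behind a headless Service, `<pod>.<headless-service>.<namespace>.svc.<cluster-domain>`. A minimal sketch of how it is composed; the helper name is hypothetical and `cluster.local` is assumed as the cluster domain.

```python
def pod_dns_name(pod: str, headless_service: str, namespace: str,
                 cluster_domain: str = "cluster.local") -> str:
    """Build the stable DNS name of one StatefulSet pod behind a headless Service."""
    return f"{pod}.{headless_service}.{namespace}.svc.{cluster_domain}"


# Names taken from the service listing in this comment; "pg" is the namespace.
print(pod_dns_name("pg-postgresql-ha-postgresql-0",
                   "pg-postgresql-ha-postgresql-headless",
                   "pg"))
```

Note that this addresses a single PostgreSQL pod and bypasses pgpool, so nothing redirects the connection if that pod stops being the primary.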

mak241265 commented 10 months ago

After several more tests we faced the error again.

provokateurin commented 10 months ago

@mak241265 may I ask who "we" is? If you are talking about a company you should check out the commercial support that Nextcloud GmbH provides. This project is run by the community and not an official way to install Nextcloud.

mak241265 commented 10 months ago

Well, right now we are just testing this project. I checked on the internet and many people face this problem as well.

jessebot commented 10 months ago

As Kate points out, the official ways to install Nextcloud don't include this helm chart at this time, in part due to the fact that the docker image (nextcloud/docker) is also community created and maintained. I am a volunteer and can only help when I have some spare time.

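That being said, if I understand correctly:

  • this happens when you don't use persistence at all, so it's not that.
  • this only happens when you use pgpool with postgresql-ha

postgresql-ha isn't supported in this helm chart right now, but as a guess, can you turn the pgpool feature on after you initialize nextcloud? :thinking: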

If you can't do that, then perhaps you'd need to dig into the occ maintenance:install command that nextcloud/docker runs: https://github.com/nextcloud/docker/blob/f9ae675c1ac2aed735435e84dd1794eb28890103/27/fpm/entrypoint.sh#L215-L228

which doesn't include any sort of actual calls to postgres itself in the docker container, so this would require you to dive into the nextcloud/server repo and check how the database initialization is done there.

mak241265 commented 10 months ago

As Kate points out, the official ways to install Nextcloud don't include this helm chart at this time, in part due to the fact that the docker image (nextcloud/docker) is also community created and maintained. I am a volunteer and can only help when I have some spare time.

That being said, if I understand correctly:

  • this happens when you don't use persistence at all, so it's not that.
  • this only happens when you use pgpool with postgresql-ha

postgresql-ha isn't supported in this helm chart right now, but as a guess, can you turn the pgpool feature on after you initialize nextcloud? 🤔

If you can't do that, then perhaps you'd need to dig into the occ maintenance:install command that nextcloud/docker runs: https://github.com/nextcloud/docker/blob/f9ae675c1ac2aed735435e84dd1794eb28890103/27/fpm/entrypoint.sh#L215-L228

which doesn't include any sort of actual calls to postgres itself in the docker container, so this would require you to dive into the nextcloud/server repo and check how the database initialization is done there.

Thanks for your comment. So what are the best ways to install Nextcloud for production use? The helm chart supports an external database and we used postgresql-ha. Do you mean that is wrong? Shall we use a single PostgreSQL?

"this happens when you don't use persistence at all, so it's not that." => There is no exact answer to this, as we randomly face the problem whether or not we use persistence.

jessebot commented 10 months ago

So what are the best ways to install nextcloud for production use?

I'm not a member of Nextcloud GmbH, but I am not sure if they have a production supported helm chart at this time. @provokateurin could comment further, but I think for containerized, the officially supported container is nextcloud/all-in-one. I'm basing that educated guess on the note in the README here that says:

⚠ This image is not officially supported by Nextcloud GmbH, use at your own risk. Use the All-in-One docker image for easier deployment.

To answer your final question:

The helm chart supports an external database and we used postgresql-ha. Do you mean that is wrong? Shall we use a single PostgreSQL?

I just meant that this chart is specifically tested with the bitnami postgresql chart, instead of postgresql-ha. I still agree we should support postgresql-ha, but at the moment, to my knowledge, only postgresql is tested regularly as it's baked into this chart as a sub chart here:

https://github.com/nextcloud/helm/blob/268defe1fff1af02a83fe2f1f9d22e9a71e0f797/charts/nextcloud/Chart.yaml#L24-L28

If you or any other user in the community is able to dedicate the time to debug postgresql-ha working with nextcloud, we'd happily review a PR here, but it seems like the current issue isn't with this chart, so much as it's with nextcloud itself, which would require you to dig into the server code as mentioned in https://github.com/nextcloud/helm/issues/436#issuecomment-1704292587

If any other community member has made this work with postgresql-ha, we'd love to hear how you did it. But until then, seeking out a support contract with Nextcloud GmbH may be your best way forward.

provokateurin commented 10 months ago

@mak241265 For a commercial Helm Chart you can contact sales@nextcloud.com. I'm not involved in it so I can't give you any details about it. Other than that AIO which Jesse mentioned or the classic way are supported AFAIK.

jessebot commented 10 months ago

I wonder if this is actually a similar issue as https://github.com/nextcloud/helm/issues/308

That issue also pointed to the nextcloud/server repo as the source of this issue.

mak241265 commented 10 months ago

@jessebot I checked #308 and I think it is not related to our issue. Their problem is related to a misconfiguration of the Bitnami postgresql-ha chart.

jessebot commented 10 months ago

Sorry about that, but thanks for chiming in on the other issue! Really appreciate that.

Also the comment I deleted was just a duplicate comment.

hachh commented 10 months ago

I encountered the same problem today, after digging into the problem I found that it works if I reduce pgpool to 1 replica.

I guess the nextcloud installation doesn't use sql transactions at all and falls into sql concurrency issues when there are two pgpool replicas as the two pgpool instances don't necessarily commit in the order nextcloud wants....
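
A related way such an intermittent "relation does not exist" error could arise, offered only as a hypothesis and not confirmed anywhere in this thread, is a read that lands on a backend which has not yet seen a preceding write. The toy model below is not pgpool or PostgreSQL code: `ToyCluster` and its methods are invented for illustration, and replication is reduced to an explicit `replicate()` call.

```python
class ToyCluster:
    """Toy model: one primary and one replica that only catches up
    when replicate() is called. Not real pgpool/PostgreSQL behaviour."""

    def __init__(self):
        self.primary = set()
        self.replica = set()

    def create_table(self, name):
        # Writes always go to the primary.
        self.primary.add(name)

    def replicate(self):
        # The replica applies everything the primary has so far.
        self.replica = set(self.primary)

    def select_from(self, name, backend):
        # A read is served by whichever backend it was routed to.
        tables = self.primary if backend == "primary" else self.replica
        if name not in tables:
            raise LookupError(f'relation "{name}" does not exist')
        return []


cluster = ToyCluster()
cluster.create_table("oc_migrations")

# Read routed to the primary: the table is visible.
cluster.select_from("oc_migrations", "primary")

# Read routed to the replica before it caught up: the table is "missing".
try:
    cluster.select_from("oc_migrations", "replica")
except LookupError as err:
    print(err)

# After replication the same read succeeds.
cluster.replicate()
cluster.select_from("oc_migrations", "replica")
```

In this model the table exists the whole time, yet one read fails, which matches the symptom of oc_migrations being present in the database while the installer reports it as undefined. Whether this is what actually happens here would need to be checked against the pgpool configuration and logs.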

mak241265 commented 10 months ago

Hello @hachh, thanks for the reply. The default is one pgpool instance when you use the Bitnami postgresql-ha chart, and I did not increase the pgpool replicas.

hachh commented 10 months ago

You're right, I've also temporarily reduced the postgresql replicas to 1.

andrew-aiken commented 4 months ago

Any updates with this?

I have been trying to get this to work myself. I found that Nextcloud creates an oc_admin database user, which is not allowed to connect over pgpool, so the install fails, and then it tries again with another user such as oc_admin14. What currently works for me is to have it connect directly to the db ☹️