alcionai / corso

Free, Secure, and Open-Source Backup for Microsoft 365
https://corsobackup.io
Apache License 2.0

[Bug]: corsobackup eats an incredible amount of RAM on k8s and runs into 48GB limit #4512

Closed by terminar 1 year ago

terminar commented 1 year ago

What happened?

Running the corsobackup container image (Corso 0.14.0) on Kubernetes with memory limits of 32GB or even 48GB, backing up some Exchange (M365) accounts holding 2+ years of mail and some, possibly larger, SharePoint sites uses all available memory, and the process is then killed by the Linux kernel's OOM killer.

The problem occurs on the first full run of the backup. I was not able to back up several Exchange accounts with only 32GB available.

Are such problems known?

Corso Version?

Corso v0.14.0

Where are you running Corso?

Running on Kubernetes 1.20, Linux 5.10.19-200.fc33.x86_64. Storage: MinIO

Relevant log output

No response

ntolia commented 1 year ago

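@terminar A few questions:

Are you running multiple backups in parallel within the container or only doing a single backup at a time?

For backups that are running out of memory, can you share the size of the mailbox (available via the Exchange admin UI) and the number of items in it?

Can you share the number of files and size of the SharePoint sites?

What is the exact command you are using? In particular, are you specifying a particular site or mailbox, or are you using the * pattern to back up all mailboxes/sites?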

We are also on Discord if that would be easier for you.

terminar commented 1 year ago

Are you running multiple backups in parallel within the container or only doing a single backup at a time?

Single line, single backup. Script below.

For backups that are running out of memory, can you share the size of the mailbox (available via the Exchange admin UI) and the number of items in it?

testuser@somedomain.com: 5.63GB / around 100k mails
testuser2@somedomain.com: 7.94GB / probably also around 100k mails

Can you share the number of files and size of the SharePoint sites?

somesite: 37GB / 9k files
somesite2: 55GB / 24k files
somesite3: 84GB / 75k files
somesite4: 156GB / 3k files

What is the exact command you are using? In particular, are you specifying a particular site or mailbox, or are you using the * pattern to back up all mailboxes/sites?

Particular site/mailbox in a bash for-loop.

Script example

          echo "corso repo init..."
          /corso repo init s3 --bucket "$BUCKET" --disable-tls --endpoint minio

          echo "corso repo connect..."
          /corso repo connect s3 --bucket "$BUCKET" --disable-tls --endpoint minio

          # ACCOUNT is used instead of USER so the loop does not overwrite the shell's own $USER variable.
          ACCOUNTS="testuser@somedomain.com testuser2@somedomain.com"
          for ACCOUNT in $ACCOUNTS ; do
            echo "Exchange backup - $ACCOUNT"
            /corso --hide-progress backup create exchange --mailbox "$ACCOUNT" || { echo "backup error > exchange: $ACCOUNT"; exit 1; }
            echo ""

            echo "OneDrive backup - $ACCOUNT"
            /corso --hide-progress backup create onedrive --user "$ACCOUNT" || { echo "backup error > onedrive: $ACCOUNT"; exit 1; }
            echo ""
          done

          echo "===> Finished Exchange + OneDrive sync"

          SPSITES="somesite somesite2 somesite3 somesite4"
          for SITE in $SPSITES ; do
            echo "Sharepoint backup - $SITE"
            /corso --hide-progress backup create sharepoint --site "https://somedomain.sharepoint.com/sites/$SITE" || { echo "backup error > sharepoint: $SITE" ; exit 1; }
          done
          echo "===> Finished Sharepoint sync"
          sync
          echo "Sleeping 60s"
          sleep 60
          echo "Done."
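
The error handlers in the script above print a generic "backup error" message and discard the exit status, so an OOM kill looks like any other failure. A minimal sketch of a wrapper that keeps the status and flags a SIGKILL follows. The helper name `run_and_classify` is hypothetical and not part of Corso. Exit status 137 (128 + 9) only shows that the process received SIGKILL, which is what the kernel OOM killer sends, but other senders are possible.

```shell
# Hypothetical helper (not part of Corso): run a command, report how it ended,
# and pass its exit status through to the caller.
run_and_classify() {
  rc=0
  "$@" || rc=$?
  if [ "$rc" -eq 137 ]; then
    # 137 = 128 + 9: the shell's way of reporting death by SIGKILL.
    echo "killed by SIGKILL (exit status 137); possibly the OOM killer"
  elif [ "$rc" -ne 0 ]; then
    echo "failed with exit status $rc"
  else
    echo "ok"
  fi
  return "$rc"
}
```

In the backup loop this could be used as `run_and_classify /corso --hide-progress backup create exchange --mailbox "testuser@somedomain.com" || exit 1`, so the log shows whether a failed run was killed rather than exiting on its own.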

ntolia commented 1 year ago

Thanks @terminar!

For others who might be subscribed to this issue, the conversation did move to Discord and it is being investigated there.

ntolia commented 1 year ago

Some updates: