inovex / CalendarSync

Stateless CLI tool to sync calendars across different calendaring systems.
MIT License

CalendarSync has quite a large memory footprint #180

Open frittentheke opened 3 months ago

frittentheke commented 3 months ago

I looked a little closer at the output of systemd and noticed how much memory CalendarSync apparently uses: with the current 0.10.0 release, the memory peak fluctuates between roughly 300 MB and just under 600 MB.

Aug 08 15:12:47 : CalendarSync.service: Consumed 711ms CPU time, 301.4M memory peak.
Aug 09 07:55:08 : CalendarSync.service: Consumed 1.647s CPU time, 546.3M memory peak.
Aug 09 08:25:07 : CalendarSync.service: Consumed 594ms CPU time, 306.2M memory peak.
Aug 09 08:55:07 : CalendarSync.service: Consumed 795ms CPU time, 296.9M memory peak.
Aug 09 09:25:31 : CalendarSync.service: Consumed 758ms CPU time, 318.8M memory peak.
Aug 09 09:55:07 : CalendarSync.service: Consumed 698ms CPU time, 294.7M memory peak.
Aug 09 10:25:07 : CalendarSync.service: Consumed 612ms CPU time, 296.6M memory peak.
Aug 09 10:55:08 : CalendarSync.service: Consumed 1.822s CPU time, 540.2M memory peak.
Aug 09 11:25:08 : CalendarSync.service: Consumed 576ms CPU time, 294.1M memory peak.
Aug 09 11:55:07 : CalendarSync.service: Consumed 567ms CPU time, 294.4M memory peak.
Aug 09 12:25:07 : CalendarSync.service: Consumed 569ms CPU time, 294M memory peak.
Aug 09 12:55:08 : CalendarSync.service: Consumed 562ms CPU time, 296.6M memory peak.
Aug 09 13:25:08 : CalendarSync.service: Consumed 733ms CPU time, 294.2M memory peak.
Aug 09 13:55:08 : CalendarSync.service: Consumed 1.954s CPU time, 541.9M memory peak.
Aug 09 14:25:09 : CalendarSync.service: Consumed 788ms CPU time, 293.9M memory peak.
Aug 12 08:00:23 : CalendarSync.service: Consumed 3.388s CPU time, 552.4M memory peak.

During validation of https://github.com/inovex/CalendarSync/pull/177 I noticed an even bigger memory footprint, especially for the first run:

CalendarSync.service: Consumed 10.940s CPU time, 1.5G memory peak.

Later runs also appear to use more memory than before:

CalendarSync.service: Consumed 3.885s CPU time, 954.7M memory peak.

This is without any changes to the source (or updates being pushed to the sink).

So maybe the cause lies somewhere in the use of the https://github.com/microcosm-cc/bluemonday sanitizer? There was a change allowing bluemonday to stream its output directly to a writer: https://github.com/microcosm-cc/bluemonday/pull/110. Maybe that would help somewhat?

Let me know if you want me to run pprof or provide any other debug info.

^^ @alxndr13 FYI

MichaelEischer commented 1 month ago

I did a bit of memory profiling. The memory usage peak is probably mostly caused by https://github.com/inovex/CalendarSync/blob/ccdebc534f875ae0c3a6aa2fd3ee185a553e3ea8/internal/auth/storage_encryption.go#L38 . At least on my machine, that line allocates a whopping 256MB of memory.

The higher memory usage with the non-release builds is probably caused by https://github.com/inovex/CalendarSync/pull/195 .

Bluemonday doesn't use particularly much memory. And as the transformers are called sequentially, Go can run the GC between transformations if necessary. https://github.com/inovex/CalendarSync/pull/197 should reduce the peak memory usage by running a GC after the storage decryption is completed. That should at least limit the memory peak to around 300MB.
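
To illustrate the idea of forcing a GC right after a step that needs a large temporary buffer, here is a minimal, self-contained sketch. It is not the code from the PR: decryptStorage is a hypothetical stand-in that only allocates a scratch buffer, and the 64 MiB size is an arbitrary assumption chosen to keep the demo small.

```go
package main

import (
	"fmt"
	"runtime"
)

// heapAlloc returns the bytes of allocated heap objects as seen by the Go
// runtime. This includes unreachable objects the GC has not freed yet.
func heapAlloc() uint64 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc
}

// decryptStorage is a HYPOTHETICAL stand-in for the storage decryption step:
// it needs a large scratch buffer but returns only a small result.
func decryptStorage(scratchBytes int) []byte {
	scratch := make([]byte, scratchBytes)
	for i := range scratch { // touch the buffer so the memory is really used
		scratch[i] = byte(i)
	}
	result := make([]byte, 32)
	copy(result, scratch[:32])
	return result
}

// measure runs the stand-in and reports the heap size before and after a
// forced garbage collection, together with the small result.
func measure(scratchBytes int) (beforeGC, afterGC uint64, token []byte) {
	token = decryptStorage(scratchBytes)
	beforeGC = heapAlloc() // scratch is garbage now, but usually not freed yet
	runtime.GC()           // blocks until the collection is complete
	afterGC = heapAlloc()
	return beforeGC, afterGC, token
}

func main() {
	before, after, token := measure(64 << 20)
	fmt.Printf("heap before GC: %d MiB\n", before>>20)
	fmt.Printf("heap after GC:  %d MiB\n", after>>20)
	fmt.Printf("result size:    %d bytes\n", len(token))
}
```

Without the explicit runtime.GC(), the dead buffer would typically stay on the heap until the next automatic collection, so allocations made by later steps can stack on top of it and raise the peak. Note that runtime.GC() frees heap memory for reuse inside the process; whether and when it is returned to the operating system is a separate matter.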

MichaelEischer commented 1 month ago

Btw, I've used the following patch for memory profiling (run go mod tidy to get the library):

diff --git a/cmd/calendarsync/main.go b/cmd/calendarsync/main.go
index 3cb4ac6..de2b3cc 100644
--- a/cmd/calendarsync/main.go
+++ b/cmd/calendarsync/main.go
@@ -10,6 +10,7 @@ import (
        "github.com/inovex/CalendarSync/internal/models"

        "github.com/charmbracelet/log"
+       "github.com/pkg/profile"
        "github.com/urfave/cli/v2"

        "github.com/inovex/CalendarSync/internal/adapter"
@@ -103,6 +104,8 @@ func main() {
 }

 func Run(c *cli.Context) error {
+       defer profile.Start(profile.MemProfile, profile.MemProfileRate(1)).Stop()
+
        if c.Bool(flagVersion) {
                fmt.Println("Version:", Version)
                os.Exit(0)
@@ -211,5 +214,8 @@ func Run(c *cli.Context) error {
                        log.Fatalf("we had some errors during synchronization:\n%v", err)
                }
        }
+
+       // this looks weird, but without it the profile was incomplete
+       runtime.GC()
+       time.Sleep(1 * time.Second)
        return nil
 }