unoconv / unoserver

MIT License

Possible memory leak #129

Closed adamantal closed 2 months ago

adamantal commented 3 months ago

Hey!

First of all, thank you guys for this great project! We've just recently started using it, and it just fits right to our use cases 👌

Description

Converting multiple documents over time accumulates memory and eventually results in an out of memory error.

Context

We hit https://github.com/unoconv/unoserver/issues/108 and tried to tackle the issue, but without success. We're running unoserver in Kubernetes, so we added a cloud-native workaround: a liveness probe that checks whether unoserver is able to convert a very minimal, 1-page PDF document to PNG. This conversion usually finishes in under 1 s, so it's ideal for checking periodically and restarting the server if it does not respond within a given timeframe.
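For illustration, such a probe can be sketched as a small script that runs a conversion with a hard timeout and exits nonzero on failure, so the kubelet restarts the pod. This is a hypothetical sketch, not our actual setup; the file paths, host, and port are placeholders:

```python
#!/usr/bin/env python3
"""Hypothetical liveness check: convert a tiny test file, fail on timeout."""
import subprocess

TIMEOUT_SECONDS = 10  # restart threshold; tune to your workload


def check(test_file: str, output_file: str) -> int:
    """Return 0 if the conversion succeeds in time, 1 otherwise."""
    try:
        subprocess.run(
            ["unoconvert", "--convert-to", "png",
             "--host", "127.0.0.1", "--port", "2003",
             test_file, output_file],
            check=True,
            timeout=TIMEOUT_SECONDS,
        )
    except (subprocess.TimeoutExpired,
            subprocess.CalledProcessError,
            FileNotFoundError):  # a missing binary also counts as unhealthy
        return 1  # nonzero exit tells Kubernetes to restart the container
    return 0


# Wire this up as an exec liveness probe, e.g.:
#   raise SystemExit(check("/probes/test.pdf", "/tmp/probe.png"))
```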

What we experienced is that unoserver accumulates memory over time because of these small, periodic conversions. See the attached screenshot below showing memory usage over time in Grafana:

[Screenshot 2024-08-24 at 14:21:50: Grafana graph of unoserver memory usage over time]

(Note that the sudden jumps are due to real conversions; we do a few dozen daily. As you can see, that memory is also retained.)

Steps to reproduce

I think a bash script like this would be able to reproduce this:

#!/bin/bash

while true; do
    unoconvert \
        --convert-to "pdf" \
        --host "$UNOSERVER_ADDRESS" \
        --port "$UNOSERVER_PORT" \
        --host-location "local" \
        "$TEST_FILE" \
        "$OUTPUT_FILE"
    rm -f "$OUTPUT_FILE"
    sleep 10
done

Other notes

I'm not super knowledgeable about the XMLRPC implementation in Python, but IMO it's unlikely that the memory leak is caused on the caller side (e.g. by interrupting the unoconvert call).

After taking a look at the code, I could not locate the root cause. My two suspicions are that it's either some kind of caching or keeping the input in memory on the XMLRPC side, or that it's caused by the libreoffice process directly. If it's the latter, we can work around it somewhat with an HA setup; but if it's the former, it might be a bug in unoserver itself, and you should be aware of it.

Any advice or suggestions on the configuration are also appreciated.

regebro commented 3 months ago

Python's memory management is generally good, and we don't have any variables that could accumulate data, so I doubt it's unoserver; more likely it's LibreOffice.

adamantal commented 2 months ago

Some extra info, if it's helpful. The libreoffice subprocess doesn't seem to increase its memory consumption:

$ while true; do ps -p 28 -o %mem,rss,comm; sleep 10; done
%MEM   RSS COMMAND
 1.8 291672 soffice.bin
%MEM   RSS COMMAND
 1.8 291648 soffice.bin
%MEM   RSS COMMAND
 1.8 291648 soffice.bin
%MEM   RSS COMMAND
 1.8 291648 soffice.bin
%MEM   RSS COMMAND
 1.8 291636 soffice.bin

while the unoserver does:

$ while true; do ps -p 7 -o %mem,rss,comm; sleep 10; done
%MEM   RSS COMMAND
 2.1 336424 unoserver
%MEM   RSS COMMAND
 2.1 343600 unoserver
%MEM   RSS COMMAND
 2.1 343600 unoserver
%MEM   RSS COMMAND
 2.1 346236 unoserver
%MEM   RSS COMMAND
 2.1 348620 unoserver
%MEM   RSS COMMAND
 2.2 350068 unoserver
%MEM   RSS COMMAND
 2.2 350068 unoserver
%MEM   RSS COMMAND
 2.2 354836 unoserver

I threw a few heap allocation and performance tools at it, but only tracemalloc was able to give me something valuable. The top 3 entries from tracemalloc.take_snapshot().statistics('lineno'):

/usr/lib/python3/dist-packages/uno.py:507: size=34.0 MiB, count=782923, average=46 B
/usr/local/lib/python3.12/dist-packages/unoserver/converter.py:123: size=839 KiB, count=15346, average=56 B
/usr/local/lib/python3.12/dist-packages/unoserver/converter.py:125: size=776 KiB, count=15280, average=52 B

The first line seems to be the issue (I can see its memory increasing upon each call). Based on the source code, however, this seems to be a generic getter, so it doesn't actually tell us much. It's probably a hanging reference to an uno object that is not picked up by the gc.
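For reference, the measurement above can be reproduced with a minimal tracemalloc sketch; the list allocation in the middle is just a stand-in for whatever unoserver allocates per conversion, not real unoserver code:

```python
import tracemalloc


def top_allocations(n: int = 3):
    """Return the n largest allocation sites, grouped by source line."""
    snapshot = tracemalloc.take_snapshot()
    return snapshot.statistics("lineno")[:n]


# In unoserver this would be started as early as possible, e.g. at startup.
tracemalloc.start()

# Stand-in for the per-conversion allocations that accumulate in practice.
retained = [b"x" * 1024 for _ in range(1000)]

for stat in top_allocations():
    print(stat)  # lines like "<file>:<lineno>: size=..., count=..., average=..."

tracemalloc.stop()
```

Taking a snapshot periodically (e.g. from a debug hook) and diffing with snapshot.compare_to() makes growth per call visible.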

PierreCarceller commented 2 months ago

No update? I'm having the same kind of problem...

@adamantal What version of unoserver are you using?

regebro commented 2 months ago

Fixed in 2.2.2.

adamantal commented 1 month ago

Sorry, I forgot to get back to this thread. We're on 2.1 currently, but we'll update to 2.2.2 and get back to you to confirm that the memory leak is solved.

regebro commented 1 month ago

2.2.2 may have a stability problem. I can't confirm that it's because of this change, but we did have some crashes, which are handled better in the various betas that I released this week. So testing 3.0b2 might be better.

adamantal commented 3 weeks ago

We had some API incompatibility issues, so we've finally bumped to 2.2.2 and didn't experience much instability.

can confirm that the memory leak is resolved. thank you very much 🙏