sergei-maertens opened this issue 3 years ago
Memory usage graph from our deployment on k8s:
Hey @sergei-maertens,
so much for relying on Python's GC for cleaning stuff up. Since we do not use any C modules, there is not really a `free()` to call :smile:
I tried to replicate with your instructions and saw some increase, but nothing as big as 500MB of dangling memory. We do make a lot of calls into DRF and Django internals, but nothing problematic sticks out yet. They do make a fair amount of use of `WeakRef`, though. I suppose we need to find the references that block the GC from cleaning up and convert those to `WeakRef`s. That would be my initial guess. We definitely need to break this down further and get an understanding of where the leakage is coming from.
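To illustrate the idea with a hypothetical holder class (not drf-spectacular's actual internals): a strong attribute keeps its referent alive for as long as the holder exists, while a `weakref` lets the GC reclaim it.

```python
import weakref


class SchemaCache:
    """Hypothetical holder, only to illustrate strong vs. weak references."""

    def __init__(self, registry):
        # self.registry = registry                   # strong ref: keeps `registry` alive
        self._registry_ref = weakref.ref(registry)   # weak ref: GC may collect it

    @property
    def registry(self):
        # Returns the registry, or None once it has been garbage collected.
        return self._registry_ref()
```

Note that plain `dict`/`list` instances cannot be weakly referenced directly, so this only works for objects that support weak references.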
Sidenote:
I thought about caching the response before, but there has not been a demand for it yet. In the basic case, the schema is static per server instance, but if you use the `'SERVE_PUBLIC': False` setting (a partial schema limited to what you have access to) or the i18n features, it gets more complicated.
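For context, the setting in question lives in `SPECTACULAR_SETTINGS`; the value below is only an example.

```python
# settings.py (example): with SERVE_PUBLIC disabled, the generated schema is
# filtered by the requesting user's permissions, so a single cached copy would
# no longer fit every caller.
SPECTACULAR_SETTINGS = {
    "SERVE_PUBLIC": False,
}
```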
How much does the memory footprint increase per API call?
I'm observing between 0.5 and 5MB - the weird thing is that it's not consistent between calls. We have of course a base memory usage: app startup sits at around ~300MB, and over time it just increments until the memory limit is hit.
One example of an API schema can be found here: https://github.com/open-formulieren/open-forms/blob/master/src/openapi.yaml. Can I send you a link somewhere privately so you can see the API schema yourself without me having to paste it here publicly?
> We definitely need to break this down further and get an understanding of where the leakage is coming from.
That I agree with 100%! I mostly wanted to get it reported in case other people also experience issues and as a self-reminder.
> I thought about caching the response before, but there has not been a demand for it yet. In the basic case, the schema is static per server instance, but if you use the `'SERVE_PUBLIC': False` setting (a partial schema limited to what you have access to) or the i18n features, it gets more complicated.
Eh, at the view level we can leverage `django.core.cache` around it, so it's not necessarily a library feature that's needed at the moment.
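A minimal sketch of what I mean, assuming a plain public, non-i18n schema so a single cached copy is valid for everyone (timeout and URL name are arbitrary):

```python
# urls.py: cache the schema response at the view level with Django's cache
# framework instead of a library feature.
from django.urls import path
from django.views.decorators.cache import cache_page
from drf_spectacular.views import SpectacularJSONAPIView

urlpatterns = [
    path(
        "api/v1/",
        cache_page(60 * 15)(SpectacularJSONAPIView.as_view()),
        name="api-schema-json",
    ),
]
```

That would at least avoid regenerating the schema on every health-check hit, although it only papers over the underlying leak.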
> I'm observing between 0.5 and 5MB
I saw something in that range, but it also did not behave linearly.
> Can I send you a link somewhere privately so you can see the API schema yourself without me having to paste it here publicly?
Thanks, but not necessary yet. Just wanted to get a feel for how large it is. I would call `open-forms` mid-sized.
> I mostly wanted to get it reported in case other people also experience issues and as a self-reminder.
:+1: Let me know if you find anything pointing in a specific direction! I will dig deeper when I can allot some time, but any help is appreciated.
@sergei-maertens Was this issue encountered when using Python 3.10? If so, 3.10.2 was recently released with a fix for a memory leak that I think could significantly affect drf-spectacular.
Links for info:
It'd be interesting to see if you can generate some details again from before (3.10.1) and after (3.10.2) the fix.
Unfortunately it's not that simple. This is seen on Python 3.8 and Python 3.9 :(
I'm having some issues with max recursion depth with postprocessing hooks (after calling the schema endpoint several times), which may be related to your issue. Could you maybe try to disable the hooks, call the schema endpoint a few times and observe whether it still leaks memory?
I'll open a separate issue for my problem within the next few days, but I still need to figure out a minimal reproducible example for it.
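For anyone wanting to try that, disabling the hooks is a one-line settings change, assuming you have no custom hooks beyond the default:

```python
# settings.py: temporarily run schema generation without any postprocessing
# hooks (the default is ['drf_spectacular.hooks.postprocess_schema_enums']).
SPECTACULAR_SETTINGS = {
    "POSTPROCESSING_HOOKS": [],
}
```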
> I'm having some issues with max recursion depth with postprocessing hooks
@StopMotionCuber any insight would be appreciated! Though I have trouble understanding where it could possibly happen in the postprocessing. The postprocessing framework is conceptually super simple and not much magic is going on there. Of course I cannot judge whether you do something funky in your custom hooks.
The default `postprocess_schema_enums`, although complicated, only performs basic iterative operations and is largely self-contained. There is a recursion in there, but it only traverses the schema tree once. So unless you have `oneOf` structures nested 100 levels deep, which would be very unlikely, this should never happen.
@sergei-maertens are you by any chance using the rollup blueprint?
I don't think so, but we are doing something similar with other flavors of polymorphism; we're definitely resolving components!
There might be a leak there, but since you are not using it, never mind... "these are not the droids you are looking for" :smile:
Describe the bug
Using the `SpectacularJSONAPIView` (https://github.com/open-formulieren/open-forms/blob/master/src/openforms/api/urls.py#L59), we are observing a memory leak: the memory of the process keeps growing every time the schema endpoint is hit. We noticed this because the API schema endpoint was configured as the Kubernetes pod health check, and the container was getting OOM-killed in a very regular pattern. This happens with two of our apps, both on drf-spectacular 0.17.2, and we confirmed that 0.20.2 does not fix it (yet).
To Reproduce
1. Hook up the `SpectacularJSONAPIView` in your URL conf.
2. Run the server (`manage.py runserver` works, but this has also been reproduced with uwsgi).
3. Watch `top` for that process: `top -p <PID>`, and hit `e` to get the VIRT/RES memory in megabytes - use a separate shell/tab for this.
4. `curl http://localhost:8000/api/v1/` repeatedly and watch the memory grow (see the request-loop sketch below).
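The request loop can be scripted if curl-by-hand gets tedious; a stdlib-only sketch (the URL is assumed from our routing above):

```python
# Hammer the schema endpoint while `top -p <PID>` runs in another shell.
from urllib.request import urlopen

SCHEMA_URL = "http://localhost:8000/api/v1/"  # adjust to your URL conf

for i in range(200):
    with urlopen(SCHEMA_URL) as response:
        response.read()
    if (i + 1) % 20 == 0:
        print(f"{i + 1} requests done")
```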
I'll see if I can find the time to set up a minimal reproducing project without any extra dependencies.
Additionally, I did some debugging with the `mem_top` package (taken from https://github.com/GemeenteUtrecht/zaakafhandelcomponent/issues/490). After doing a couple more curl requests, you can see the refs/bytes of drf-spectacular-related data structures increase.
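For reference, the `mem_top` usage amounts to something like this; where you place it (a view, a shell session) is up to you:

```python
# pip install mem_top
from mem_top import mem_top

# Print the largest reference holders; run once as a baseline, then again
# after a few requests to /api/v1/ and compare the drf-spectacular entries.
print(mem_top())
```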
Expected behavior
There should not be memory leaks.