absinthe-graphql / absinthe

The GraphQL toolkit for Elixir
http://absinthe-graphql.org

Huge max_heap_size compared to the actual data size #1245

Open IvanIvanoff opened 1 year ago

IvanIvanoff commented 1 year ago

Environment

Actual behavior

A few days ago our application started getting killed with an Out of memory error. What we found is that for some queries, even when the size of the response is small, the memory consumed by the process executing the GraphQL query is huge. When multiple such queries run concurrently, RAM usage jumps from 1 GB to over 6 GB. As a temporary solution we set the :max_heap_size flag in the resolver before processing the query, so these queries are killed instead of taking down the whole application.
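For reference, the guard looks roughly like this (a sketch, not our exact code; the 150 MB cap is illustrative):

# Set in the process that will execute the query, before the
# expensive work starts. :max_heap_size is measured in machine
# words; on a 64-bit VM one word is 8 bytes, so this caps the
# process at roughly 150 MB.
max_heap_words = div(150 * 1024 * 1024, 8)

Process.flag(:max_heap_size, %{
  size: max_heap_words,
  kill: true,          # kill this process instead of the whole VM
  error_logger: true   # emit an error report when the limit is hit
})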

Here is a project that reproduces the issue: https://github.com/IvanIvanoff/absinthe_memory

In this project, we observe that even though the HTTP response is only 828 kB, Absinthe peaks at over 50 MB of RAM usage while processing it.

Expected behavior

Peak memory usage should not be this large.

benwilson512 commented 1 year ago

Happy to look into it. I suspect that using persistent term will help. Overall though, for bulk data like timeseries data, the fact remains that there is going to be a fair bit of overhead, both in processing and memory, compared to raw JSON. Absinthe is going to type-annotate and type-check every single value, and if you want to move thousands of values, that adds up.

benwilson512 commented 1 year ago

Overall I would suggest that you treat bulk data transfers a bit like you'd treat images or binary data in GraphQL: Link to them. We have a bulk data controller that takes a set of signed parameters and returns JSON. Then in our GraphQL we return links to those endpoints with the signed params embedded.
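The pattern looks roughly like this (a sketch with illustrative names, not our actual code; signing here uses Phoenix.Token):

object :metric do
  # Instead of embedding thousands of data points in the GraphQL
  # response, return a signed link to a plain JSON endpoint.
  field :bulk_data_url, :string do
    resolve(fn metric, _args, _resolution ->
      token = Phoenix.Token.sign(MyAppWeb.Endpoint, "bulk_data", metric.id)
      {:ok, "/api/bulk_data?token=" <> token}
    end)
  end
end

The bulk data controller verifies the token and returns the raw JSON, so none of it passes through Absinthe's type machinery.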

crayment commented 1 month ago

Just wanted to voice that we are also running into this issue. We would love to be able to return large numbers of records through GraphQL, but these massive memory spikes make that impossible, and we are having to move to other solutions.

I don't understand the problem well enough to know whether type annotation and checking inherently require as much memory as we are seeing. The sample app linked above using 50x the memory of the actual response seems excessive, but perhaps that's just what it is? It feels tempting to look for some memory wins here...

benwilson512 commented 1 month ago

Hey @crayment There is certainly no low-hanging fruit here. For something as simple as a JSON response of {"hello":"world"}, the key "hello" is just 5 bytes, but there are easily several hundred bytes worth of data structures holding on to the type information for casting, validation, and traceback information in case there is an error.

Are you using the persistent term backend? That'd certainly be the first place to start from a practical standpoint.
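For reference, opting in is a small change in the schema module (a sketch; module and field names are illustrative):

defmodule MyAppWeb.Schema do
  use Absinthe.Schema

  # Serve the compiled schema out of :persistent_term.
  @schema_provider Absinthe.Schema.PersistentTerm

  query do
    field :health, :string do
      resolve(fn _args, _resolution -> {:ok, "ok"} end)
    end
  end
end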

crayment commented 1 month ago

Thanks for the response @benwilson512.

Would you mind expanding a bit on how you expect using persistent term to reduce memory usage? Reading the docs, they seem to talk about wins at compile time. Just trying to wrap my head around the trade-offs here a bit. If you have any recommended resources, I'd really appreciate them!

benwilson512 commented 1 month ago

Would you mind expanding a bit on how you expect using persistent term to reduce memory usage?

Sure! To start with some stuff you probably already know: each BEAM process (hereafter, just 'process') has its own heap. If you send a value from process A to process B, it is copied into the heap of process B.

This can pose some challenges for tools like Absinthe, where you have potentially very large data structures necessary to describe complex schemas, and you need those structures available in every HTTP request so that you can verify the incoming document. With a naive schema storage mechanism like :ets or a GenServer, you'd have to copy the entire schema into each request process, consuming a lot of memory.
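For example, with :ets (a sketch; :schema_cache and big_schema_map are placeholders):

# Naive storage: put the schema in an ETS table once...
:ets.new(:schema_cache, [:named_table, :public, :set])
:ets.insert(:schema_cache, {:schema, big_schema_map})

# ...but every request process that reads it materializes a full
# copy of the schema on its own heap:
[{:schema, schema}] = :ets.lookup(:schema_cache, :schema)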

The exception to this is data that lives in what is called the "constant pool". So if you have:

defmodule Constants do
  def mapping() do
    %{hello: "world", name: "Ben"}
  end
end

And when you call Constants.mapping() somewhere in your code, instead of copying that map value you basically get a pointer to where it lives in the constant pool, so it consumes essentially zero memory in your processes.

Naturally then, the "old school" way of solving this problem was to transform an Absinthe schema into gigantic constants via macros, like you see above. The problem is that this has a lot of practical limitations. It is incredibly easy to have some sort of dynamic function call, an anonymous function, or any number of other things that prevent the structure from actually living in the constant pool and instead require copying. For example:

defmodule Constants do
  def slightly_dynamic_mapping() do
    # Not a compile-time constant: the contents depend on a function call.
    %{hello: "world", name: "Ben", middleware: middleware()}
  end

  # Returns a constant, but the compiler can't know that.
  defp middleware(), do: [:authorize]
end

This is no longer a constant because its contents depend on the result of a function call. It doesn't matter that middleware() itself only returns a constant value here: the compiler is working with basically the AST of the map, not its runtime values, and there's no way to tell from the AST that middleware() also returns a constant at runtime.
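You can see the difference from IEx (a sketch relying on the undocumented :erts_debug module, and assuming mapping/0 and slightly_dynamic_mapping/0 are compiled into the same module; :erts_debug.same/2 compares terms by pointer identity):

# The literal lives in the constant pool, so every call returns the
# same pointer:
:erts_debug.same(Constants.mapping(), Constants.mapping())
#=> true

# The dynamic map is rebuilt on every call, allocating a fresh copy
# on the caller's heap each time:
:erts_debug.same(
  Constants.slightly_dynamic_mapping(),
  Constants.slightly_dynamic_mapping()
)
#=> false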

The magic of :persistent_term, however, is that you can take result = Constants.slightly_dynamic_mapping(), call :persistent_term.put(:some_key, result), and now that value is put INTO the constant pool. When you do :persistent_term.get(:some_key), you get that nice pointer into the pool instead of having to copy the map.
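In code (a minimal sketch; :my_key is arbitrary):

# Build the term once, e.g. at application start, and publish it:
result = Constants.slightly_dynamic_mapping()
:persistent_term.put(:my_key, result)

# Later, in any process: no copy onto the caller's heap, just a
# reference to the shared term.
schema = :persistent_term.get(:my_key)

The trade-off, per the OTP docs, is that :persistent_term is meant for terms that are rarely written: updating or erasing a key forces a scan of every process heap, so it fits "build once at boot" data like a schema.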

So to put it all together, the :persistent_term backend in Absinthe saves memory because it all but guarantees that all of the schema parts you have to fetch to handle a request come from the constant pool, whereas the old school macro constants approach tends to result in more and more unavoidable copying as schemas grow and become complex.