joshuataylor opened this issue 3 years ago
Thank you for the report.
I believe I've confirmed what you said:

```
iex(15)> IEx.pry
eheap_alloc: Cannot allocate 47021480096 bytes of memory (of type "heap_frag").
```
Do you believe that this is a flaw in Meeseeks, a flaw in `pry`, or an incompatibility between the two?
To confirm, you aren't having any problem parsing that file with Meeseeks, right? You can parse it ok and select all the pages; it's just when you try to `pry` that there's an issue? I see you mention a problem with Flow, too. Can you describe what that problem is?
To share what I saw (the same steps are sketched as an iex session after this list):

1. Ran `:erlang.memory()` and saw a total usage of 34485976 (bytes)
2. Read the file with `f = File.read!(...)`
3. Ran `:erlang.memory()` and saw a total usage of 57654072
4. Parsed the file into a `Meeseeks.Document` with `d = Meeseeks.parse(f, :xml)`
5. Ran `:erlang.memory()` and saw a total usage of 107529424
6. Ran `r = Meeseeks.all(d, xpath("//page"))` (found 2235 matches)
7. Ran `:erlang.memory()` and saw a total usage of 117366464
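Roughly, as an iex session (a sketch; `"pages.xml"` is a placeholder, since the real path isn't in the original):

```elixir
# Sketch of the steps above; "pages.xml" stands in for the large XML file.
import Meeseeks.XPath

:erlang.memory()                      # baseline
f = File.read!("pages.xml")           # read the large XML file
:erlang.memory()
d = Meeseeks.parse(f, :xml)           # parse into a Meeseeks.Document
:erlang.memory()
r = Meeseeks.all(d, xpath("//page"))  # 2235 Meeseeks.Result structs
:erlang.memory()
```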
So up to this point, parsing the XML document produces a `Meeseeks.Document` that appears to use about 2.5x the space of the raw string, plus another 10MB or so to get the matches. This is a lot of memory, but not unexpected (see #31), since the document has to capture the whole XML string plus a bunch of maps, parent information, etc.
Then I call `IEx.pry`, and the CPU spins up to max and the BEAM eventually shuts down, since I don't have 48GB of memory on hand for it. I am not sure what `pry` is doing, but it seems wild to take 117MB of total usage and turn it into 48GB.
Further exploration suggests that if I call `IEx.pry` after I parse the document (step 4 above) it works fine, taking the total memory usage from 107190608 to 156387424 (adding almost 50% more memory usage).
`Meeseeks.Result`s (returned by `Meeseeks.all`) each contain the `Meeseeks.Document` for which the match is true, but since they all point to the same immutable value (the original parsed document), I don't think any additional memory is used. Perhaps `pry` is doing something that causes the `Meeseeks.Document` in each match (all 2235 of them) to take up its own memory? Perhaps it's doing something else for each of those matches that causes them to take up significantly more memory? I'm not sure.
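Within a single process, the sharing can be checked directly (a quick sketch; `r` is the match list from step 6):

```elixir
# Each Meeseeks.Result embeds the document; within one process these are
# references to the same immutable heap value. :erts_debug.same/2 checks
# identity, not just structural equality.
[r1, r2 | _] = r
true = :erts_debug.same(r1.document, r2.document)
```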
Maybe @josevalim has some idea of what `pry` is doing here?
Pry is most likely starting a separate process, and that is causing excessive copying? Does the issue happen if you spawn a task and then access the Meeseeks document inside? Flow would have the same issue with copying, too.
Yep, that's it. Running `Task.async(fn -> Enum.count(r) end)` after step 7 above causes the same "cannot allocate" crash. It looks like when the document in each result is copied, each copy takes up its own memory (or something similar).
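You can see the blowup without crashing anything by comparing a term's size with sharing against its fully-copied size (a sketch, again using `r` from above):

```elixir
# :erts_debug.size/1 counts heap words with shared subterms counted once,
# while :erts_debug.flat_size/1 counts the words the term would occupy
# after a full copy, which is what happens when it is sent to another
# process.
shared = :erts_debug.size(r)
flat = :erts_debug.flat_size(r)
IO.puts("with sharing: #{shared} words, flattened: #{flat} words")
```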
Yes, you are correct. Here is a simple test I wrote:

```elixir
IO.inspect(:erlang.memory())
data = List.duplicate('a', 1_000_000)
IO.inspect(:erlang.memory())
list = List.duplicate(data, 100)

Task.await(
  Task.async(fn ->
    List.flatten(list)
    IO.inspect(:erlang.memory())
    :ok
  end)
)
```
You could ask the Erlang team why each reference is copied. Perhaps it is expensive to track references like that, but it is certainly the root cause.
You can fix this by changing Meeseeks' internal structure. It seems you are referencing the same document over and over again, and some indirection would address this. For example, you could refer to a number and have a map from `number => document`.
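A minimal sketch of that indirection (the table name and result shape here are hypothetical, not Meeseeks' actual internals):

```elixir
# Store the parsed document `d` once in a shared ETS table, and have each
# result carry a small integer handle instead of the document itself.
# Sending results between processes then copies handles, not documents.
table = :ets.new(:docs, [:set, :public, read_concurrency: true])
doc_id = :erlang.unique_integer([:positive])
:ets.insert(table, {doc_id, d})

# A result referring to the document by handle:
result = %{doc_id: doc_id, node_id: 1}

# Resolve the handle only when the document is actually needed; this
# copies the document into the caller once, not once per result.
[{^doc_id, doc}] = :ets.lookup(table, result.doc_id)
```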
Alternatively, you can work with binaries; large binaries will be referenced instead of copied. If you replace `data = List.duplicate('a', 1_000_000)` in my example with `data = String.duplicate("a", 1_000_000)`, then it is super fast.
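For reference, the binary variant of the same test (only `data` changes):

```elixir
# Binaries larger than 64 bytes live off-heap and are reference-counted,
# so only a small reference is copied across the process boundary.
IO.inspect(:erlang.memory())
data = String.duplicate("a", 1_000_000)
IO.inspect(:erlang.memory())
list = List.duplicate(data, 100)

Task.await(
  Task.async(fn ->
    Enum.each(list, &byte_size/1)
    IO.inspect(:erlang.memory())
    :ok
  end)
)
```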
Thank you! I was just in the process of writing a similar `List.duplicate` test to try to understand what was going on.
That's actually a bit surprising. I incorrectly assumed that if it were going to a single process, a value would just be copied once and then referenced, maintaining a memory usage pattern similar to the original process.
I definitely want to figure out how to address the problem, but I don't know how I can do that via a binary, since parsing the HTML/XML in the first place is the most expensive part, and I'm not sure how I could represent the document as a binary without an expensive dehydrate/rehydrate cycle. This will definitely require some thought.
I have always been resistant to returning the document from Rustler as a NIF resource instead of as Elixir data, since I appreciated the freedom of being able to see and manipulate the data directly from Elixir, but moving to a NIF resource might be one potential solution, because I believe resources are treated similarly to large binaries and not copied.
Wow! Thanks so much for the feedback and rapid investigation!
Given this is an Erlang issue, should this be documented as something that might happen if you pass matches to another process, until a fix can be figured out? I'm not sure what the best solution would be; in the meantime, what I'm doing is moving the XML parsing into its own step and then handing that output to Flow (as you should be doing :-)).
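Roughly what I'm doing now (a sketch; the path and pipeline steps are illustrative, not from my actual code):

```elixir
import Meeseeks.XPath

# Parse and select in a single process, then extract plain binaries from
# the matches so the Meeseeks.Document never crosses a process boundary.
pages =
  "pages.xml"
  |> File.read!()
  |> Meeseeks.parse(:xml)
  |> Meeseeks.all(xpath("//page"))
  |> Enum.map(&Meeseeks.text/1)

# Flow only ever copies small binaries between its worker processes.
pages
|> Flow.from_enumerable()
|> Flow.map(&String.upcase/1)
|> Enum.to_list()
```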
I definitely need to document the issue and provide warnings in the appropriate places.
It's a pretty bad problem, but I'm not entirely surprised it took so long to surface, since you need to be parsing a pretty big document, running a selection that returns a lot of matches, and then moving those matches to another process rather than extracting data from them and moving the results of the extractions to other processes.
The BEAM's behavior here surprises me, but I'm not sure I'd call it an Erlang issue. I'm not sure why they designed it that way, but there is probably a reason. Really, it's an issue with my design that a certain arrangement of the basic actions one might take can balloon memory so dramatically, even if that arrangement is fairly uncommon, and so it's on me to fix the design. Somehow.
Elixir: 1.12.1
Erlang: 24.0.2

When using Flow or IEx.pry, Elixir will eat all my memory and crash. `Stream.map` and `Enum.map` work as normal.

Repo example:
Now edit the file `lib/meeseeks_xml.ex` to use the path to your XML file. Then run:
:skull: The next step will cause Elixir to eat all your memory and might cause your computer to crash, so be careful! :skull:
Type the following into iex: