It would be nice to get that yaml file (if it is possible) to see where it
leaks memory :)
Original comment by alexande...@gmail.com
on 13 Jan 2011 at 8:56
It would greatly simplify fixing the problem if you could provide a valid JUnit
test in your clone.
The YAML you have provided has a lot of tags which are not very simple to
re-create.
Original comment by py4fun@gmail.com
on 13 Jan 2011 at 12:50
I don't really have a memory leak: my JVM is set with -Xmx80m, and when I
deserialize my YAML file the 80MB limit is exceeded (OutOfMemoryError).
It's difficult for me to create a JUnit test, for various reasons.
I'd like to know whether the change I made in StreamReader.java gives better
memory performance and has no side effects.
Thanks in advance
P.S.: I'm reorganizing my code and will provide a JUnit test as soon as
possible.
Original comment by antonio....@gmail.com
on 14 Jan 2011 at 9:26
I have managed to create a test with a big input file (1.5MB) which fails to
load due to an OutOfMemoryError. But when I apply your changes in
StreamReader, the situation does not change.
I think we need to see a real test case where SnakeYAML fails to load a
document but succeeds with your patch.
Otherwise the only solution is to increase the memory for the JVM.
Original comment by py4fun@gmail.com
on 14 Jan 2011 at 10:02
While I agree that there are areas to improve on memory consumption, I am not
so sure about your results (I might be wrong, too).
Did you consider that the objects you are loading ARE BIG? You are running the
last GC with your object still loaded; why don't you add one more
measurement? Something like:
result = null;
System.gc();
.... memory results ...
The figures may be different.
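Spelled out, that extra measurement might look like the following sketch
(plain Java; "result" holds the loaded document, and Runtime-based accounting
is just one way to get the numbers):

    // Abandon the loaded document, collect, and measure again.
    Runtime rt = Runtime.getRuntime();
    result = null;
    System.gc();
    long usedAfterRelease = rt.totalMemory() - rt.freeMemory();
    System.out.println("used after release: " + usedAfterRelease + " bytes");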
Original comment by alexande...@gmail.com
on 20 Jan 2011 at 9:17
The size of the loaded object is considered: it is 4.69MB in size. That amount
is reported by the "total" column. What concerns me most is the last column,
the "recovered" amount: this is the amount that SnakeYaml allocated during
parsing. Understandably, this will be a non-trivial amount, but 100+MB for less
than 1/2 MB of input seems excessive.
I'll look into running a profiler on SnakeYaml today.
Original comment by JordanAn...@gmail.com
on 20 Jan 2011 at 2:49
Addendum: it should also be noted that the check you propose (measuring after
abandoning the loaded object) is effectively performed on the next iteration,
since that forms the "initial" value. The test shows a very stable initial
value, so it's not that SnakeYaml is somehow leaking memory; it's just that
memory consumption during parsing is very high.
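A hypothetical reconstruction of the measurement loop being described (the
actual MemoryStressTest source is not shown in this thread; the names, the
input variable, and the loop shape are illustrative):

    // Per iteration: "initial" is the used heap before loading, "total"
    // approximates the retained size of the loaded document, and
    // "recovered" approximates the garbage created while parsing
    // (underestimated whenever GC runs during loading).
    Runtime rt = Runtime.getRuntime();
    org.yaml.snakeyaml.Yaml yaml = new org.yaml.snakeyaml.Yaml();
    Object document = null;
    for (int i = 0; i < 5; i++) {
        document = null;                 // the abandoned document is
        System.gc();                     // collected on the next iteration
        long initial = rt.totalMemory() - rt.freeMemory();
        document = yaml.load(input);
        long afterLoad = rt.totalMemory() - rt.freeMemory();
        System.gc();                     // document is still referenced here
        long afterGc = rt.totalMemory() - rt.freeMemory();
        System.out.println("initial=" + initial
                + " total=" + (afterGc - initial)
                + " recovered=" + (afterLoad - afterGc));
    }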
Original comment by JordanAn...@gmail.com
on 20 Jan 2011 at 5:15
Profiling would be nice. Good that you have time for it ;)
But it's strange anyway: the recovered figure never gets that high for me.
Eclipse 3.6 (and terminal), OS X, JDK 1.6.0_22, latest sources from master:
1. just over 13,300,000 under -Xms32m -Xmx32m
2. just over 28,000,000 under -Xms512m -Xmx512m
Original comment by alexande...@gmail.com
on 20 Jan 2011 at 5:17
I'm on a different machine than the original one now, and I'm getting vastly
different results with my test code (above). Specifically, I now see memory
consumption never exceeding about 12MB; although a little high, that's
considerably less worrisome.
Profiling with Profiler4j shows memory oscillating up and down between about
2MB and 12MB, which isn't unexpected. It's possible that something was
terribly wrong with the environment I originally posted from.
Original comment by JordanAn...@gmail.com
on 20 Jan 2011 at 6:17
Use 1.8-SNAPSHOT for now. By default it does not store the context in Mark,
so in case of an error you will get only the line number and position; the
content of the offending line will not be printed. It consumes less memory
and works faster.
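As a small illustration of what error reporting looks like in that case (a
sketch assuming SnakeYAML's MarkedYAMLException and Mark API; the malformed
input is only an example):

    // Parse errors still carry positional information via the Mark,
    // even when the line content itself is not kept.
    try {
        new org.yaml.snakeyaml.Yaml().load("key: [unclosed");
    } catch (org.yaml.snakeyaml.error.MarkedYAMLException e) {
        System.out.println("problem: " + e.getProblem());
        System.out.println("line: " + e.getProblemMark().getLine()
                + ", column: " + e.getProblemMark().getColumn());
    }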
As for optimizing the library, I hope we will have time to do it. Do not
hesitate to do it yourself ;) Just do not forget to contribute it back so
that everybody benefits :)
Original comment by alexande...@gmail.com
on 21 Jan 2011 at 9:16
Re comment #11: the profiler actually shows figures close to 100MB for the
total GC'd size during YAML loading. MemoryStressTest shows ~12MB because
some GC happens during loading. If we give 1024m to the JVM running
MemoryStressTest, the numbers may be ~100MB.
Original comment by alexande...@gmail.com
on 22 Jan 2011 at 12:11
Hi, with 1.8-SNAPSHOT I still have the out-of-memory problem.
I modified StreamReader.java from 1.8-SNAPSHOT and it works fine; now I have
memory consumption similar to XStream deserialization.
Even though you have seen that the change does not solve the problem on your
side, I have had no "out of memory" problems with it so far.
Original comment by antonio....@gmail.com
on 27 Jan 2011 at 11:53
Attachments:
Have you tried the latest version? StreamReader has been modified there to
use less memory, I believe, and it works a bit faster. I think the snapshot
has already been uploaded, but to be sure just pull the latest master and try
it.
I do not know what you mean by "memory consumption similar to XStream
deserialization", and I do not have any figures for XStream, but I think
StreamReader is no longer the main memory consumer :)
Original comment by alexande...@gmail.com
on 27 Jan 2011 at 12:09
OK, I'll download 1.8-SNAPSHOT again with Maven (after first deleting the
local Maven repository entries for org.snakeyaml) and I'll tell you how it
goes.
Thanks
Original comment by antonio....@gmail.com
on 27 Jan 2011 at 12:27
Regarding comment #15:
I did not quite catch you. When I try to use the attached file, it does not
even compile with the latest source. How did you manage to build the JAR you
tested? How do you provide the input: as a String or as an InputStream?
The files you have given here do not help because:
1) there is no code showing how you call SnakeYAML
2) your YAML contains a lot of global tags which prevent us from loading the
document
Why do you not create a test case? Either use a remote clone or attach a
patch here.
Without your commitment we can hardly do anything about the issue.
Looking into the file, I do not understand why it should consume fewer
resources. Can you please try to explain what you achieve with the changes?
What is better? Why do you think the memory consumption is improved?
Original comment by py4fun@gmail.com
on 27 Jan 2011 at 2:38
Sorry, I've just pulled and I see that you changed the code; now it also
works for a big file (2MB).
Thanks
Original comment by antonio....@gmail.com
on 27 Jan 2011 at 3:15
I still did not get how it can work when you build it from source but not
when you use the latest SNAPSHOT. The latest SNAPSHOT (1.8-SNAPSHOT) always
corresponds to the source code in the master Mercurial repository.
May we close the issue?
(Please do not forget to remove the remote repository if you do not need it
anymore:
your clone -> administer -> advanced -> delete repository)
Original comment by py4fun@gmail.com
on 27 Jan 2011 at 5:03
maybe some proxy caching thing... who knows...
Original comment by alexande...@gmail.com
on 27 Jan 2011 at 6:40
Due to the changes made for issues 79 and 101, SnakeYAML consumes fewer
resources. This will be delivered in version 1.8.
Original comment by py4fun@gmail.com
on 31 Jan 2011 at 9:36
long attachments were deleted
Original comment by aso...@gmail.com
on 1 Mar 2011 at 11:48
Original issue reported on code.google.com by
antonio....@gmail.com
on 13 Jan 2011 at 8:32
Attachments: