Closed GoogleCodeExporter closed 9 years ago
When you say it's using a lot of memory, do you mean virtual memory or physical
memory?
Has there been any actual problem due to the memory growing? Or just something
you're concerned about?
The best way to figure out what's going on is to put in a call to
MallocExtension::instance()->GetStats(), and print out the resulting buffer.
This will tell us where tcmalloc thinks the memory is.
Original comment by csilv...@gmail.com
on 11 Oct 2010 at 8:21
Let me explain a little more.
We have multiple instances of the same process running on the Linux server. Our
server has 32 GB RAM.
Each process in turn reads large files that are gigabytes in size. When we read
these large files, we cannot avoid the huge memory allocation. So the process
ends up allocating around 1 GB of physical memory and 2 GB of virtual memory (a
typical example).
Once we are done with the necessary operation, this process doesn't really have
to occupy this much memory.
We need this because we have other processes that need the physical memory /
virtual memory. Not all processes are active at the same time, but we expect
the process to give up memory once it is done so that the memory can be used by
other processes.
We typically have between 6 and 8 processes, and each can consume an average of
3 GB per read loop.
What we have observed is that due to GPT, we are not able to give up memory and
the process starts using more memory. So 3 GB becomes 6 GB and higher, and
thereby the other processes just eat up the swap and sometimes hang our systems.
I will run GetStats() and get an output for further discussion.
Sundari.
Original comment by sunda...@gmail.com
on 12 Oct 2010 at 2:31
Most of the bytes are in unmapped page heap after ReleaseFreeMemory call. Given
below is the output of GetStats() before and after ReleaseFreeMemory call.
Initial Heap Size : 47448064
Stats : ------------------------------------------------
MALLOC: 47448064 ( 45.2 MB) Heap size
MALLOC: 2246192 ( 2.1 MB) Bytes in use by application
MALLOC: 32825344 ( 31.3 MB) Bytes free in page heap
MALLOC: 10260480 ( 9.8 MB) Bytes unmapped in page heap
MALLOC: 939456 ( 0.9 MB) Bytes free in central cache
MALLOC: 0 ( 0.0 MB) Bytes free in transfer cache
MALLOC: 1176592 ( 1.1 MB) Bytes free in thread caches
MALLOC: 998 Spans in use
MALLOC: 1 Thread heaps in use
MALLOC: 5373952 ( 5.1 MB) Metadata allocated
MALLOC: 4096 Tcmalloc page size
------------------------------------------------
Stats After ReleaseFreeMemory: ------------------------------------------------
MALLOC: 47448064 ( 45.2 MB) Heap size
MALLOC: 2246192 ( 2.1 MB) Bytes in use by application
MALLOC: 0 ( 0.0 MB) Bytes free in page heap
MALLOC: 43085824 ( 41.1 MB) Bytes unmapped in page heap
MALLOC: 939456 ( 0.9 MB) Bytes free in central cache
MALLOC: 0 ( 0.0 MB) Bytes free in transfer cache
MALLOC: 1176592 ( 1.1 MB) Bytes free in thread caches
MALLOC: 997 Spans in use
MALLOC: 1 Thread heaps in use
MALLOC: 5373952 ( 5.1 MB) Metadata allocated
MALLOC: 4096 Tcmalloc page size
------------------------------------------------
Heap Size After ReleaseFreeMemory: 47448064
Original comment by sunda...@gmail.com
on 12 Oct 2010 at 2:54
Using lots of virtual memory shouldn't be a problem: I think you have 48 bits
of virtual memory on an x86_64 system? Or maybe 47? That's plenty.
Physical memory could be an issue. However, GetStats is showing that the heap
size is the same before and after ReleaseFreeMemory. So it looks like it's
doing what it ought to. tcmalloc thinks the app has less than 100 M of memory
mapped. Is top (or ps, or whatever you're using) showing more?
One issue could be overhead due to sampling. In tcmalloc 1.6, I've changed the
default to not sample at all. Do you want to try upgrading to
tcmalloc 1.6, and see if this fixes the problems you're seeing? You could also
just try running with the environment variable TCMALLOC_SAMPLE_PARAMETER=0.
Original comment by csilv...@gmail.com
on 12 Oct 2010 at 10:12
Thanks for the analysis. I am using version 1.6. Whatever I shared is from 1.6
and my problem is I want to release the heap because I don't need it in the
process!
Sundari.
Original comment by sunda...@gmail.com
on 13 Oct 2010 at 4:54
Unmapped bytes *are* released. We'll clean up this wording in the next release
to make it clearer -- I admit it's really confusing right now. These are bytes
that have been released to the OS (via an madvise call). The stats you're
showing me indicate everything is working like it should.
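To make that concrete, here is a minimal sketch that reads the "MALLOC:" lines of a GetStats() dump and subtracts the unmapped bytes from the heap size. StatBytes and MappedHeapBytes are hypothetical helper names, not part of tcmalloc, and the line layout is assumed to match the dumps quoted in this thread; other versions may print something different.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper, not a tcmalloc API. Returns the byte count from the
// first "MALLOC:" line whose text contains `label`, or -1 if there is no such
// line. The layout is assumed from the GetStats() dumps in this thread.
std::int64_t StatBytes(const std::string& stats, const std::string& label) {
  std::istringstream in(stats);
  std::string line;
  while (std::getline(in, line)) {
    if (line.rfind("MALLOC:", 0) != 0) continue;           // not a stats line
    if (line.find(label) == std::string::npos) continue;   // different stat
    std::istringstream fields(line.substr(7));             // skip "MALLOC:"
    std::int64_t value = -1;
    fields >> value;  // first field after the prefix is the byte count
    return value;
  }
  return -1;
}

// Heap bytes that are still mapped: heap size minus the bytes already
// released to the OS ("unmapped in page heap").
std::int64_t MappedHeapBytes(const std::string& stats) {
  return StatBytes(stats, "Heap size") -
         StatBytes(stats, "Bytes unmapped in page heap");
}
```

With the numbers from the dump taken after ReleaseFreeMemory (heap size 47448064, unmapped 43085824), MappedHeapBytes gives 4362240 bytes, which is the sum of the bytes in use by the application, the central cache, and the thread caches.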
Just to be clear, when you said you were using tcmalloc 1.5 at the top of this
bug report, that was a typo? You're actually using 1.6?
} What we have observed is that due to GPT, we are not able to give up memory
} and the process starts using more memory.. So 3GB becomes 6 and higher and
} thereby the other processes just eat up the swap and hang our systems
} sometimes.
Are you certain that's what's happening: processes are swapping because of the
memory demands of the binaries? Or is that just a hypothesis you have right
now? I want to be clear, because from what I'm seeing, that shouldn't be
happening.
Original comment by csilv...@gmail.com
on 13 Oct 2010 at 5:27
One possibility is the madvise() is failing for you, so the bytes aren't
actually being returned to the system properly. You can test this by looking
at src/system_alloc.cc, at the madvise call. Right now we ignore return
values, but you can look if it's -1 (and not EAGAIN), and maybe print something
out then. If you see that printout when you're running, then that's an
interesting tidbit.
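A minimal standalone sketch of that kind of check, for Linux. This is not the code in src/system_alloc.cc: ReleaseAndReport is a hypothetical helper, and the use of MADV_DONTNEED is an assumption about which advice value is relevant here.

```cpp
#include <sys/mman.h>

#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper, not tcmalloc code. Releases [start, start + length)
// with madvise(MADV_DONTNEED), retries while the call fails with EAGAIN, and
// prints any other failure to stderr. Returns 0 on success or the errno
// value of the failure.
int ReleaseAndReport(void* start, std::size_t length) {
  int result;
  do {
    result = madvise(start, length, MADV_DONTNEED);
  } while (result == -1 && errno == EAGAIN);
  if (result == -1) {
    const int saved_errno = errno;  // fprintf may overwrite errno
    std::fprintf(stderr, "madvise(%p, %zu) failed: %s\n", start, length,
                 std::strerror(saved_errno));
    return saved_errno;
  }
  return 0;
}
```

If the message never appears while the program runs, the release calls are succeeding and the cause is probably elsewhere.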
Original comment by csilv...@gmail.com
on 13 Oct 2010 at 5:48
Answering the thread for both your responses.
Our software is using GPT 1.5 but I happened to create a utility to simulate
this issue. The utility uses 1.6. I forgot to mention that in my posts. Sorry
about it.
Today we did some more characterization using the 1.6 version and the utility.
We verified that madvise() call is indeed returning 0. That shows there are no
failures.
We also profiled a real large data allocation and free.
We were always looking at the HEAP SIZE value of GPT and did not pay attention
to top output. Today we tried to correlate both.
Here is what we observed. HEAP SIZE we see in GPT is almost near VIRT value
reported in top for the process.
Our problem is we see a LARGE value of VIRTUAL memory the process is holding on
to after the release call. Is there a way to free that?
Given below is the data :
Memory allocated in the process :
Top output :
-------------------------------
Virt : 1240m
RES : 1.1g
GPT output :
--------------------------------
Heap Size : 1267204096
Stats : ------------------------------------------------
MALLOC: 1267204096 ( 1208.5 MB) Heap size
MALLOC: 2263680 ( 2.2 MB) Bytes in use by application
MALLOC: 1260503040 ( 1202.1 MB) Bytes free in page heap
MALLOC: 0 ( 0.0 MB) Bytes unmapped in page heap
MALLOC: 1777056 ( 1.7 MB) Bytes free in central cache
MALLOC: 62464 ( 0.1 MB) Bytes free in transfer cache
MALLOC: 2597856 ( 2.5 MB) Bytes free in thread caches
MALLOC: 2205 Spans in use
MALLOC: 1 Thread heaps in use
MALLOC: 11141120 ( 10.6 MB) Metadata allocated
MALLOC: 4096 Tcmalloc page size
------------------------------------------------
AFTER CALLING RELEASEFREEMEMORY CALL
TOP OUTPUT
------------------------------------------------
RES : 19m
Virt : 1240m
MALLOC: 1267204096 ( 1208.5 MB) Heap size
MALLOC: 2263680 ( 2.2 MB) Bytes in use by application
MALLOC: 0 ( 0.0 MB) Bytes free in page heap
MALLOC: 1260503040 ( 1202.1 MB) Bytes unmapped in page heap
MALLOC: 1777056 ( 1.7 MB) Bytes free in central cache
MALLOC: 62464 ( 0.1 MB) Bytes free in transfer cache
MALLOC: 2597856 ( 2.5 MB) Bytes free in thread caches
MALLOC: 2205 Spans in use
MALLOC: 1 Thread heaps in use
MALLOC: 11141120 ( 10.6 MB) Metadata allocated
MALLOC: 4096 Tcmalloc page size
We see that RES memory did come down after release call. We also want to
release the 1 GB VIRT memory that is currently being used up by the process.
Regards
Sundari.
Original comment by sunda...@gmail.com
on 13 Oct 2010 at 9:05
OK, sounds like things are working as they ought. Even when we release the
memory back to the system, it stays in our virtual address space for accounting
purposes by the kernel. However, no physical memory is used, and it should not
cause any problems.
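The same effect can be reproduced outside tcmalloc. Below is a rough Linux-only sketch, assuming /proc/self/statm is available and that MADV_DONTNEED is the advice in play. It maps a region, touches it, releases it with madvise, and records virtual and resident sizes before and after. MemUsage, Snapshot, ReadMemUsage and TouchThenRelease are made-up names for illustration.

```cpp
#include <sys/mman.h>
#include <unistd.h>

#include <cstddef>
#include <cstring>
#include <fstream>

struct MemUsage {
  long virt_bytes;  // total virtual size (VIRT in top)
  long res_bytes;   // resident set size (RES in top)
};

// Reads the first two fields of /proc/self/statm, which are counted in
// pages, and converts them to bytes.
MemUsage ReadMemUsage() {
  long size_pages = 0, resident_pages = 0;
  std::ifstream statm("/proc/self/statm");
  statm >> size_pages >> resident_pages;
  const long page = sysconf(_SC_PAGESIZE);
  return {size_pages * page, resident_pages * page};
}

struct Snapshot {
  MemUsage touched;   // after writing to every page
  MemUsage released;  // after madvise, with the region still mapped
};

// Maps `length` bytes, writes to all of them, then releases the pages with
// madvise(MADV_DONTNEED) without unmapping the address range.
Snapshot TouchThenRelease(std::size_t length) {
  Snapshot s{};
  void* p = mmap(nullptr, length, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED) return s;
  std::memset(p, 1, length);          // force physical pages to be allocated
  s.touched = ReadMemUsage();
  madvise(p, length, MADV_DONTNEED);  // return value ignored in this sketch
  s.released = ReadMemUsage();
  munmap(p, length);
  return s;
}
```

Calling TouchThenRelease(64u << 20) should show the resident size dropping by roughly 64 MB while the virtual size stays about the same, which is the pattern in the top output above (RES falls from 1.1g to 19m, VIRT stays at 1240m).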
Are you actually seeing problems in practice (with tcmalloc 1.6)? Or are you
just seeing these big numbers and being concerned? If you are seeing problems,
what problems are you seeing, precisely?
Original comment by csilv...@gmail.com
on 13 Oct 2010 at 7:20
Thanks Silver. All problems we saw were with GPT 1.4. We have not upgraded to
GPT 1.6 as yet.
We started this exercise because one of our Linux servers, which ran 8
processes, hung after running out of swap space.
Our application uses GPT 1.4, and we don't call ReleaseFreeMemory(). We
started this exercise to see if there are ways to reduce the memory footprint
per process.
Initially we were not even sure where the problem was (whether there were
memory leaks).
As these are production systems, a GPT upgrade might not be possible
immediately. We want to keep the version at 1.4 if possible for this product
family.
I ran the same process with GPT 1.4. I don't see the mapped and unmapped bytes
after free; GetStats() in GPT 1.4 just reports free bytes in the heap. But the
memory definitely goes down, similar to GPT 1.6.
I will introduce the ReleaseFreeMemory() call in our application so that we
give back memory to OS.
One question I still have is about the bytes remaining in VIRTUAL ADDRESS SPACE. Our
servers have 6 GB of SWAP allocated. They have 32 GB RAM.
If we run 8 processes and each process reserves 1 GB of SWAP space, will we run
out of swap? I would like to understand the implications of this scenario.
ReleaseFreeMemory() seems to solve the issue with respect to physical memory.
Thanks a ton for your immediate response and support! Greatly appreciated!
Regards
Sundari
Original comment by sunda...@gmail.com
on 14 Oct 2010 at 8:37
Just to be clear, virtual memory is not the same as swap. Assuming you're on a
64-bit machine, you have (I think) 64000 gigabytes of virtual memory, so you're
not likely to be running out of it.
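The exact figure depends on the CPU and kernel configuration. As a rough check on the scale, here is a small sketch that converts an address width in bits to gigabytes. The 47-bit and 48-bit widths are assumptions for a typical x86_64 Linux system, and AddressSpaceGiB is a made-up helper name.

```cpp
#include <cstdint>

// Size, in GiB, of an address space that is `bits` wide (bits must be < 64).
// 1 GiB is 2^30 bytes, so this is 2^bits / 2^30.
constexpr std::uint64_t AddressSpaceGiB(unsigned bits) {
  return (std::uint64_t{1} << bits) >> 30;
}

// Assumed widths, not values read from the running system.
constexpr std::uint64_t kGiB47 = AddressSpaceGiB(47);  // 131072 GiB
constexpr std::uint64_t kGiB48 = AddressSpaceGiB(48);  // 262144 GiB
```

Under either assumption, 1 GB of unmapped heap per process is a tiny fraction of the address space, and each process has its own address space, so 8 processes do not add up against a shared limit.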
The tcmalloc stats report virtual memory use (which is what userspace typically
gets to see). The stuff in 'unmapped in page heap' is definitely *not* taking
physical memory. If you're seeing lots of physical memory being used, it must
be from the other numbers.
} I will introduce the ReleaseFreeMemory() call in our application so that we
} give back memory to OS.
That's a good idea. We should emphasize that more in the docs. I'll try to
figure out the right wording.
} As these are production systems, GPT upgrade might not be possible
} immediately. We want to keep the version to 1.4 if possible for this product
} family.
That should be fine. You can try setting the environment variable
TCMALLOC_SAMPLE_PARAMETER=0 before running your program, and see if that helps.
Original comment by csilv...@gmail.com
on 14 Oct 2010 at 9:29
Closing this bug -- I don't think tcmalloc is doing anything wrong here. The
wording of the memory-use message has been improved since perftools 1.6, to
make it clearer that virtual memory use isn't causing any problems.
I suspect the sampling is what's really causing issues here, since it doesn't
show up in the tcmalloc memory use output. Since we turn off sampling by
default in the latest perftools, that could be considered resolved now too. :-)
Original comment by csilv...@gmail.com
on 1 Sep 2011 at 1:53
Original issue reported on code.google.com by
sunda...@gmail.com
on 11 Oct 2010 at 10:08