DAddYE / vips

Go Bindings for Vips (a super fast image processor)

Memory problems in web application #18

Closed · JorgeRdg closed this issue 9 years ago

JorgeRdg commented 9 years ago

I'm creating a very simple web server in Go responsible for resizing images. However, I found that after running for some time (or under heavy load), the server ran out of memory. After some testing, I noticed something curious that only happened when a resize was performed (some requests do not trigger the resize process).

After each resize, the memory usage of the Go application increased significantly (I don't know if that's normal, but it's definitely reasonable considering that I'm working with in-memory images). The problem is that this memory was never released, so after some time it finally ate all the system memory until crashing. I suspect this is a bug because switching to gographics/imagick, although slower and more cumbersome, eliminated these memory problems.

This is the part of my code where the vips library is used:

func transform_image(image []byte, dimension int) ([]byte, error) {
    options := vips.Options{
        Width:        dimension,
        Height:       dimension,
        Interpolator: vips.BILINEAR,
        Quality:      95,
    }

    return vips.Resize(image, options)
}

With the result (after checking for errors), I perform a writer.Write(image).
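
For context, the surrounding handler is roughly shaped like this (the route, port, and hard-coded dimension below are illustrative, not my exact code):

package main

import (
    "io/ioutil"
    "log"
    "net/http"
)

// resizeHandler reads the uploaded image, resizes it via transform_image
// (shown above) and writes the result back to the client.
func resizeHandler(w http.ResponseWriter, r *http.Request) {
    original, err := ioutil.ReadAll(r.Body)
    if err != nil {
        http.Error(w, err.Error(), http.StatusBadRequest)
        return
    }

    resized, err := transform_image(original, 1500)
    if err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }

    w.Header().Set("Content-Type", "image/jpeg")
    w.Write(resized)
}

func main() {
    http.HandleFunc("/resize", resizeHandler)
    log.Fatal(http.ListenAndServe(":8080", nil))
}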

Any idea why this could happen, or possible solutions? Am I doing something wrong? Maybe the library is not suited for long-running processes such as web applications?

Thanks in advance!

DAddYE commented 9 years ago

Thanks for this! I'm going to work on it and fix it as soon as possible!

h2non commented 9 years ago

That points to a memory leak... probably some buffers are not freed in the case of an unexpected error.

BTW, I don't know whether it's necessary to listen for libvips' postclose signal in order to properly release the reference to the initial image buffer. @lovell does this in sharp.

lovell commented 9 years ago

@JorgeRdg Is this the golang heap that's constantly increasing, as shown by two (or more) heapdumps?

If not, and the heap is constant-ish but RSS is increasing, valgrind should hopefully allow you to detect possible C leaks occurring underneath the golang runtime. By way of example, leak.sh (https://github.com/lovell/sharp/blob/master/test/leak/leak.sh) is a shell script to leak-test sharp's functional test suite.

There's some talk of having to manually call runtime.FreeOSMemory at https://groups.google.com/forum/#!topic/golang-nuts/Qq1coj750ZI
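
For reference, one way to capture comparable heap snapshots from inside the process is runtime/pprof (a minimal sketch, not part of the issue code; the file names are arbitrary):

package main

import (
    "os"
    "runtime"
    "runtime/pprof"
)

// dumpHeapProfile writes a heap profile to path; two profiles taken at
// different times can then be compared with `go tool pprof`.
func dumpHeapProfile(path string) error {
    f, err := os.Create(path)
    if err != nil {
        return err
    }
    defer f.Close()
    runtime.GC() // flush recently freed objects so the profile is up to date
    return pprof.WriteHeapProfile(f)
}

func main() {
    dumpHeapProfile("heap-before.prof") // e.g. before the batch of resizes
    // ... run the workload ...
    dumpHeapProfile("heap-after.prof")
}

If the Go heap looks flat across both profiles while RSS keeps climbing, the leak is below the runtime, which is what the valgrind suggestion is aimed at.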

lovell commented 9 years ago

@h2non sharp has to do a memcpy of the input JavaScript Buffer object to avoid the situation where V8 memory compaction moves that Buffer whilst the task is queued. The postclose signal is emitted by libvips when it no longer needs the input buffer, which sharp then uses to delete its copy. The original JavaScript Buffer will be garbage collected by V8 after it falls out of scope.

I'm still learning about golang so am unsure how much of this will be necessary. http://stackoverflow.com/questions/7823725/what-kind-of-garbage-collection-does-go-use suggests that the garbage collection implementation is in a state of flux. If memory compaction is used then a similar solution to sharp may be required.

h2non commented 9 years ago

Thank you @lovell

I'm still learning Go internals too, so I'm not sure about that either. It's probably not required in Go, since the GC should do all the work for us. The one thing that requires special attention is memory allocated at the C level, and the issue is probably somewhere around there...

JorgeRdg commented 9 years ago

@lovell it's the RSS that's increasing, and yes, it really looks like C leaks.

At first I also thought the FreeOSMemory call could come in handy, but when I switched to another library (keeping the code as similar as possible) the behaviour changed completely and everything started to fit my mental model. So I still think there's something leaking memory within the package.

Of course I might be wrong as I'm fairly new to Go, but my nuts tell me so ;)

h2non commented 9 years ago

Take a look at https://golang.org/pkg/net/http/pprof. I'll run some tests soon in a very similar scenario using an HTTP service.
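
Wiring that into a service takes just an import and a listener; a minimal sketch (the port is arbitrary):

package main

import (
    "log"
    "net/http"
    _ "net/http/pprof" // registers the /debug/pprof/* handlers on the default mux
)

func main() {
    // Expose the profiling endpoints; a heap profile can then be fetched with:
    //   go tool pprof http://localhost:6060/debug/pprof/heap
    log.Fatal(http.ListenAndServe("localhost:6060", nil))
}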

lovell commented 9 years ago

Scanning through the code, I can't see where the interpolator created at https://github.com/DAddYE/vips/blob/master/vips.go#L234 is ever unreferenced, so that's a possible leak source.

h2non commented 9 years ago

That's probably the memory leak. Adding the following statement should fix it:

C.g_object_unref(C.gpointer(interpolator))
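
For illustration, here is the same pattern in a standalone cgo program (a sketch only; the actual fix belongs inside the binding's resize path in vips.go):

package main

/*
#cgo pkg-config: vips
#include <stdlib.h>
#include <vips/vips.h>
*/
import "C"
import "unsafe"

func main() {
    arg0 := C.CString("unref-sketch")
    defer C.free(unsafe.Pointer(arg0))
    C.vips_init(arg0)
    defer C.vips_shutdown()

    name := C.CString("bilinear")
    defer C.free(unsafe.Pointer(name))

    // Creating an interpolator returns a GObject with a reference held
    // by the caller...
    interpolator := C.vips_interpolate_new(name)

    // ...so without a matching unref after the resize, every request
    // leaks one VipsInterpolate instance at the C level.
    C.g_object_unref(C.gpointer(interpolator))
}
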
DAddYE commented 9 years ago

Thanks @lovell for spotting it. I missed it while merging a PR.

Here's the fix: https://github.com/DAddYE/vips/commit/8efc2f473edd209b4eb5561ce40d32cf7b86639f

@JorgeRdg can you retest it now?

JorgeRdg commented 9 years ago

I've retested it and I'm so sorry to say this is still happening :(

I'm attaching some screenshots from 'htop'. #1 is taken right after the large batch of requests (you can see memory rising to 784 MB), and in #2 the memory is still at 740 MB after about 15 minutes have passed (look at the 'uptime' timer at the top right). Please note that there have been no requests at all in those 15 minutes and the server is completely idle.

There are two things that worry me: 1) obviously, (almost) no memory is freed; 2) even if the memory were released after some time, each request makes the Go server's memory usage grow far too much (especially now that I can compare it with gographics/imagick, whose memory usage stays constantly under ~100 MB).

The combination of (1) and (2) makes me think that there's something else leaking memory here.

PS: I verified that the package was correctly updated with the latest changes.

JorgeRdg commented 9 years ago

Epic fail with the images XD Ok, here they are:

(1) http://i.imgur.com/EqzJTKw.png (2) http://i.imgur.com/SnfNcGD.png

h2non commented 9 years ago

Definitely there's a leak somewhere... Go GC cycles usually run in under 1 second.

What if you run some progressive batch tests and profile the memory life cycle during the process? That might help detect in which phase the memory starts leaking.

Go has an excellent interface for memory profiling and GC analysis. I've just written some pretty simple code that might help with that (remember to go get the dependencies). You could periodically call PrintMemoryStats(): https://gist.github.com/h2non/1bc21c5859f1191b5fa9

Additionally, you could use debug.FreeOSMemory() to force a GC cycle. This might also help show where the leak is (though memory allocated at the C level is not under the control of the GC).
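
For reference, a standard-library-only version of that kind of helper might look like this (a sketch, not the linked gist):

package main

import (
    "log"
    "runtime"
)

// printMemoryStats logs the Go-level allocation counters; memory that shows
// up in RSS but not here is being held outside the Go heap (e.g. by C code).
func printMemoryStats() {
    var m runtime.MemStats
    runtime.ReadMemStats(&m)
    log.Printf("heap=%d KB, sys=%d KB, gc-cycles=%d",
        m.HeapAlloc/1024, m.Sys/1024, m.NumGC)
}

func main() {
    printMemoryStats()
    // ... run a batch of resizes here ...
    printMemoryStats()
}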

h2non commented 9 years ago

@JorgeRdg How many requests are you making, and what image sizes are you processing, to make the memory grow like that?

JorgeRdg commented 9 years ago

I'll try debugging some time later. Thanks for the tools!

It happens with all the requests I do, but to give you an example, my last try was with a 577x2000 JPG downsized to 433x1500 (original image is 370 KB).

My [superb] method of testing is just repeatedly pressing ctrl+f5 in Chrome to my localhost ;)

lovell commented 9 years ago

@JorgeRdg Which version of libvips are you using? There are a few known leaks before 7.42.2.

JorgeRdg commented 9 years ago

vips 7.38.5! Whenever I have time I'll try updating the version ;)

h2non commented 9 years ago

@JorgeRdg @lovell I've just run some benchmark tests with memory profiling, and there's definitely a memory leak at the C level. The virtual memory is increasing exponentially. I'm using 7.42.3.

According to the Go memory stats, the memory allocated on the stack is almost always below 250 KB and the heap memory of the Go process is always under 5 MB, yet the total memory allocated by the whole process over time matches the virtual memory required by the process (>100 MB in my case).

The Go runtime considers that memory already freed, but it remains in the virtual memory of the whole process at the OS level, so the leak is outside the scope of the Go GC.

Any ideas about possible leaks?

lovell commented 9 years ago

@h2non Up to 100MB is used for libvips' operation cache.

h2non commented 9 years ago

@lovell I see, thanks. I'm actually getting around 170 MB of virtual memory usage for the whole process while running the same benchmark with larger images than before. I'll run some more exploratory tests to be sure about the leaks.

JorgeRdg commented 9 years ago

Anyway, as far as I know, the important figure here is not the virtual memory but the resident memory. The 'htop' command-line tool can help you watch it in real time.

DAddYE commented 9 years ago

Anyway, as far as I know, the important thing in here is not the virtual

Yes, Go allocates as much virtual memory as it can.

h2non commented 9 years ago

Virtual memory refers to all the memory an OS process has allocated, so it's agnostic to where the data is physically placed (cache, RAM, swap...).

My process grows to 170 MB after running a lot of tests, and 30 seconds after all benchmarks have completed the process still holds the same memory. My only concern is why. I've seen that Go programs tend to take a lot of memory for expensive tasks, but I don't think they should hold on to that memory for a long time once no more computation is required, as in this case...

h2non commented 9 years ago

@JorgeRdg You're right. According to the top man page:

VIRT  --  Virtual Image (kb)
      The  total  amount  of  virtual  memory  used  by the task.  It
      includes all code, data and shared libraries  plus  pages  that
      have been swapped out.

      VIRT = SWAP + RES.

In that case, probably only the RES memory is the meaningful figure.

DAddYE commented 9 years ago

Can we close this issue or do we still think there is a leak?

JorgeRdg commented 9 years ago

I've tried updating to the latest version (which is now vips 8.0.0) and it still has the same problems. RSS memory stays very high.

h2non commented 9 years ago

@JorgeRdg I've created imaginary for my personal needs; it's essentially a simple HTTP interface to libvips, so it's similar to your current Go server.

You could use it to run the same tests and try to reproduce the issue in a different implementation. That may help discover where the real issue is.

go get github.com/h2non/imaginary

Run (assuming $GOPATH/bin is in your $PATH)

imaginary -p 8088 

Finally, click here and use the prettiest HTML form you've ever seen to test it.

PS: benchmark tests still pending. Will do asap

DAddYE commented 9 years ago

@JorgeRdg I sent a possible fix, can you test it now?

Thanks!

h2non commented 9 years ago

@JorgeRdg You could use vegeta (https://github.com/tsenart/vegeta) to perform the benchmark and share the results with us. It's pretty simple:

echo "POST http://localhost:8080/resize?width=300" | vegeta attack \
  -duration=10s \
  -rate=30 \
  -body="image.jpg" \ | vegeta report
JorgeRdg commented 9 years ago

I've just tested it with imaginary and I can see that the memory still rises a lot (707 MB with 300 req / 10 s), and it stays that way for 10 minutes. Are you not seeing the same memory results?

Benchmark output:

Requests      [total]                  300
Duration      [total, attack, wait]    10.030735102s, 9.966635435s, 64.099667ms
Latencies     [mean, 50, 95, 99, max]  57.529297ms, 56.987136ms, 63.40792ms, 69.185314ms, 71.448661ms
Bytes In      [total, mean]            14070300, 46901.00
Bytes Out     [total, mean]            177922200, 593074.00
Success       [ratio]                  100.00%
Status Codes  [code:count]             200:300

This is using the latest update by @DAddYE.

h2non commented 9 years ago

As far as I've seen, and as @DAddYE said as well, Go doesn't hesitate to take a lot of memory in order to handle high demand, especially in HTTP services, since each request spawns a goroutine. I've also seen that behaviour in other stress tests with Go-based HTTP services.

In my tests, increasing the concurrency to 100 instead of 30 grows the total process memory to 1.2 GB.

You could try analysing libvips' own memory counters:

int64(C.vips_tracked_get_mem())
int64(C.vips_tracked_get_mem_highwater())
int64(C.vips_tracked_get_allocs())
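
For illustration, those counters can be wrapped in a small cgo helper (a sketch; the log format is arbitrary):

package main

/*
#cgo pkg-config: vips
#include <vips/vips.h>
*/
import "C"
import "log"

// logVipsMemory prints libvips' own allocation accounting, which tracks
// memory allocated at the C level and is invisible to the Go runtime stats.
func logVipsMemory() {
    log.Printf("vips: mem=%d bytes, highwater=%d bytes, allocs=%d",
        int64(C.vips_tracked_get_mem()),
        int64(C.vips_tracked_get_mem_highwater()),
        int64(C.vips_tracked_get_allocs()))
}

func main() {
    C.vips_init(C.CString("vips-mem-sketch"))
    defer C.vips_shutdown()

    // ... perform some resizes here ...
    logVipsMemory()
}
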
JorgeRdg commented 9 years ago

That's understandable and it's really not the problem here. The problem is that that memory is never released, so after some time the application runs out of memory. Maybe in some scenarios this is not a problem, but for a web server it definitely is.

h2non commented 9 years ago

Yeah, absolutely! Currently I can crash the server with a panic if the HTTP load is really heavy

DAddYE commented 9 years ago

@h2non, @JorgeRdg long-running HTTP servers require manually freeing memory; try adding something like this: https://github.com/goniltalk/niltalk/blob/master/niltalk.go#L68-L80
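
The general idea, sketched here without copying the linked code, is a background goroutine that periodically calls debug.FreeOSMemory():

package main

import (
    "log"
    "net/http"
    "runtime/debug"
    "time"
)

func main() {
    // Periodically hand freed pages back to the OS; long-running HTTP
    // servers otherwise tend to keep their RSS at its high-water mark.
    go func() {
        ticker := time.NewTicker(30 * time.Second)
        for range ticker.C {
            debug.FreeOSMemory()
        }
    }()

    http.HandleFunc("/resize", func(w http.ResponseWriter, r *http.Request) {
        // ... read the request body, resize with vips, write the result ...
        w.WriteHeader(http.StatusOK)
    })
    log.Fatal(http.ListenAndServe(":8080", nil))
}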

Meanwhile I made a test: https://github.com/DAddYE/vips/commit/6ff64b804f0e04e4156ceea8c16d56175fb18c23

You can try it with:

cd testdata
GOMAXPROCS=8 go run speed.go

Here's my screen before: [Before screenshot]

And after: [After screenshot]

Unfortunately Valgrind doesn't work with Go.

Last question: how did you install libvips?

h2non commented 9 years ago

Thank you @DAddYE for the information and the test. This may also help as useful background on the topic (especially the first answer): http://stackoverflow.com/questions/24376817/go-1-3-garbage-collector-not-releasing-server-memory-back-to-system

JorgeRdg commented 9 years ago

I don't have time to test it right now, I'll try to do it this afternoon.

I installed libvips from source, following this repo's README instructions.

DAddYE commented 9 years ago

Remember to use the flags posted in this repo's README. Those are optimized for Go and high concurrency.

h2non commented 9 years ago

I don't think that makes much sense, since the libvips concurrency level is restricted to 1: https://github.com/DAddYE/vips/blob/master/vips.go#L88

DAddYE commented 9 years ago

@h2non yes, you're right; I confused it with opencv/trez (my other project).

DAddYE commented 9 years ago

Any update on this?

JorgeRdg commented 9 years ago

I'm sorry for the lack of updates, but I haven't been able to test anything new, and I don't expect to be able to for a while :(

h2non commented 9 years ago

I'm doing some real tests in a similar scenario, and then I'll share the conclusions here.

h2non commented 9 years ago

Just a few brief notes to throw some light on this topic:

These days I'm running a system migration of a huge batch of images, and to do that I'm using an HTTP service written in Go and backed by libvips that does a sort of batch processing to convert and resize thousands of images. The service is running on Heroku on a basic dyno with 512 MB of RAM.

During the process I'm monitoring the OS memory, and the RSS is stable (usually under 150 MB) and is freed when necessary over time. Indeed, right now, minutes after running ~500 image resizes, the RSS has decreased to 87 MB.

It's worth mentioning that the server only accepts up to 10 concurrent requests. You should also take into account scenarios that may cause an unexpected increase in memory, and consider implementing some kind of throttling strategy to deal properly with a high number of concurrent requests (for instance, if your HTTP server is publicly available, you probably must do that to guarantee a minimum quality of service).
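
For illustration, that kind of throttle can be as small as a buffered-channel semaphore around the resize handler (a sketch; the limit of 10 matches the setup above, and the route name is arbitrary):

package main

import (
    "log"
    "net/http"
)

// sem bounds how many resizes may run at once, so a traffic spike queues
// requests instead of multiplying in-flight image buffers.
var sem = make(chan struct{}, 10)

func resizeHandler(w http.ResponseWriter, r *http.Request) {
    sem <- struct{}{}        // acquire a slot (blocks while 10 resizes are busy)
    defer func() { <-sem }() // release it when the request finishes

    // ... read the body, resize with vips, write the result ...
    w.WriteHeader(http.StatusOK)
}

func main() {
    http.HandleFunc("/resize", resizeHandler)
    log.Fatal(http.ListenAndServe(":8088", nil))
}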

DAddYE commented 9 years ago

Awesome @h2non, really appreciated!

Closing for now.

JorgeRdg commented 9 years ago

For future reference, please note that for long-running processes, DAddYE's solution at https://github.com/DAddYE/vips/issues/18#issuecomment-93065243 is required to free memory.

Thanks for the help everyone!