studio-b12 / gowebdav

A golang WebDAV client library and command line tool.
BSD 3-Clause "New" or "Revised" License
309 stars 89 forks source link

memleak during multiple file upload when authentication is required #24

Closed ilyalavrinov closed 1 year ago

ilyalavrinov commented 6 years ago

Hello Collaborators,

Describe the bug A short description of what you think the bug is.

Software

To Reproduce I've started to build a small tool which would allow file syncing via WebDAV. It's not fully ready but at least it allows to scan a folder and create a list of files which should be uploaded. The flow of the program now is the following:

  1. Create list of files for uploading
  2. Create N goroutines, each has its own gowebdav.Client
  3. Filenames for uploading are sent to these goroutines via channels (fan-out), then each file is uploaded independently via Client.WriteStream. The file is closed via defer.
  4. I used WebDAV API for Yandex.Disk, but I believe the same problem could be reproduced with any server with configured Basic authentication.

What I observe is enormous RAM usage by my tool (e.g. even if a folder contains files for 1G, it can easily take up to several gigabytes of RAM). I've tried to narrow the scope of the problem and here's what I've got:

I believe there's something wrong with current usage of TeeReader introduced in #b45378c. It looks like persistConn inside of a Transport somehow doesn't allow to collect the unused objects (i.e even if a new file gets uploaded, the data for the old one doesn't get cleared). Unfortunately I'm pretty new to Go and couldn't resolve the issue by myself.

Expected Well, no memleak

Additional context Just in case you're curious how do I use gowebdav exactly, here is the tool I'm using.

Please let me know if you need any additional info.

chripo commented 6 years ago

thanks for reporting. i'll take a look.

chripo commented 6 years ago

i took a look inside your code and ours. everything seems to be fine, or i can't spot the error atm.

we don't use a global state or caching or something else, therefore GO should collect all memory. and it does it, but in it's on own way.

if some memory gets collected it's still available to the the application, so the application can use it again without acquire more from the system, which is much cheaper.

you can force it by calling FreeOsMemory

ilyalavrinov commented 6 years ago

Thanks! I'll take a look at "runtime/debug" to dig deeper into this issue and probably collect some more data. However it still seems a bit strange that:

Anyway, I'll play a bit with tools from "runtime/debug", double-check that the issue is present in the latest version of gowebdav and maybe narrow down the issue with some simple example + test config for local nginx/apache. I'll come back in a few days if I get any results.

chripo commented 6 years ago

Yes the memory consumption should be doubled since #14 because of teeing the body, which sucks badly. We still need to refactor this case and do some strategy for servers which prefere the auth in this way, see #15.

defere bb.Reset() may speed up GC.

chripo commented 6 years ago

@admirallarimda did you took a deeper look?

ilyalavrinov commented 6 years ago

Unfortunately, not yet. I'll definitely try it out some day, but currently I cannot predict when I would have time for this. If you think everything looks fine, feel free to close the issue. Once I dig deeper into the issue and prove that there's a real problem, I'll reopen it or send a PR if I manage to fix it somehow.

ochanism commented 6 years ago

https://github.com/studio-b12/gowebdav/blob/425530b55ef4712a38177d133693c7601824c8dd/requests.go#L16

TeeReader requires as much memory as the requested file size. This code needs to be improved to avoid out of memory when uploading large-size files.

fghezzo commented 4 years ago

I also found the memory leak reported in this issue. As a solution, after this line: https://github.com/studio-b12/gowebdav/blob/321978fa735d5f8c3eaeca2c705a1ada7d34abbe/requests.go#L45 I've read the body of the response as it wasn't collected by GC. This would resolve the memory leak and also a not reported goroutine leak.

_, err = ioutil.ReadAll(rs.Body)
if err != nil {
    return nil, err
}