facebook / proxygen

A collection of C++ HTTP libraries including an easy to use HTTP server.
Other
8.04k stars 1.47k forks source link

"Send in chunks" will cache all chunks in memory? #181

Closed corndolly closed 6 years ago

corndolly commented 6 years ago

I send one big file to client in chunks with send() and sendWithEOM() in ResponseBuilder.

After read one chunk in file, i will send it immediately with send().
But i found that the whole file content will be cached in memory after call send().

Is it a bug here?

dddmello commented 6 years ago

Hmm took a look and I see what you are saying:

ResponseBuilder.h:102 appears to chain the chunks. Will get back to you.

suspend0 commented 6 years ago

ResponseBuilder.h#L157 (in send) seems to clear the chain tho'

dddmello commented 6 years ago

Yeah - corndolly would you be able to provide some sample code to make it clearer what behavior you are seeing that is not expected?

dddmello commented 6 years ago

Going to close as haven't heard anything back.

corndolly commented 6 years ago

Sorry to be late, the sample code is like below (I want to implement "download file" with proxygen)

                int32_t fd = ::open(path.c_str(), O_RDONLY);
                if (fd < 0) {
                    LOG(INFO) << "Open file " << path << " failed! error:" << strerror(errno);
                    proxygen::ResponseBuilder(downstream_)
                            .status(400, "Open file " + path + " failed! error:" + strerror(errno))
                            .sendWithEOM();
                } else {
                    uint64_t pos = 0;
                    uint64_t count = 0;
                    uint64_t totalLen = ::lseek(fd, 0, SEEK_END);
                    bool headerSent = false;
                    std::string buffer;
                    while (count <= totalLen) {
                        uint32_t bufferLen = totalLen - pos < FLAGS_chunk_size ? totalLen - pos
                                                                               : FLAGS_chunk_size;
                        if (bufferLen != buffer.size()) {
                            buffer.resize(bufferLen, '\0');
                        }
                        count = ::pread(fd, &buffer[0], bufferLen, pos);
                        if (count == 0) {
                            break;
                        }
                        pos += count;
                        proxygen::ResponseBuilder rpBuilder(downstream_);
                        if (!headerSent) {
                            rpBuilder.status(200, "OK");
                            headerSent = true;
                        }
                        if (pos != totalLen) {
                            /**
                             * TODO: The memory will not be released after call send.
                             * IMO it is one issue for proxygen, will track it with
                             * https://github.com/facebook/proxygen/issues/181
                             * */
                            rpBuilder.body(buffer).send();
                        } else {
                            rpBuilder.body(buffer).sendWithEOM();
                        }
                    }
                    LOG(INFO) << "Download file " << path << " successfully! readsize = "
                              << pos << ", total = " << totalLen;
                    return;
                }
dddmello commented 6 years ago

1) Seems like you are using a new instance of RpBuilder in each iteration of the while loop which I'm not sure you should do as in the future I can see the class being extending to track some state. [Edit: Removed (2) as I saw it wrong.]

suspend0 commented 6 years ago

https://github.com/facebook/proxygen/blob/master/proxygen/httpserver/ResponseBuilder.h#L112

  template <typename T>
  ResponseBuilder& body(T&& t) {
    return body(folly::IOBuf::maybeCopyBuffer(
        folly::to<std::string>(std::forward<T>(t))));
  }

In the code posted by @corndolly buffer is a standard string, and it's passed to ResponseBuilder::body(T&&) -- that method makes a copy of buffer.

In my read of the code, the leak is not in proxygen; it's that std::string buffer is outside the loop, and only appended to. Resetting the buffer should solve you problem:

rpBuilder.body(buffer).send();
buffer.clear();

ResponseBuilder offers non-copy version of body() too (using std::unique_ptr<folly::IOBuf>) should that suit you.

corndolly commented 6 years ago

Thanks! @suspend0 @dddmello I follow the @suspend0 suggestion with std::unique_ptr, and the issue has been fixed now.

corndolly commented 6 years ago

Sorry, i found another issue, i change the code like below, but found the data received is NOT the same with the original data.
If i use folly::IOBuf::copyBuffer instead of folly::IOBuf::wrapBuffer, everything is OK.
send method will send the content synchronously?

                    uint64_t pos = 0;
                    uint64_t count = 0;
                    uint64_t totalLen = ::lseek(fd, 0, SEEK_END);
                    bool headerSent = false;
                    char buffer[FLAGS_chunk_size];
                    while (count <= totalLen) {
                        uint32_t bufferLen = totalLen - pos < FLAGS_chunk_size ? totalLen - pos
                                                                               : FLAGS_chunk_size;
                        count = ::pread(fd, buffer, bufferLen, pos);
                        if (count == 0) {
                            break;
                        }
                        pos += count;
                        proxygen::ResponseBuilder rpBuilder(downstream_);
                        if (!headerSent) {
                            rpBuilder.status(200, "OK");
                            headerSent = true;
                        }
                        if (pos != totalLen) {
                            rpBuilder.body(folly::IOBuf::wrapBuffer(buffer, bufferLen)).send();
                        } else {
                            rpBuilder.body(folly::IOBuf::wrapBuffer(buffer, bufferLen))
                                     .sendWithEOM();
                        }
                    }
                    LOG(INFO) << "Download file " << path << " successfully! readsize = "
                              << pos << ", total = " << totalLen;
                    return;
suspend0 commented 6 years ago

As far as I know, send() is not synchronous. If you look a the impl in ResponseBuilder and the http layer (this is all from memory) it looks like it is, but deeper down when you actually get to the async socket impl it could queue the send request. You must give ownership of the buffer to send()

You could replace char buffer[] with an IOBuf, read directly into the buf and then std::move() it to send(), or you could revert to earlier code, let send(std::string) copy and then you can clear your string. In both cases you're doing allocation each iteration of the loop.

corndolly commented 6 years ago

Got it. Thanks @suspend0 for your explain.