tsgrp / OpenContent

TSG's Web Services for ECM Repositories

RESTContent Needs Buffered Reading of InputStream To Avoid Memory Overuse #21

Open mbowen000 opened 10 years ago

mbowen000 commented 10 years ago

It appears that our getContent() calls correctly set a DataHandler as the content property on EnhancedObjectContent objects. However, whenever we read content from the InputStream returned by that DataHandler, we must not blindly copy all of the bytes from the InputStream to the OutputStream in one pass (as RESTContent does). Doing so inflates memory usage at copy time to the full size of the content, which causes heap memory overflows.

Instead, we should look at using IOUtils from Apache Commons IO to get a buffered InputStream from the source and do a looped, buffered read of the content into the server response's output stream. The following pseudo-code demonstrates the concept:

Open input stream
Open output stream
create byte buffer
while (read stuff into byte buffer) {
    write byte buffer to output stream
}

This way, each copy only holds one byte[] buffer's worth of content in memory at a time (per thread), which will drastically reduce memory usage.
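
For reference, the pseudo-code above could translate to Java roughly as follows. This is only a minimal sketch, not the actual RESTContent code: the class name StreamCopy, the method name copy, and the 8 KB buffer size are all assumptions picked for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper for illustration; not part of OpenContent.
final class StreamCopy {

    // Assumed buffer size; memory used per copy is bounded by this value.
    private static final int BUFFER_SIZE = 8 * 1024;

    private StreamCopy() {
    }

    // Copies everything from in to out through a fixed-size buffer and
    // returns the number of bytes copied. The caller closes both streams.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        int read;
        // read() returns -1 at end of stream; otherwise the number of
        // bytes it placed in the buffer, which may be less than its size.
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }
}
```

In RESTContent this would presumably be called with the stream from DataHandler.getInputStream() as the source and the response's output stream as the target, with the input stream closed afterwards (for example in a try-with-resources block). If we would rather not maintain the loop ourselves, IOUtils.copy(InputStream, OutputStream) from Commons IO performs the same kind of buffered loop internally.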

This is just one bug of many. I will attach more here: [ Attach related bug here ]

gsteimer commented 10 years ago

@mbowen000 - Please add a milestone and assignee if this is actually high priority.