DaveAKing / guava-libraries

Automatically exported from code.google.com/p/guava-libraries
Apache License 2.0
0 stars 0 forks source link

Stop using FilterOutputStream #1651

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
A review of the HashingOutputStream source reveals a couple of corner cases 
that can result in erroneous results or silently-swallowed exceptions:

1) The hashcode of the streamed content is available independent of the stream 
state (open/flushed/closed).  In the case where the stream is wrapped (directly 
or indirectly) with a BufferedOutputStream (or any other stream that buffers 
the content), the HashingOutputStream won't yet have seen anything that is 
buffered, resulting in the hash() method calculating a likely incorrect 
hashcode (the hash would be technically correct, but not what the user expected 
as it isn't based on the full input).

This can be addressed by not allowing the hashcode to be available until after 
close() is called, throwing an exception if accessed otherwise (this uncovered 
a number of subtle errors when we implemented it):

    public HashCode getHashCode()
    {
        checkState( hashCode != null, "Attempted to retrieve hashCode without having closed the stream first" );
        return hashCode;
    }

    @Override
    public void close() throws IOException
    {
        hashCode = hasher.hash();
        super.close();
    }

2) HashingOutputStream inherits from FilterOutputStream, which implements 
close() as:

    public void close() throws IOException {
        try {
          flush();
        } catch (IOException ignored) {
        }
        out.close();
    }

The call to flush() is appropriate here - but the swallowing of any exception 
is not.  In the case of buffered streams, there may be content left to be 
flushed - and the underlying stream may throw an exception processing that 
content (network issue, disk full, ...).

This is the same issue that resulted in Closeables.closeQuietly being 
deprecated (https://code.google.com/p/guava-libraries/issues/detail?id=1118).

IMO, java.io.FilterOutputStream is not safe to use in it's current state - 
we've banished it from our code base, in favor of an alternate superclass that 
behaves correctly (and also clarifies the overriding / delegation of write() 
calls).

Original issue reported on code.google.com by ch...@digitalascent.com on 30 Jan 2014 at 4:22

GoogleCodeExporter commented 9 years ago
Thanks. I'm going to split this into two bugs, one for early calls to hash() 
and one for the use of FilterOuputStream.

Original comment by cpov...@google.com on 30 Jan 2014 at 4:31

GoogleCodeExporter commented 9 years ago
HashingOutputStream, CountingOutputStream, and LittleEndianDataOutputStream use 
it, as does TestOutputStream.

Original comment by cpov...@google.com on 30 Jan 2014 at 4:33

GoogleCodeExporter commented 9 years ago
(I've filed issue 1652 for the early calls to hash().)

Original comment by cpov...@google.com on 30 Jan 2014 at 4:37

GoogleCodeExporter commented 9 years ago
Sounds good, thx.

fyi, there is a JDK bug related to FilterOutptuStream: 
http://bugs.java.com/view_bug.do?bug_id=6335274, marked as wont fix with a weak 
justification:

"...FilterOutputStream.flush() and FOS.close() have not changed in many years.  
Any changes to them now would introduce potentially unpleasant side-effects in 
existing customer code. "

Original comment by ch...@digitalascent.com on 30 Jan 2014 at 4:50

GoogleCodeExporter commented 9 years ago
Thanks. I wonder if we should add our own FilterOutputStream (which we'd 
probably call ForwardingOutputStream according to our usual convention).

Original comment by cpov...@google.com on 30 Jan 2014 at 4:55

GoogleCodeExporter commented 9 years ago
Interestingly (and this came up before I think), FilterOutputStream's close() 
method actually has been fixed in JDK8, to:

try (OutputStream ostream = out) {
  flush();
}

which does not swallow the exception. I would like the idea of adding a 
ForwardingOutputStream class more if they hadn't done that, but it still might 
be reasonable to add.

Original comment by cgdecker@google.com on 30 Jan 2014 at 5:09

GoogleCodeExporter commented 9 years ago
That's what we ended up doing.  Aside from the exception issue in 
FilterOutputStream, the overall design is too coupled to 'filtering' the 
content, which is problematic for other use cases (such as rate limiting, size 
counting, etc.).  

For example, all write() variations funnel down to write( int ) - which is 
great for filtering byte-by-byte, but inefficient (and misleading) for other 
operations (example: the cost of rate limit checking on a byte-by-byte basis is 
excessive).  

Take this lovely sub-class use-case for limiting the number of bytes written:

write( byte[] b )
{
   checkSize( b.length );
   super.write( b );
}

write( byte[] b, int off, int len )
{
   checkSize( len );
   super.write( b, off, len );
}

write( int i )
{
   checkSize( 1 );
   super.write( i );
}

...when using FilterOutputStream, this results in incorrect size calculations 
as super.write( b ) calls write( b, 0, b.len ), which calls write( b[idx ] ).  
Subclasses of FilterOutputStream are forced to know far too much about the 
internals of the superclass to get the correct behaviour.

We ultimately settled on a simple forwarding model - all calls are forwarded 
straight through, with the exception of close() which also calls flush() 
(propagating exceptions from either). 

Original comment by ch...@digitalascent.com on 30 Jan 2014 at 5:19

GoogleCodeExporter commented 9 years ago
http://code.google.com/p/guava-libraries/source/detail?r=fa8103a40cab574bad32779
d45c38118adfed092

Original comment by cpov...@google.com on 19 Mar 2014 at 5:43

GoogleCodeExporter commented 9 years ago
This issue has been migrated to GitHub.

It can be found at https://github.com/google/guava/issues/<issue id>

Original comment by cgdecker@google.com on 1 Nov 2014 at 4:10

GoogleCodeExporter commented 9 years ago

Original comment by cgdecker@google.com on 3 Nov 2014 at 9:07