NICTA / scoobi

A Scala productivity framework for Hadoop.
http://nicta.github.com/scoobi/

configureCompression only creates compressor once (fixes compression problems with Deflate and Snappy) #352

Closed jbeynon closed 9 years ago

jbeynon commented 9 years ago

Since 0.8.5 I noticed that compression was broken when using Avro but was never bothered enough to look into it. With the latest 0.9.0 release I finally took some time and found the issue. The problem manifests itself as either an OutOfMemoryError or YARN killing tasks for going "beyond memory limits" when using Deflate or Snappy. It isn't specific to Avro; I only noticed it because Avro silently changes GZip to Deflate.

2014-11-21 20:00:33,234 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Direct buffer memory
    at java.nio.Bits.reserveMemory(Bits.java:658)
    at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
    at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
    at org.apache.hadoop.io.compress.snappy.SnappyCompressor.<init>(SnappyCompressor.java:82)
    at org.apache.hadoop.io.compress.SnappyCodec.createCompressor(SnappyCodec.java:147)
    at com.nicta.scoobi.core.Compression$$anonfun$getCompressor$1.apply(DataSink.scala:95)
    at com.nicta.scoobi.core.Compression$$anonfun$getCompressor$1.apply(DataSink.scala:95)
    at com.nicta.scoobi.impl.control.Exceptions$class.trye(Exceptions.scala:106)
    at com.nicta.scoobi.impl.control.Exceptions$.trye(Exceptions.scala:128)
    at com.nicta.scoobi.core.Compression$.getCompressor(DataSink.scala:95)
    at com.nicta.scoobi.core.DataSink$$anonfun$configureCompression$2.apply(DataSink.scala:55)
    at com.nicta.scoobi.core.DataSink$$anonfun$configureCompression$2.apply(DataSink.scala:54)
    at scala.Option.filter(Option.scala:181)
    at com.nicta.scoobi.core.DataSink$class.configureCompression(DataSink.scala:54)
    at com.nicta.scoobi.io.text.TextFileSink.configureCompression(TextFileSink.scala:38)
    at com.nicta.scoobi.io.text.TextFileSink.configureCompression(TextFileSink.scala:38)
    at com.nicta.scoobi.impl.plan.mscr.MscrOutputChannel$$anon$1$$anonfun$write$1.apply(OutputChannel.scala:172)
    at com.nicta.scoobi.impl.plan.mscr.MscrOutputChannel$$anon$1$$anonfun$write$1.apply(OutputChannel.scala:171)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at com.nicta.scoobi.impl.plan.mscr.MscrOutputChannel$$anon$1.write(OutputChannel.scala:171)
    at com.nicta.scoobi.core.EnvDoFn$$anon$1.emit(EnvDoFn.scala:38)

This stack trace, from explicitly using Snappy, is what helped me. Basically the issue is that DataSink.configureCompression is being called for every emit from EnvDoFn. It creates a compressor to test that the settings are working and then promptly discards it. The problem with this is that the Deflate and Snappy compressors create off-heap buffers using ByteBuffer.allocateDirect, and this memory does not get GC'd as you'd expect. So for each emit you get a 64 KB memory leak (that's for Snappy; I'm not sure of the buffer size for Deflate).
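To make the mechanism concrete, here is a minimal sketch in plain Java rather than Scoobi's Scala. It is not Scoobi or Hadoop code: `ThrowawayCompressor`, `simulateEmits`, and the `requestedBytes` counter are hypothetical stand-ins, and the 64 KB size is just the Snappy figure mentioned above. It only shows how a per-emit "create a compressor, check it, drop it" step keeps requesting direct memory, which the JVM releases only after the owning buffer objects are collected.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicLong;

public class DirectBufferPerEmit {
    // Assumed buffer size, taken from the 64 KB Snappy figure in the discussion.
    static final int BUFFER_SIZE = 64 * 1024;

    // Tracks how many bytes of direct memory have been requested so far.
    static final AtomicLong requestedBytes = new AtomicLong();

    // Hypothetical stand-in for a compressor that reserves an off-heap buffer
    // in its constructor.
    static final class ThrowawayCompressor {
        private final ByteBuffer buffer;

        ThrowawayCompressor() {
            buffer = ByteBuffer.allocateDirect(BUFFER_SIZE);
            requestedBytes.addAndGet(BUFFER_SIZE);
        }

        int capacity() {
            return buffer.capacity();
        }
    }

    // Mimics a configuration step that builds a compressor only to verify
    // that it can be created, then discards it.
    static void configureCompression() {
        ThrowawayCompressor compressor = new ThrowawayCompressor();
        if (compressor.capacity() != BUFFER_SIZE) {
            throw new IllegalStateException("unexpected buffer size");
        }
    }

    // Calls the configuration step once per simulated emit and returns the
    // total number of direct-memory bytes requested along the way.
    static long simulateEmits(int emits) {
        requestedBytes.set(0);
        for (int i = 0; i < emits; i++) {
            configureCompression();
        }
        return requestedBytes.get();
    }

    public static void main(String[] args) {
        long bytes = simulateEmits(1000);
        System.out.println("direct memory requested: " + bytes + " bytes");
    }
}
```

The requested total grows linearly with the number of emits, which would be consistent with tasks eventually hitting a direct-memory or container limit on large outputs.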

Anyway, so I added a simple check in DataSink so that configureCompression only actually acts once and things seem to work now. I haven't run a full regression but this is a pretty mild change.
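The idea of that check can be sketched roughly as follows. This is an illustration in plain Java, not the actual patch to DataSink: `ConfigureOnceSink`, `createAndCheckCompressor`, and the counter are hypothetical names, and the real change is in Scala.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class ConfigureOnceSink {
    // Flips to true the first time configureCompression does its work.
    private final AtomicBoolean configured = new AtomicBoolean(false);

    // Counts how often the expensive step ran (for illustration only).
    private final AtomicInteger compressorsCreated = new AtomicInteger();

    // Hypothetical stand-in for "create a compressor to test the settings".
    private void createAndCheckCompressor() {
        compressorsCreated.incrementAndGet();
    }

    // Safe to call on every emit: only the first call does any work.
    public void configureCompression() {
        if (configured.compareAndSet(false, true)) {
            createAndCheckCompressor();
        }
    }

    // Simulated per-record write path that still calls configureCompression
    // each time, as the output channel does in the stack trace.
    public void write(String record) {
        configureCompression();
        // Actual record writing is omitted in this sketch.
    }

    public int compressorsCreated() {
        return compressorsCreated.get();
    }

    public static void main(String[] args) {
        ConfigureOnceSink sink = new ConfigureOnceSink();
        for (int i = 0; i < 1000; i++) {
            sink.write("record-" + i);
        }
        System.out.println("compressors created: " + sink.compressorsCreated());
    }
}
```

An `AtomicBoolean` is used here so the guard also holds if several threads hit the sink; a plain boolean flag would serve the same purpose for single-threaded writes.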

By the way, this would have been much easier to diagnose with debug output: configureCompression has a debug log in it, so you'd see thousands of messages in the logs. But I couldn't get debug output working in the latest build. No matter what I tried I either got default logging or no logging.

jbeynon commented 9 years ago

@etorreborre Can you please take a look at this? It's been a major blocker in upgrading past Scoobi 0.8.3 and is necessary for anyone using Avro and compression.

markhibberd commented 9 years ago

@jbeynon I have re-triggered the build, to see if it goes green this time. If so, I will merge.

xelax commented 9 years ago

Thanks Mark! And if you could publish 0.9.1 it would be awesome ;-)

etorreborre commented 9 years ago

Released now (sorry for not having been very reactive with the merge, thanks @markhibberd!).

jbeynon commented 9 years ago

No worries. I know how easy it is for notifications to get lost in the noise.

xelax commented 9 years ago

Eric and Mark, thank you! About the release: I believe you forgot to publish the 2.10 version. https://oss.sonatype.org/content/repositories/releases/com/nicta/scoobi_2.11/ has 0.9.1, but it is missing from https://oss.sonatype.org/content/repositories/releases/com/nicta/scoobi_2.10/

etorreborre commented 9 years ago

Alex, the release is now available for 2.10.