fullcontact / hadoop-sstable

Splittable Input Format for Reading Cassandra SSTables Directly
Apache License 2.0

Job tuning -- memory #3

Closed cjwooo closed 10 years ago

cjwooo commented 10 years ago

How much memory are you guys giving to each map task? Our use case involves multiple 20-40GB SSTables and we can't seem to get around the Java heap space error.

bvanberg commented 10 years ago

Right now we are running with 2GB heaps for our jobs, but I think we could run with less. Can you pass along some more information? Perhaps the configuration you are using to run your jobs along with any errors you might be seeing in logs. Glad to help you figure this out.
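
For context, on Hadoop 1.x-era clusters the map-task heap is typically controlled by `mapred.child.java.opts` in the job configuration. A sketch of what a 2GB setting might look like (property name is stock Hadoop, not specific to hadoop-sstable; adjust to your distribution):

    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx2048m</value>
    </property>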

Ben.


cjwooo commented 10 years ago

Here's our job config: https://gist.github.com/chris-tempoai/00024a5ed8447d16da59#file-hadoop-emr-xml Keyspace and column family are set elsewhere. Our workflow is similar to yours except we use snapshots from nodetool instead of Priam and we run the data analysis directly on the SSTables instead of an intermediate format.

Our jobs fail within 5 minutes, and the only errors we're getting are:

    Error: Java heap space
    attempt_201408252319_0002_m_000000_0: SLF4J: Class path contains multiple SLF4J bindings.
    attempt_201408252319_0002_m_000000_0: SLF4J: Found binding in [jar:file:/home/hadoop/lib/slf4j-log4j12-1.7.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    attempt_201408252319_0002_m_000000_0: SLF4J: Found binding in [jar:file:/mnt/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201408252319_0002/jars/job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    attempt_201408252319_0002_m_000000_0: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    attempt_201408252319_0002_m_000000_0: SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

We had to edit the SSTableRecordReader because we used Thrift statements instead of CQL:

    create column family ** with
      comparator = 'UTF8Type' and
      default_validation_class = 'UTF8Type' and
      key_validation_class = 'UTF8Type' and
      compression_options = {sstable_compression: SnappyCompressor, chunk_length_kb: 64};

Our edit to SSTableRecordReader: https://gist.github.com/chris-tempoai/00024a5ed8447d16da59#file-sstablerecordreader-java

I've stripped one of our jobs down to a bare-minimum, inexpensive analysis, and it made no difference.

Thanks a bunch for your help and great work on this project.

bvanberg commented 10 years ago

This may highlight an issue with the documentation, or lack thereof, but have you run the indexer prior to running the mapreduce jobs?

https://github.com/fullcontact/hadoop-sstable/blob/0d1c79656617a7b8c12b0127b8c2d12ed2e84083/sstable-core/src/main/java/com/fullcontact/sstable/index/SSTableIndexIndexer.java
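
For anyone finding this later: the indexer is run as a standalone Hadoop program against the SSTable directory before submitting the MapReduce job. Something along these lines (the jar name, version placeholder, and path are illustrative, not confirmed by this thread):

    hadoop jar sstable-core-<version>.jar \
      com.fullcontact.sstable.index.SSTableIndexIndexer \
      hdfs:///path/to/sstables/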


cjwooo commented 10 years ago

Yes, and I set the SSTable split size to 512 to match the MR job parameters.
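
For reference, the split size in question is the hadoop-sstable split property, expressed in megabytes. A sketch of the 512 setting in job config (property name taken from the project's configuration constants; treat it as an assumption and verify against your version):

    <property>
      <name>hadoop.sstable.split.mb</name>
      <value>512</value>
    </property>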

bvanberg commented 10 years ago

This may be difficult to debug without seeing the jobs, output, etc. Any chance you can fire over your job.xml?


cjwooo commented 10 years ago

We figured it out; it was a problem with our data. Thanks for the help though!

bvanberg commented 10 years ago

Glad to hear it! If you do end up with any additional tunings/optimizations we'd love to hear about them.

Thanks again,

Ben.
