mozilla / jydoop

Efficient Hadoop Map-Reduce in Python
Other
31 stars 19 forks source link

Expose HBase timestamps to the map function #51

Closed mreid-moz closed 7 years ago

mreid-moz commented 10 years ago

This also adds an hbase-specific setup helper function to reduce the amount of boilerplate required to set up an hbase scan.

As a minimal example:

import hbaseutils

def setupjob(job, args):
    hbaseutils.setup_full_scan_job("user_profile", job, args)
    job.getConfiguration().set("org.mozilla.jydoop.include_row_timestamp", "true")

def map(key, value, ts, cx):
    cx.write(key, ts)
mreid-moz commented 10 years ago

I merged in oyiptong's change to expose "getConfiguration" from the context.