twitter / summingbird

Streaming MapReduce with Scalding and Storm
https://twitter.com/summingbird
Apache License 2.0
2.14k stars 267 forks

Add `modifyBeforeWrite` extension point in `VersionedBatchStore` #771

ttim closed 5 years ago

ttim commented 5 years ago

To better support various GDPR use cases at Twitter, we need a way to modify stored (K, V) pairs in Summingbird's Scalding platform.

In this PR I've added a protected `modifyBeforeWrite` method to `VersionedBatchStore` to meet this need.
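To illustrate the shape of such an extension point, here is a minimal, self-contained sketch. It is not the actual diff: `SimpleBatchStore`, `ErasingStore`, and the `Iterator`-based signature are hypothetical stand-ins chosen for illustration, and the real `VersionedBatchStore` method operates on Scalding types and may look quite different.

```scala
// Hypothetical, simplified stand-in for a batch store; NOT Summingbird's real API.
class SimpleBatchStore[K, V] {
  // Records that have been "written", kept in memory for illustration only.
  private var written: Vector[(K, V)] = Vector.empty

  // Extension point: subclasses may transform or drop records before they are stored.
  // The default implementation leaves the data unchanged.
  protected def modifyBeforeWrite(records: Iterator[(K, V)]): Iterator[(K, V)] = records

  // Every write goes through the hook, so an override applies to all stored pairs.
  final def write(records: Iterator[(K, V)]): Unit =
    written = written ++ modifyBeforeWrite(records)

  def contents: Vector[(K, V)] = written
}

// Example override (assumed use case): drop pairs whose key must be erased.
class ErasingStore(erased: Set[String]) extends SimpleBatchStore[String, Long] {
  override protected def modifyBeforeWrite(
      records: Iterator[(String, Long)]): Iterator[(String, Long)] =
    records.filterNot { case (key, _) => erased.contains(key) }
}

object Demo {
  def main(args: Array[String]): Unit = {
    val store = new ErasingStore(Set("user-2"))
    store.write(Iterator("user-1" -> 3L, "user-2" -> 5L))
    println(store.contents)
  }
}
```

Keeping the hook `protected` with a pass-through default means existing stores behave as before, and only subclasses that opt in change what gets written.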

codecov-io commented 5 years ago

Codecov Report

:exclamation: No coverage uploaded for pull request base (develop@433c93f). The diff coverage is n/a.

@@            Coverage Diff             @@
##             develop     #771   +/-   ##
==========================================
  Coverage           ?   71.49%           
==========================================
  Files              ?      151           
  Lines              ?     3613           
  Branches           ?      209           
==========================================
  Hits               ?     2583           
  Misses             ?     1030           
  Partials           ?        0

| Impacted Files | Coverage Δ |
|---|---|
| ...mmingbird/scalding/store/VersionedBatchStore.scala | 59.25% <ø> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 433c93f...d3a3834.

ttim commented 5 years ago

@johnynek are you ok with this change?

johnynek commented 5 years ago

totally fine with me.

Sorry for the delay.

Our data is small enough that we just recompute everything every so many days. I know this isn't practical for you.