Open fschueler opened 8 years ago
I implemented a first working version for TextCellToBinaryBlock. But there is a problem:
They use a flatMap for TextToBinaryBlockFunction. There they create a buffer and only submit records when the buffer is full. So I would need something like this:
private static class TextToBinaryBlockFunction extends RichFlatMapFunction<String,Tuple2<MatrixIndexes,MatrixBlock>>
{
ReblockBuffer rbuff = null;
Collector<Tuple2<MatrixIndexes,MatrixBlock>> outHandle = null;
@Override
public void open(Configuration parameters) {
rbuff = new ReblockBuffer();
}
@Override
public void flatMap(String text, Collector<Tuple2<MatrixIndexes,MatrixBlock>> out)
{
outHandle = out;
//flush buffer if necessary
if (rbuff.getSize() >= rbuff.getCapacity()) {
flushBufferToList(out, rbuff);
}
//add value to reblock buffer
rbuff.appendCell(text);
}
@Override
public void close() throws Exception {
//final flush buffer
flushBufferToList(outHandle, rbuff);
}
}
But the problem is that it seems that it is not possible to emit records in the close() method. Does anybody has an idea? Currently I just emit everytime which at least works ...
MapPartition solves the problem :)
I finished ReblockFLInstruction overall