stratosphere / incubator-systemml

Mirror of Apache SystemML (Incubating)
Apache License 2.0
1 stars 4 forks source link

Utility Implementation Meta Issue #3

Open fschueler opened 8 years ago

fschueler commented 8 years ago
FelixNeutatz commented 8 years ago

I implemented a first working version for TextCellToBinaryBlock. But there is a problem:

They use a flatMap for TextToBinaryBlockFunction. There they create a buffer and only submit records when the buffer is full. So I would need something like this:

private static class TextToBinaryBlockFunction extends RichFlatMapFunction<String,Tuple2<MatrixIndexes,MatrixBlock>>
{
    ReblockBuffer rbuff = null;
    Collector<Tuple2<MatrixIndexes,MatrixBlock>> outHandle = null;

    @Override
    public void open(Configuration parameters) {
        rbuff = new ReblockBuffer();
    }

    @Override
    public void flatMap(String text, Collector<Tuple2<MatrixIndexes,MatrixBlock>> out) 
    {
        outHandle = out;

        //flush buffer if necessary
        if (rbuff.getSize() >= rbuff.getCapacity()) {
            flushBufferToList(out, rbuff);
        }

        //add value to reblock buffer
        rbuff.appendCell(text);         
    }

    @Override
    public void close() throws Exception {
        //final flush buffer
        flushBufferToList(outHandle, rbuff);
    }
}

But the problem is that it seems that it is not possible to emit records in the close() method. Does anybody has an idea? Currently I just emit everytime which at least works ...

FelixNeutatz commented 8 years ago

MapPartition solves the problem :)

FelixNeutatz commented 8 years ago

I finished ReblockFLInstruction overall