stratosphere / incubator-systemml

Mirror of Apache SystemML (Incubating)
Apache License 2.0

Implement other output formats #19

Closed FelixNeutatz closed 8 years ago

FelixNeutatz commented 8 years ago

Completing:

FelixNeutatz commented 8 years ago

I have two issues with completing WriteFLInstruction:

  1. The possibility to set the option to output one single file:
dataset.output(new TextOutputFormat<String>(new org.apache.flink.core.fs.Path(fname)));
MapReduceTool.mergeIntoSingleFile(randFName, fname);

The problem is that the merge runs before the execute statement, so no files have been created yet and the merge fails. We can discuss it on Monday.

  2. I also need the Flink-specific MatrixObject like the one Felix implemented

Apart from this, writing now works for the following formats: csv, text, mm, binary

fschueler commented 8 years ago

I have a branch with some of the IO stuff for Flink that can be merged into our staging next week.

Given that we have to keep track of defined sinks, I think we can start thinking about which methods to add to the FlinkExecutionContext that wraps the ExecutionEnvironment. We could define methods there that call execute() if necessary.
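A rough sketch of that idea, using only the JDK so it runs without Flink on the classpath. It models the ordering problem from the merge step: sinks are only registered up front, and anything that needs the written files (such as merging into a single file) is queued to run after execution. All names here (SinkTrackingContext, addSink, runAfterExecute, executeIfNecessary) are hypothetical and are not existing SystemML or Flink API; a real FlinkExecutionContext would presumably delegate to ExecutionEnvironment.execute() instead of running Runnables.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a context wrapping Flink's ExecutionEnvironment.
// It only models the ordering of sinks and post-execution actions.
class SinkTrackingContext {
    // Sinks registered by output(...)-style calls; nothing runs until execute().
    private final List<Runnable> pendingSinks = new ArrayList<>();
    // Actions that need the sink output to exist, e.g. merging part files.
    private final List<Runnable> afterExecute = new ArrayList<>();

    void addSink(Runnable sink) {
        pendingSinks.add(sink);
    }

    void runAfterExecute(Runnable action) {
        afterExecute.add(action);
    }

    boolean hasPendingSinks() {
        return !pendingSinks.isEmpty();
    }

    // Runs all registered sinks first, then the deferred actions.
    void execute() {
        List<Runnable> sinks = new ArrayList<>(pendingSinks);
        List<Runnable> actions = new ArrayList<>(afterExecute);
        pendingSinks.clear();
        afterExecute.clear();
        sinks.forEach(Runnable::run);
        actions.forEach(Runnable::run);
    }

    // Triggers execution only when there is outstanding work.
    void executeIfNecessary() {
        if (hasPendingSinks() || !afterExecute.isEmpty()) {
            execute();
        }
    }
}

class SinkTrackingDemo {
    // Returns the order in which the steps actually happened.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        SinkTrackingContext ctx = new SinkTrackingContext();

        // What a write instruction would do: register the sink and defer the merge.
        ctx.addSink(() -> log.add("write part files"));
        ctx.runAfterExecute(() -> log.add("merge into single file"));
        log.add("instruction returned");

        // Later, at a point where the results are actually needed.
        ctx.executeIfNecessary();
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

With this shape, the write instruction never merges directly; it only queues the merge, so the merge cannot run before the files exist. Whether the deferred actions belong in the context or in the MatrixObject is the open design question.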

In general the whole IO stuff is spread across the MatrixObject and ExecutionContext and I am not sure about how they decided on what goes where...