multiply-org / multiply-core

The core functionality of the MULTIPLY platform

Tiling #5

Open bulli92 opened 7 years ago

bulli92 commented 7 years ago

Currently, no tiling is foreseen in the dummy version. This is OK for the beginning.

The question I have is where the tiling will be done in the end. This is also particularly relevant for parallelization. I think deciding this soon is important, as it has implications for the structure of the overall code development.

As I see it, the dummy version could represent the processing for a single workflow: it receives the coordinates of the target area and does the entire processing for that area.

The engine calling this processor (our current dummy) would then be responsible for splitting the processing of a larger area into different chunks, distributing them across different computing nodes, and collecting the results again (map-reduce).

If this is the baseline, then we don't need to think about parallelization at all for the dummy, as this will be handled at a higher level in the end. A rough sketch of what I mean is below.
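To make the idea concrete, here is a minimal sketch of that engine-level split/distribute/collect step. All names (`split_area`, `process_area`, `run`) are hypothetical, and a local process pool stands in for real compute nodes; this is only meant to anchor the discussion, not to propose an implementation:

```python
from concurrent.futures import ProcessPoolExecutor
from typing import List, Tuple

BBox = Tuple[float, float, float, float]  # (lon_min, lat_min, lon_max, lat_max)


def split_area(area: BBox, n_lon: int, n_lat: int) -> List[BBox]:
    """Split the target area into n_lon x n_lat rectangular chunks (the 'map' inputs)."""
    lon_min, lat_min, lon_max, lat_max = area
    d_lon = (lon_max - lon_min) / n_lon
    d_lat = (lat_max - lat_min) / n_lat
    return [(lon_min + i * d_lon, lat_min + j * d_lat,
             lon_min + (i + 1) * d_lon, lat_min + (j + 1) * d_lat)
            for i in range(n_lon) for j in range(n_lat)]


def process_area(chunk: BBox) -> dict:
    """Placeholder for the dummy processor: runs the full workflow for one chunk."""
    return {'area': chunk, 'result': None}  # a real processor would return products here


def run(area: BBox, n_lon: int = 4, n_lat: int = 4) -> List[dict]:
    """Hypothetical engine-level orchestration: tile, distribute, collect (map-reduce)."""
    chunks = split_area(area, n_lon, n_lat)
    with ProcessPoolExecutor() as pool:       # stand-in for distribution across nodes
        results = list(pool.map(process_area, chunks))
    return results                            # 'reduce' step: collect / mosaic the results


if __name__ == '__main__':
    print(len(run((10.0, 45.0, 12.0, 47.0))))  # 16 chunks for the default 4x4 tiling
```

In this picture the dummy only ever sees one chunk; all tiling and parallelization lives in the calling engine.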

@TonioF @barsten to comment.

TonioF commented 7 years ago

While it is correct that tiling is not explicitly mentioned in the dummy version (or in any other part of the project, for that matter), it is nevertheless present whenever SNAP operators are used for the pre-processing, as the graph processing framework partitions products into several tiles.

To answer this question, it is once more important to define the actual interface. Does a pre-processor receive a single product as input, or shall it fetch the data itself based on the configuration? What is the output of a pre-processor? What is the required input of the inference engine (and of the high-res pre-processing)? We need to answer these questions more precisely than in the current IODDs. The answers have a major impact on the tiling, too.
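Just to make the two alternatives concrete, here is a rough sketch in Python type hints. All class and method names are hypothetical and not part of any existing code; they only illustrate the interface question:

```python
from abc import ABC, abstractmethod
from typing import Sequence, Tuple

BBox = Tuple[float, float, float, float]  # (lon_min, lat_min, lon_max, lat_max)


class ProductPreProcessor(ABC):
    """Alternative A: the engine hands a single product to the pre-processor."""

    @abstractmethod
    def process(self, product_path: str) -> str:
        """Return the path of the pre-processed product."""


class SelfFetchingPreProcessor(ABC):
    """Alternative B: the pre-processor fetches its own input data from the configuration."""

    @abstractmethod
    def process(self, config: dict, roi: BBox,
                start_time: str, end_time: str) -> Sequence[str]:
        """Query/download the data described by the configuration and return the outputs."""
```

Which of the two we choose largely determines where tiling has to happen: in alternative A the engine already controls the spatial extent of each call, in alternative B the pre-processor would have to tile (or delegate tiling) itself.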