arnaud-morvan commented 7 years ago

QGIS Enhancement: Processing modeler iterators

Date 2017/10/25

Author Arnaud Morvan (@arnaud-morvan)

Contact arnaud dot morvan at camptocamp dot com

Maintainer @arnaud-morvan

Version QGIS 3.2

Summary

This proposal adds a new "Iterators" feature to Processing, to help automate batch processing.

Different iterators may be proposed, for example:

iterate over layers in a group
iterate over features in a layer
iterate over files in a folder
iterate in a for/while loop

Currently there is an "Iterate over this layer" button next to the source selection in the algorithm execution dialog. When activated, the algorithm is executed once for each feature in the source. This is implemented on the UI side.

Here, the idea is to make iterators usable in processing models. Each iterator should have input parameters (source folder, group of layers) and output values (current file / layer). An iterator's output values may be used as expression variables in the input/output parameters of child algorithms.

Note that this could also replace the batch dialog.

Proposed Solution

Create a new abstract QgsProcessingModelIterator class extending QgsProcessingAlgorithm:

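class CORE_EXPORT QgsProcessingModelIterator : public QgsProcessingAlgorithm
{
  public:

    // Iterators are hidden from the toolbox; FlagIterator is a new flag proposed here.
    Flags flags() const override
    {
      return FlagHideFromToolbox | FlagIterator;
    }

    /**
     * Move forward to the next iteration.
     */
    virtual void next() = 0;
};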

Native iterators should be implemented in core in C++, but it should be possible to create custom iterators using Python.

We will have to adapt the QgsProcessingModelAlgorithm class to support iterators.

With iterators, some child algorithms produce multiple results. So we may add a key for the iterator to the model's results map, holding a list of QVariantMap, one per iteration, each containing the results of the iterator-dependent child algorithms for that iteration.
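A minimal sketch of that result layout, using standard C++ containers as stand-ins for QVariantMap and QVariantList. The type aliases and the appendIterationResult helper are hypothetical names chosen for illustration, not QGIS API.

```cpp
#include <map>
#include <string>
#include <vector>

// Stand-in for QVariantMap (assumption: plain strings as values).
using ResultMap = std::map<std::string, std::string>;
// One ResultMap per iteration.
using IterationResults = std::vector<ResultMap>;
// Hypothetical model results: iterator child id -> per-iteration results.
using ModelResults = std::map<std::string, IterationResults>;

// Append the outputs of the iterator-dependent children for one iteration.
void appendIterationResult( ModelResults &results,
                            const std::string &iteratorId,
                            const ResultMap &iterationOutputs )
{
  results[iteratorId].push_back( iterationOutputs );
}
```

Keeping one map per iteration preserves the order of the iterations and lets each iteration expose the same output keys without overwriting the previous one.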

Example 1: Buffers with multiple distances

Suppose we want to run buffer algorithm many times on the same source vector layer with different distance values (5, 10, 15, 20).

[Image: processing_iterator]
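As a rough illustration of this example, here is a sketch of a for-loop iterator that yields the distances 5, 10, 15 and 20. It is plain C++ with no QGIS dependency; ForLoopIteratorSketch and bufferDistances are hypothetical names, and next() is assumed to return false once the range is exhausted.

```cpp
#include <vector>

// Hypothetical for-loop iterator: yields start, start + step, ... up to stop (inclusive).
class ForLoopIteratorSketch
{
  public:
    ForLoopIteratorSketch( double start, double stop, double step )
      : mStart( start ), mStop( stop ), mStep( step ), mIndex( -1 ) {}

    // Move forward; returns false once the range is exhausted.
    bool next()
    {
      if ( mStep <= 0 )
        return false; // guard against an endless loop
      if ( mStart + ( mIndex + 1 ) * mStep > mStop )
        return false;
      ++mIndex;
      return true;
    }

    // Output value of the iterator for the current iteration.
    double current() const { return mStart + mIndex * mStep; }

  private:
    double mStart;
    double mStop;
    double mStep;
    int mIndex;
};

// Collect the distance that each buffer run would receive.
std::vector<double> bufferDistances( double start, double stop, double step )
{
  std::vector<double> distances;
  ForLoopIteratorSketch it( start, stop, step );
  while ( it.next() )
    distances.push_back( it.current() ); // a real model would run the buffer child here
  return distances;
}
```

Calling bufferDistances( 5, 20, 5 ) produces one value per buffer run. In a model, each value would be exposed as an expression variable to the buffer child algorithm.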

Example 2: Apply the same transformation to all files in a folder, keeping the file layout

Suppose we want to apply the same transformations on all layers in a folder.

.../source/layer1.shp => add field => .../destination/layer1.shp
.../source/layer2.shp => add field => .../destination/layer2.shp

In this case the output path needs to be computed from the destination folder (a model input value) and the source layer name (a folder_iterator output value accessible as a variable), for example: @destination_folder/@folder_iterator_current_filename.

So we need to allow expressions in output destination parameter values. Note that this would also help to store default output values in the model.
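To make the idea concrete, here is a minimal sketch of expanding @variable references in an output path. It does plain text substitution only and is not the QGIS expression engine. The expandVariables function and the variable values used with it are assumptions for illustration.

```cpp
#include <cctype>
#include <map>
#include <string>

// Replace each @name token with its value from `variables`.
// Names are made of letters, digits and underscores.
// Unknown variables are left untouched.
std::string expandVariables( const std::string &pattern,
                             const std::map<std::string, std::string> &variables )
{
  std::string result;
  std::size_t i = 0;
  while ( i < pattern.size() )
  {
    if ( pattern[i] != '@' )
    {
      result += pattern[i++];
      continue;
    }
    // Read the variable name following '@'.
    std::size_t j = i + 1;
    while ( j < pattern.size()
            && ( std::isalnum( static_cast<unsigned char>( pattern[j] ) ) || pattern[j] == '_' ) )
      ++j;
    const std::string name = pattern.substr( i + 1, j - i - 1 );
    const auto it = variables.find( name );
    result += ( it != variables.end() ) ? it->second : pattern.substr( i, j - i );
    i = j;
  }
  return result;
}
```

With destination_folder set to a hypothetical "/data/destination" and folder_iterator_current_filename set to "layer1.shp", the pattern above expands to "/data/destination/layer1.shp". In an actual QGIS expression the same path would more likely be built with string concatenation, for example @destination_folder || '/' || @folder_iterator_current_filename.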

Example 3: Create PostGIS layers with all files in a folder depending on their name.

Suppose we want to create PostGIS layers with files in the following layout:

.../source/folder1/layer1.shp
.../source/folder1/layer2.shp
.../source/folder2/layer1.shp
.../source/folder2/layer2.shp

And we want features to be added to a PostGIS layer depending on the source file name.

Such a use case could be handled by inserting features into an existing layer, so this is in the scope of QEP 89: Feature writing in existing layers (https://github.com/qgis/QGIS-Enhancement-Proposals/issues/89).
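The routing step of this example could be sketched as follows: source files are grouped by base name, so that files with the same name in different folders feed the same target layer. This is plain C++ with no PostGIS or QGIS calls. fileStem and groupByTargetLayer are hypothetical helpers, and the sketch assumes the target layer name equals the file name without its extension.

```cpp
#include <map>
#include <string>
#include <vector>

// File name without directory and extension, e.g. "a/b/layer1.shp" -> "layer1".
std::string fileStem( const std::string &path )
{
  const std::size_t slash = path.find_last_of( "/\\" );
  const std::string name = ( slash == std::string::npos ) ? path : path.substr( slash + 1 );
  const std::size_t dot = name.rfind( '.' );
  return ( dot == std::string::npos || dot == 0 ) ? name : name.substr( 0, dot );
}

// Group source files by target layer name (assumed to be the file stem).
std::map<std::string, std::vector<std::string>>
groupByTargetLayer( const std::vector<std::string> &sourceFiles )
{
  std::map<std::string, std::vector<std::string>> groups;
  for ( const std::string &file : sourceFiles )
    groups[fileStem( file )].push_back( file );
  return groups;
}
```

For the layout above, this gives two groups, "layer1" and "layer2", each holding the files from folder1 and folder2. Appending each group to its target layer is the part covered by QEP 89.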

Output / Returns considerations

A processing model containing such an iterator should return something like:

{
  "native:forloop_1": [
    {"native:buffer_1:buffered": "output..."},
    {"native:buffer_1:buffered": "output..."},
    {"native:buffer_1:buffered": "output..."},
    {"native:buffer_1:buffered": "output..."}
  ]
}

Note that QgsProcessingContext provides the layersToLoadOnCompletion and addLayerToLoadOnCompletion methods, which should be used here.

Affected Files

QgsProcessingModelAlgorithm.processAlgorithm may be deeply affected.

Performance Implications

Nothing precise here for now.

Further Considerations/Improvements

It may be complex to support multiple iterators in the same processing model.

Backwards Compatibility

(required)

Issue Tracking ID(s)

(optional)

Votes

(required)

nyalldawson commented 7 years ago

The idea sounds good to me - obviously there aren't many technical details here, but that could potentially wait for the PR.

None of the processing model code is exposed as stable API, so we can safely mash it up in any form we want after 3.0.

arnaud-morvan commented 6 years ago

I've done a prototype for processing iterators: https://github.com/arnaud-morvan/QGIS/tree/processing_iterators.

This does not alter existing processing API for the moment.

As I need the parameter values to initialize the iterator, I do it in the iterator's prepareAlgorithm.

I run the iterator's processAlgorithm function to get result values for the expression context, as many times as QgsProcessingModelIterator::next() returns true.
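The control flow described here can be sketched roughly as below. This is not the prototype's code: it is a plain C++ model of the prepare / next / process sequence. IteratorSketch, ListIteratorSketch, runIterator, the "values" parameter and the "current_value" variable are all hypothetical.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

using VariableMap = std::map<std::string, std::string>;

// Hypothetical stand-in for the iterator interface described above.
class IteratorSketch
{
  public:
    virtual ~IteratorSketch() = default;
    virtual void prepare( const VariableMap &parameters ) = 0; // read parameter values
    virtual bool next() = 0;                                   // false when exhausted
    virtual VariableMap process() const = 0;                   // values for the expression context
};

// Example iterator over a comma-separated list passed as the "values" parameter.
class ListIteratorSketch : public IteratorSketch
{
  public:
    void prepare( const VariableMap &parameters ) override
    {
      mItems.clear();
      mPosition = 0;
      const auto it = parameters.find( "values" );
      if ( it == parameters.end() )
        return;
      std::istringstream stream( it->second );
      std::string item;
      while ( std::getline( stream, item, ',' ) )
        mItems.push_back( item );
    }

    bool next() override
    {
      if ( mPosition >= mItems.size() )
        return false;
      ++mPosition;
      return true;
    }

    VariableMap process() const override
    {
      VariableMap variables;
      if ( mPosition > 0 ) // empty until next() has returned true once
        variables["current_value"] = mItems[mPosition - 1];
      return variables;
    }

  private:
    std::vector<std::string> mItems;
    std::size_t mPosition = 0; // number of items consumed so far
};

// Initialize once, then collect one set of variables per iteration.
std::vector<VariableMap> runIterator( IteratorSketch &iterator, const VariableMap &parameters )
{
  std::vector<VariableMap> contexts;
  iterator.prepare( parameters );
  while ( iterator.next() )
    contexts.push_back( iterator.process() ); // dependent child algorithms would run here
  return contexts;
}
```

Each collected map corresponds to the variables that would be added to the expression context before running the iterator-dependent child algorithms for that iteration.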

I've deeply altered the QgsProcessingModelAlgorithm::processAlgorithm orchestration, using a new recursive function named processChildAlgorithm.

I cannot get the results returned correctly to Python because of:

TypeError: unable to convert a C++ 'QVariantMap' instance to a Python object

But I do not know SIP very well.

We may also need to use the iterator's result values in the output filename definition (using expressions?).

roya0045 commented 5 years ago

I'm interested in resurrecting this to integrate it in the modeler and the standalone processor of Mr. Dawson if possible.

roya0045 commented 5 years ago

I'm not too familiar with the modeler code or the processing plugin in general, but would there be a way to control the loops in the flow of the code?

Potentially the standalone will have a simple accumulator (a list) that could gather the results of the loops and then merge them if needed, but how would one do that in the model builder, and how would the flow be controlled to stop looping at a given point instead of looping everything downstream of the looper?

@arnaud-morvan Were the flags implemented to control this behaviour or was it for something else?

arnaud-morvan commented 5 years ago

Note that I've done a prototype here: https://github.com/qgis/QGIS/compare/master...arnaud-morvan:processing_iterators The main work is in QgsProcessingModelAlgorithm::processChildAlgorithm. For me it does not really make sense to use an iterator standalone. This is why I put flag FlagHideFromToolbox on iterator algorithms.

roya0045 commented 5 years ago

@arnaud-morvan Yes, I am aware of the prototype. I have made a quick draft of the cpp code based on your implementation.

I agree that the standalone itself is not of much use but possibly in the standalone processor it might be of use depending on the implementation. Otherwise in the modeler it is still needed.

hansmei commented 4 years ago

This would be a neat feature! I'm looking forward to getting my hands on this when it gets ready.

arnaud-morvan commented 4 years ago

As far as I know funding has been cancelled on this feature for the moment.

Saijin-Naib commented 3 years ago

@arnaud-morvan ,

Are you still interested in this? I think this has incredible value and utility, and I am quite interested in seeing this come to the QGIS ecosystem.

roya0045 commented 3 years ago

@Saijin-Naib This is not a light feature to implement. This would require a good amount of rework of processing models. The simplest alternatives are scripts and models that can be batched well.

Saijin-Naib commented 3 years ago

@roya0045 ,

That sounds fair, but I think not having this is a pretty big barrier to entry for people who might want to take advantage of the Model Builder.

Telling people who are just getting their start with graphical programming to just script something is a non-starter in my opinion. They're not the target for writing custom scripts, not yet.

This is their gateway to that eventuality (maybe!)

roya0045 commented 3 years ago

I agree, but I'm saying that this is the sort of thing that should require funding to be properly implemented, and without that any development will take time.

Saijin-Naib commented 3 years ago

Perfectly reasonable. Do you have a ballpark for how much or how big the fund needs to be? I'm curious if we couldn't garner enough support to do a community-led funding drive.

roya0045 commented 3 years ago

@Saijin-Naib You'd have to contact Nyall for such a thing. ;)

Saijin-Naib commented 3 years ago

Hehe, that's exactly who I had in mind... Good to know I was on the right path.

I will do so and update here to see if we can get some movement on this, or if perhaps we need to wait for other library or QGIS updates before this is sensible to tackle.

roya0045 commented 3 years ago

It's possibly already on his list, to be honest.

qgis / QGIS-Enhancement-Proposals

Processing modeler iterators #108