MicrosoftDocs / azure-docs

Open source documentation of Microsoft Azure
https://docs.microsoft.com/azure
Creative Commons Attribution 4.0 International

batchcount is missing from json example and parallel section is slightly misleading / inaccurate #74644

Closed jasonhorner closed 3 years ago

jasonhorner commented 3 years ago

In the following section: https://docs.microsoft.com/en-us/azure/data-factory/control-flow-for-each-activity#parallel-execution

If isSequential is set to false, the activity iterates in parallel with a maximum of 20 concurrent iterations. This setting should be used with caution. If the concurrent iterations are writing to the same folder but to different files, this approach is fine. If the concurrent iterations are writing concurrently to the exact same file, this approach most likely causes an error.

The text should be amended to say that 20 is the default number of concurrent iterations, not the maximum. This value is controlled by the batchCount property, which has a maximum value of 50.

The JSON at the top should also be amended as follows to include the batchCount property:

{
   "name":"MyForEachActivityName",
   "type":"ForEach",
   "typeProperties":{
      "isSequential":"true",
      "batchCount": 20,
      "items": {
         "value": "@pipeline().parameters.mySinkDatasetFolderPathCollection",
         "type": "Expression"
      },
      "activities":[
         {
            "name":"MyCopyActivity",
            "type":"Copy",
            "typeProperties":{
               ...
            },
            "inputs":[
               {
                  "referenceName":"MyDataset",
                  "type":"DatasetReference",
                  "parameters":{
                     "MyFolderPath":"@pipeline().parameters.mySourceDatasetFolderPath"
                  }
               }
            ],
            "outputs":[
               {
                  "referenceName":"MyDataset",
                  "type":"DatasetReference",
                  "parameters":{
                     "MyFolderPath":"@item()"
                  }
               }
            ]
         }
      ]
   }
}
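
Since batchCount only applies when isSequential is false, a parallel variant of the same activity might look like the sketch below. It is only an illustration: the inner Wait activity and its name are placeholders I made up, and the value 50 is just the documented upper bound rather than a recommendation.

```json
{
   "name": "MyForEachActivityName",
   "type": "ForEach",
   "typeProperties": {
      "isSequential": false,
      "batchCount": 50,
      "items": {
         "value": "@pipeline().parameters.mySinkDatasetFolderPathCollection",
         "type": "Expression"
      },
      "activities": [
         {
            "name": "MyPlaceholderWaitActivity",
            "type": "Wait",
            "typeProperties": {
               "waitTimeInSeconds": 1
            }
         }
      ]
   }
}
```

If batchCount is left out here, the loop should fall back to the default of 20 concurrent iterations.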

Finally, a note should be added that if you run an Execute Pipeline activity within the ForEach loop without setting the waitOnCompletion property to true, the loop does not wait for each child pipeline to finish. The behavior effectively becomes parallel, as if isSequential were set to false.
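
As a rough sketch of what that inner activity could look like with the property set, assuming a child pipeline called MyChildPipeline (both names here are hypothetical and not taken from the docs page):

```json
{
   "name": "MyExecutePipelineActivity",
   "type": "ExecutePipeline",
   "typeProperties": {
      "pipeline": {
         "referenceName": "MyChildPipeline",
         "type": "PipelineReference"
      },
      "waitOnCompletion": true
   }
}
```

With waitOnCompletion set to true, each iteration should block until the child pipeline run finishes, so a sequential ForEach stays sequential in practice.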

It is also important to call out that a failing activity within the ForEach container will not terminate the loop; the remaining iterations still run. This behavior is unexpected, especially for people coming from SSIS.

More info here: https://andyleonard.blog/2020/06/one-way-to-break-out-of-an-azure-data-factory-foreach-activity/

There is also a related feedback item here: https://feedback.azure.com/forums/270578-data-factory/suggestions/39673909-foreach-activity-allow-break



KrishnaG-MSFT commented 3 years ago

@jasonhorner Thanks for your comment! We will review and provide an update as appropriate.

KranthiPakala-MSFT commented 3 years ago

@jasonhorner Thanks for the feedback! We have assigned the issue to the content author to further review this and update the document as appropriate.

jonburchel commented 3 years ago

Thanks for reporting this issue and helping us improve the docs, @jasonhorner. I confirmed the doc has been updated to reflect this now!

please-close