fsprojects / Amazon.SimpleWorkflow.Extensions

Extensions to AmazonSDK's SimpleWorkflow capabilities to make it more intuitive to use
http://fsprojects.github.io/Amazon.SimpleWorkflow.Extensions
MIT License

Add feature to scatter-gather with unique per activity input #39

Open gitfool opened 9 years ago

gitfool commented 9 years ago

Maybe I'm missing something, but I can't see how to pass different input per activity.

For example, say I want to get a list of URLs in the first activity (preprocess step), then in parallel for each unique URL process the URL with a second activity type (passing the unique URL as input to each activity), then after all these activities have run aggregate / postprocess the results.
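The pattern described above can be sketched outside SWF as follows. This is only an illustration in plain Python using a thread pool, not the Amazon.SimpleWorkflow.Extensions API; `get_urls`, `process_url`, and `aggregate` are hypothetical stand-ins for the three activity types, and the URLs are placeholder data.

```python
from concurrent.futures import ThreadPoolExecutor

def get_urls():
    # Hypothetical preprocess step; in practice this could be a database query
    # returning a variable number of URLs.
    return ["http://example.com/a", "http://example.com/b", "http://example.com/c"]

def process_url(url):
    # Hypothetical per-URL activity; each call receives its own unique input.
    return len(url)

def aggregate(results):
    # Hypothetical postprocess step that combines the per-URL results.
    return sum(results)

def scatter_gather():
    urls = get_urls()                      # step 1: preprocess
    with ThreadPoolExecutor() as pool:     # step 2: one task per URL, run in parallel
        results = list(pool.map(process_url, urls))
    return aggregate(results)              # step 3: runs only after all tasks finish
```

The point of the sketch is that the number of parallel tasks is decided at run time by the output of the first step, and each task gets a different input.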

theburningmonk commented 9 years ago

That's a good idea, and something I considered doing a long time ago. What stopped me was the thought that it would turn this into map-reduce on the SWF service, a job that is perhaps better suited to EMR.

Have you considered EMR for this work?

gitfool commented 9 years ago

EMR is overkill for my requirements, which are better matched by SWF. Also, I want to use C#, but the SWF C# API is severely lacking, hence the appeal of Amazon.SimpleWorkflow.Extensions.

The only problem now is Amazon.SimpleWorkflow.Extensions abstracts away too much, and I don't have time to learn F# to add features as I need them, so I may not be able to use it. ;(

theburningmonk commented 9 years ago

Is the number of elements in the list fixed? If so, you can work around it: have the previous activity (say A1) followed by an array of parallel activities (B1..BN), each picking one item from the list to process, and pass in a 'reducer' (R) that aggregates all the results into a single value. That value is then passed on to the next activity or child workflow (C1).

A1 ++> (B1..BN, R) ++> C1
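To make the shape of this workaround concrete, here is a rough sketch in plain Python rather than the library's F# DSL. All names (`a1`, `make_branch`, `reducer`, `c1`) are hypothetical, and the reducer's signature is an assumption; the library's actual reducer type may differ. The key idea is that every parallel branch receives the same input (the whole list from A1) and picks its own element by a fixed index.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

N = 3  # number of parallel branches, fixed when the workflow is defined

def a1():
    # Hypothetical first activity; must return exactly N items.
    return ["x", "yy", "zzz"]

def make_branch(i):
    # Builds B_i: it receives the whole list and processes only item i.
    def branch(items):
        return len(items[i])
    return branch

def reducer(acc, value):
    # Hypothetical reducer R: folds the branch results into a single value.
    return acc + value

def c1(total):
    # Hypothetical next activity, which receives the reduced value.
    return f"total={total}"

def run():
    items = a1()
    branches = [make_branch(i) for i in range(N)]
    with ThreadPoolExecutor(max_workers=N) as pool:
        # Every branch is submitted with the same input list.
        futures = [pool.submit(b, items) for b in branches]
        results = [f.result() for f in futures]
    return c1(reduce(reducer, results, 0))
```

Because `N` is baked into the definition, this only works when the list length is known in advance.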

You can find an example for parallel activities with aggregator here

gitfool commented 9 years ago

No, the number of elements is coming from a database query and is variable.