First, I apologize for the omnibus patch, but the change sets are highly interdependent. If only a (non-empty) subset is desired, I'll be happy to create a new patch that includes only that subset.
The attached patch implements the following fixes and improvements:
1. ensures that the <foreach> task is executed only once per unique output file
2. output files requiring update are sorted by full path and then updated in sequence
3. new "parallelism" attribute on <foreach> enables per-rule limitation of concurrent subtasks
Here are the corresponding motivations:
1. If you have a rule with multiple input files that have the same basename but different extensions and an output file parameterized only by ${foreach:basename}, then the rule's task ends up getting executed once per input file, but each time overwriting the same output file.
2. This seems largely cosmetic, but it makes it much easier to figure out progress through a large set of inputs (see the sketch after this list for #1 and #2).
3. The major use case we have for per-rule concurrency limitation is to limit to one at a time, specifically when processing record updates to update some kind of summary database.
For our use case, #3 could alternatively be implemented as a boolean "serialize" attribute, but the "parallelism" attribute seemed a more general solution. Note that #3 is separate from and in addition to overall concurrency management via -j. Specifically, <foreach> tasks must first acquire the per-rule semaphore (optionally limited by the new "parallelism" attribute) and then acquire the global semaphore (optionally limited by -j). Deadlock is not possible because the semaphores are always acquired in that order.
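For illustration only, here is a minimal Python sketch of that acquisition order; the Rule class, GLOBAL_JOBS, and the limits shown are hypothetical stand-ins, not the patch's actual code.

    import threading

    GLOBAL_JOBS = threading.BoundedSemaphore(4)   # stand-in for the -j limit

    class Rule:
        def __init__(self, parallelism=None):
            # parallelism=None means no per-rule limit
            self.sem = threading.BoundedSemaphore(parallelism) if parallelism else None

        def run_subtask(self, task):
            if self.sem:
                self.sem.acquire()                # 1) per-rule "parallelism" limit
            try:
                with GLOBAL_JOBS:                 # 2) global -j limit
                    task()
            finally:
                if self.sem:
                    self.sem.release()

    # Every subtask takes the semaphores in the same fixed order
    # (per-rule first, then global), so a lock-ordering deadlock cannot occur.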
-peter
Original issue reported on code.google.com by petenewc...@gmail.com on 25 Aug 2011 at 11:24