cooperative-computing-lab / cctools

The Cooperative Computing Tools (cctools) enable large scale distributed computations to harness hundreds to thousands of machines from clusters, clouds, and grids.
http://ccl.cse.nd.edu

Makeflow syntax extensions #211

Closed btovar closed 8 years ago

btovar commented 10 years ago

(Converting pull request #141 into an issue)

Original Peter Bui description:

During the XSEDE Makeflow and Work Queue tutorial, a user mentioned that having the ability to glob or use some more of GNU Make's syntax features would make Makeflow more attractive.

This has been a controversial topic for us, and in the past we have resisted going down that path. That said, I believe this is a reasonable request, so I went ahead and added three new syntax extensions:

wildcard substitutions: users can perform globbing by using the wildcard function:

MAKEFLOW_SOURCE = $(wildcard *.makeflow)

implicit variables: users can use implicit variables as placeholders for rule inputs and outputs:

output: input
	command $^ > $@
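
To make the intended expansion concrete, here is a minimal Python sketch of how the two implicit variables could be substituted. It assumes the GNU Make meanings ($@ is the rule's target, $^ is the list of all prerequisites). The function name expand_automatic_variables is hypothetical and is not part of Makeflow.

```python
def expand_automatic_variables(command, targets, sources):
    """Substitute GNU Make style automatic variables in a command string.

    Assumption: $@ stands for the rule's target(s) and $^ for the
    space-separated list of prerequisites, as in GNU Make.
    This is an illustrative helper, not Makeflow's implementation.
    """
    expanded = command.replace("$^", " ".join(sources))
    expanded = expanded.replace("$@", " ".join(targets))
    return expanded


# The rule from the example above: "output: input"
print(expand_automatic_variables("command $^ > $@", ["output"], ["input"]))
# prints: command input > output
```

Because the substitution depends only on the rule's own target and source lists, it can be resolved when the file is parsed and does not change the shape of the DAG.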

static pattern rules: users can utilize template patterns that expand to a number of nodes:

$(wildcard *.makeflow): %.md5sum: %.makeflow
	md5sum $^ > $@

For the most part, I tried to stay as close to GNU Make's syntax as possible. I've added three test cases (one for each of the new extensions).

Please let me know your thoughts on these extensions.

btovar commented 10 years ago

Note to self: Incorporate #312, fixing escaped newlines inside comments.

dthain commented 10 years ago

The problem here is not so much the syntax of how to do it, but rather the semantics.

Makeflow is inherently a static DAG processor. That is, it generates a DAG according to the Makeflow file, and that representation is used to deal with job submission, logging, recovery from a checkpoint, automated manipulation of the DAG, and so forth.

If we add capabilities to make the DAG structure change with the contents of the filesystem, this will have spectacular consequences relative to all these capabilities. So, it is hard to see how to have wildcards within Makeflow.

Now, I could see wildcards and such being implemented outside of Makeflow, such that the expansion of the wildcards is done once, yielding a static DAG which can be manipulated in all the ways above. In some sense, that is the purpose of Weaver.
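
As a rough sketch of that approach (not Weaver itself, and not anything shipped with cctools), a small Python script could expand the glob exactly once and emit explicit rules. The function name static_rules and the md5sum workload are assumptions borrowed from the example earlier in this thread.

```python
import glob
import os


def static_rules(pattern="*.makeflow"):
    """Expand a glob once and return explicit Makeflow rules as text.

    Each matching file gets one rule of the form
        <name>.md5sum: <name>.makeflow
        <TAB>md5sum <name>.makeflow > <name>.md5sum
    The result contains no wildcards, so the DAG is fixed at generation time.
    """
    rules = []
    for source in sorted(glob.glob(pattern)):
        target = os.path.splitext(source)[0] + ".md5sum"
        rules.append(f"{target}: {source}\n\tmd5sum {source} > {target}\n")
    return "\n".join(rules)


if __name__ == "__main__":
    # Print the expanded workflow; redirect stdout to a file to save it.
    print(static_rules(), end="")
```

The output could be redirected to a file and handed to makeflow like any hand-written workflow. Since the file list is frozen when the script runs, logging, recovery, and DAG manipulation all see the same static structure.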

dthain commented 8 years ago

With apologies to @pbui, I'm closing this issue. We do need some new syntax, but what is needed is a structural macro, as opposed to pattern matching with the filesystem.