Open potash opened 10 years ago
We officially recommend avoiding tags as outputs when you can. Per the spec:
While this could serve as a good transitionary vehicle from a linear workflow, using output files is a highly preferred way to establish step dependencies. Using tags makes the workflow less flexible and more error prone, hardcodes file locations into commands, and skips a variety of features Drake provides (base directory, automatic step selection, data backups and reverts, and others).
If I'm reading you right, your first step writes to a database and you don't technically need to track an output file for that so you're putting a tag there instead. In this case, one simple workaround would be to have that first step, as a final action, create an output file anyway. Perhaps put some logging info in there so it's not totally useless. Then you get the standard benefits of Drake's timestamp checking.
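As a rough illustration of that workaround, here is a minimal Python sketch of a step body that loads into a database and only then writes a small log file to serve as the step's output. Everything named here is hypothetical (`load_and_mark`, the `raw` table, the log wording), and SQLite stands in for whatever database you actually use.

```python
import sqlite3
from datetime import datetime, timezone
from pathlib import Path


def load_and_mark(db_path, rows, marker_path):
    """Load rows into the database, then write a log file as the final action.

    Hypothetical helper: the table name and log format are assumptions.
    """
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back if the load raises
            conn.execute("CREATE TABLE IF NOT EXISTS raw (id INTEGER, value TEXT)")
            conn.executemany("INSERT INTO raw (id, value) VALUES (?, ?)", rows)
    finally:
        conn.close()
    # Reached only if the load succeeded, so the file's mtime marks the last good run.
    stamp = datetime.now(timezone.utc).isoformat()
    Path(marker_path).write_text(f"loaded {len(rows)} rows at {stamp}\n")
    return marker_path
```

Writing the file last matters: if the load fails, no output file is created or refreshed, so a timestamp check would still treat the step as needing a run.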
I see. It occurred to me to use a dummy file to represent the last time the command was run, but that would be problematic in a collaborative environment where the database and raw data files are shared but the dummy file doesn't get synced.
Suppose I have the following Drakefile:
When I modify `input` and run `drake`, I get "Nothing to do." I would like Drake to see that there is a path in the tree with newer input than output. Is there a reason why it wouldn't do that? This is related to my other issue #151, because oftentimes I want to do several steps in a database (technically no-output) and then dump to a file.
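To make the request concrete, here is a small Python sketch of the check I have in mind. It is not Drake's actual algorithm, only an illustration under assumptions: tags are names starting with `%`, `producers` maps each output to the inputs of the step that makes it, `mtimes` holds file modification times, and `%loaded` is a made-up tag name.

```python
def newest_upstream(name, producers, mtimes):
    """Newest file mtime feeding `name`, looking through tags."""
    if not name.startswith("%"):
        return mtimes[name]
    # A tag has no timestamp of its own, so it inherits the newest
    # timestamp among the inputs of the step that produces it.
    return max(
        (newest_upstream(i, producers, mtimes) for i in producers.get(name, [])),
        default=float("-inf"),
    )


def is_stale(target, producers, mtimes):
    """True if any file upstream of `target` (through tags) is newer than it."""
    target_mtime = mtimes.get(target, float("-inf"))  # a missing output counts as oldest
    return any(
        newest_upstream(i, producers, mtimes) > target_mtime
        for i in producers[target]
    )


# input -> %loaded -> output, with input modified after output was written
producers = {"%loaded": ["input"], "output": ["%loaded"]}
print(is_stale("output", producers, {"input": 200, "output": 100}))  # True
print(is_stale("output", producers, {"input": 50, "output": 100}))   # False
```

The sketch only looks through tags; intermediate files are compared by their own timestamps, as in the usual step-by-step check.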