Open raronson opened 11 years ago
Russel, would that be a duplicate of https://github.com/Factual/drake/issues/41 or https://github.com/Factual/drake/issues/50?
Ping.
Sorry for the delayed response.
I'm not sure I understand the difference between #41 and #50. Based on the proposal you wrote in #41, aren't wildcard inputs supported?
I would like to specify a glob pattern, which is similar to (as far as I can tell, a subset of) regex. I don't want the pattern to be expanded in drake, other than for a file recency check, as this could expand to thousands of files. It seems that #41 would do this expansion, so I think this is different.
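As a rough illustration of a recency check that never builds the full list of matches, here is a Python sketch. It is not Drake code and does not reflect how Drake checks timestamps; the function names `newest_mtime` and `step_is_stale` are hypothetical, and the staleness rule (any matching input newer than the output) is an assumption.

```python
import glob
import os


def newest_mtime(pattern):
    """Return the latest modification time among files matching pattern, or None if nothing matches."""
    newest = None
    # iglob yields matches one at a time, so thousands of files are never held in a list.
    for path in glob.iglob(pattern):
        mtime = os.path.getmtime(path)
        if newest is None or mtime > newest:
            newest = mtime
    return newest


def step_is_stale(input_pattern, output_path):
    """Assumed rule: rerun the step if the output is missing or any matching input is newer."""
    if not os.path.exists(output_path):
        return True
    newest_input = newest_mtime(input_pattern)
    if newest_input is None:
        # No inputs match the pattern, so there is nothing newer than the output.
        return False
    return newest_input > os.path.getmtime(output_path)
```

The pattern itself would still be handed to the shell command unexpanded; only the timestamp comparison walks the matches.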
I think it's a duplicate of #50 then. Since this issue has more info, I'll close #50.
I kinda think all of these pattern/glob feature requests are duplicates of #41, which is immensely difficult. I'd love to close this one, but will leave it up to @aboytsov.
Since #41 is potentially difficult to do, perhaps being able to use a list variable would be helpful?
listvar=['foo', 'bar', 'other']
output <- $listvar
  tail -n +2 ${listvar}.ext1 > ${listvar}.ext2 ; Copy everything but the first line
Here, drake would be smart enough to know that listvar is a list and run the tail -n +2 command on each item in the list.
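To make the intended behavior concrete, here is a minimal Python sketch of expanding one command once per list item. This is only an illustration of the proposal, not existing Drake behavior; the helper name `expand_step` is hypothetical, and it assumes the `${name}` placeholder form used in the example above.

```python
from string import Template


def expand_step(command, var_name, items):
    """Return one concrete command per item by substituting ${var_name} in the command text."""
    template = Template(command)
    return [template.substitute({var_name: item}) for item in items]


commands = expand_step(
    "tail -n +2 ${listvar}.ext1 > ${listvar}.ext2",
    "listvar",
    ["foo", "bar", "other"],
)
# commands[0] == "tail -n +2 foo.ext1 > foo.ext2"
```

Each expanded command could then be treated as its own step, with its own input and output for the timestamp check.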
Pursuant to the desire for globbing and/or regex for inputs/outputs, maybe list variables could be populated with a drake step?
listvar <-
listvar = `ls *.tgz`
I'm not convinced I have proposed a good syntax in the above examples, but hopefully it is enough to give others ideas on ways to make globbing or regex easier to implement?
An example: suppose you had a directory structure like
logs/year/month/part-files
and wanted to process only January through March from every year. The pattern would be logs/*/0[1-3]
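The matching behavior of that pattern can be checked with a short Python sketch. The years and paths below are made up for illustration, the function name `is_first_quarter` is hypothetical, and `PurePosixPath.match` is used here only as a stand-in for whatever glob implementation drake or the shell would apply.

```python
from pathlib import PurePosixPath

PATTERN = "logs/*/0[1-3]"


def is_first_quarter(path):
    """True if path is a month directory 01-03 under any year, per the glob pattern."""
    # match() compares component by component, so * does not cross a "/" boundary.
    return PurePosixPath(path).match(PATTERN)


# Hypothetical month directories.
candidates = [
    "logs/2012/01",
    "logs/2012/03",
    "logs/2012/04",
    "logs/2013/02",
    "logs/2013/12",
]
selected = [p for p in candidates if is_first_quarter(p)]
# selected == ["logs/2012/01", "logs/2012/03", "logs/2013/02"]
```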