vatlab / sos

SoS workflow system for daily data analysis
http://vatlab.github.io/sos-docs
BSD 3-Clause "New" or "Revised" License

Allow dynamic target to resolve to no file? #1329

Open BoPeng opened 4 years ago

BoPeng commented 4 years ago

Right now we allow for

dynamic('a.txt')
dynamic(['a.txt', 'b.txt'])
dynamic('*.txt')

and we resolve them to

'a.txt'
'a.txt', 'b.txt'
whatever files match the pattern '*.txt'

and the returned target might not be valid (e.g. a.txt does not exist).

For the handling of -e ignore (or some other scenario), perhaps it makes sense to always resolve it as

[x for x in glob.glob(name) if os.path.exists(x)]

that is to say

  1. In the case of *.txt, we allow the pattern to match nothing.
  2. In the case of a.txt, we also "match" it as a pattern, so it resolves to [] if a.txt does not exist.
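
The proposed resolution can be sketched in plain Python as below. This is only an illustration of the idea, not SoS code: resolve_dynamic is a hypothetical helper name, and the files are created in a temporary directory purely for the demonstration.

```python
import glob
import os
import tempfile


def resolve_dynamic(names):
    """Hypothetical helper (not part of SoS): expand each name with glob
    and keep only the paths that exist at the time of the call."""
    if isinstance(names, str):
        names = [names]
    resolved = []
    for name in names:
        # glob.glob returns [] for a pattern that matches nothing, and also
        # for a literal name such as 'a.txt' when that file does not exist.
        # The os.path.exists filter additionally drops broken symlinks.
        resolved.extend(sorted(x for x in glob.glob(name) if os.path.exists(x)))
    return resolved


with tempfile.TemporaryDirectory() as tmp:
    existing = os.path.join(tmp, "a.txt")
    open(existing, "w").close()          # a.txt exists
    missing = os.path.join(tmp, "b.txt")  # b.txt is never created

    found_literal = resolve_dynamic(existing)
    found_missing = resolve_dynamic(missing)
    found_pattern = resolve_dynamic(os.path.join(tmp, "*.txt"))
    found_no_match = resolve_dynamic(os.path.join(tmp, "*.bak"))
```

Under this rule a literal name and a pattern are treated the same way: both resolve to an empty list when nothing is on disk.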
BoPeng commented 4 years ago
[10]
output: dynamic('A.bak')
_output.touch()

[20]
input: 'A.bak'
depends: sos_step(10)
output: 'B.txt'
_output.touch()

At least in this case, the behavior is incorrect for dynamic used for output:.

Edit: this is not a valid case because _output cannot be used with dynamic().

BoPeng commented 4 years ago
[10]
output: dynamic('*.bak')
path('A.bak').touch()

[20]
input: 'A.bak'
depends: sos_step(10)
output: 'B.txt'
_output.touch()

This works because of depends: sos_step(10).

import time

[10]
output: dynamic('*.bak')
time.sleep(5)
path('A.bak').touch()

[20]
input: 'A.bak'
output: 'B.txt'
_output.touch()

This does not work because sos will try to execute step 20 while step 10 is still running.

Using dynamic('A.bak') as the input of step 20 solves the problem, but I am not sure if this is the best approach.

import time

[10]
output: dynamic('*.bak')
time.sleep(5)
path('A.bak').touch()

[20]
input: dynamic('A.bak')
output: 'B.txt'
_output.touch()
BoPeng commented 4 years ago

input: dynamic(mnm_high_het_input_files) , group_by = 3,

Just quoting a case which suggests that it may not be a good idea to remove targets that do not exist from dynamic, because doing so will mess up the grouping of inputs.
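
A small sketch of why the grouping breaks, in plain Python. The group_by function here is a simplified stand-in for the group_by = 3 option (consecutive chunks of three), and the file names are made up for illustration; they are not the contents of mnm_high_het_input_files.

```python
def group_by(items, size):
    """Simplified stand-in for group_by: consecutive chunks of `size`.
    The last chunk may be shorter."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# Hypothetical inputs: three files per sample, in sample order.
expected = ["s1.a", "s1.b", "s1.c", "s2.a", "s2.b", "s2.c"]
missing = {"s1.b"}  # pretend this file does not exist yet

# Keeping every target preserves one group per sample.
groups_all = group_by(expected, 3)

# Dropping the missing target shifts every later file by one position,
# so files from different samples end up in the same group.
groups_filtered = group_by([f for f in expected if f not in missing], 3)
```

With all targets kept, the groups line up with the samples. After filtering, the first group mixes s1 and s2 files and the second group is one file short.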