DamienIrving closed 9 years ago
Sure - `vt_ensemble_agg.py` should take in one `DataSet` for input, and any `Constraint` names you want to aggregate over should be overwritten in the `ProcessUnit` - so add a `Constraint('institute', 'ensemble')` and `Constraint('model', 'ensemble')` (or perhaps do an `in_dataset.get_constraint('model')` and mash the values together, but I think that might be too long for a filename).
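To make the effect of overwriting constraints concrete, here is a minimal, self-contained sketch. It does not use the real `ProcessUnit` or `Constraint` classes; `group_inputs`, the plain dictionaries, and the metadata values are all hypothetical stand-ins, and the real matching logic may well differ. The idea it illustrates is only that once `institute` and `model` are replaced by a fixed value, every input file that agrees on the remaining constraints maps to the same output.

```python
from collections import defaultdict

# Hypothetical metadata for six input files (illustrative values only).
MODELS = [("CSIRO-BOM", "ACCESS1-0"), ("MOHC", "HadGEM2-ES"), ("NCAR", "CCSM4")]
FILES = [
    {"institute": inst, "model": model, "experiment": exp}
    for exp in ("rcp45", "rcp85")
    for inst, model in MODELS
]


def group_inputs(files, overwritten):
    """Group input files by the output they map to.

    `overwritten` holds the constraint names being aggregated over and the
    fixed value that replaces them on the output side. This is a stand-in
    for the library's behaviour, not its actual implementation.
    """
    groups = defaultdict(list)
    for meta in files:
        # The output keeps every constraint except the overwritten ones.
        output_meta = {**meta, **overwritten}
        groups[tuple(sorted(output_meta.items()))].append(meta)
    return dict(groups)


groups = group_inputs(FILES, {"institute": "ensemble", "model": "ensemble"})
for output_key, inputs in groups.items():
    print(dict(output_key), "<-", [m["model"] for m in inputs])
```

In this sketch the two experiments stay separate because `experiment` is not overwritten, so each output receives all three model files at once.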
`vt_timcor.py` and `vt_fldcor.py` should take two `DataSet`s as input, one for each input file. Use the new `merge_output` keyword in the `ProcessUnit` to merge the `institute` and `model` constraints together. However, until we can get the `test_model_correllation_3` test to pass, this won't work properly - the case where you have two `Constraint`s, each with multiple values, and the desired behavior is to create every possible combination of values is not working.
It's a pretty hard one to solve too, as every combination of `model` values is linked with a combination of `institute` values. This is one situation where not using the same pattern for input and output makes it a lot simpler.
I have a couple of ideas of how to do it (involving combining the merged constraints for the output, then splitting them apart again to find the correct matching files for the input).
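As a rough illustration of the combine-then-split idea, the sketch below pairs up inputs from two datasets, merges their `institute` and `model` values for the output, and then splits a merged output back into the two inputs it came from. Everything here is an assumption made for illustration: the function names, the `+` separator (which must not occur in any institute or model name), and the example values are not taken from the actual code. The point is that iterating over linked (institute, model) pairs, rather than over institutes and models independently, keeps each model attached to its own institute.

```python
from itertools import product

SEP = "+"  # hypothetical separator; must not appear in any name


def merged_outputs(first_inputs, second_inputs):
    """Build one merged output description per pair of inputs.

    Each input is an (institute, model) tuple, so the two values always
    travel together.
    """
    outputs = []
    for (inst_a, model_a), (inst_b, model_b) in product(first_inputs, second_inputs):
        outputs.append({
            "institute": inst_a + SEP + inst_b,
            "model": model_a + SEP + model_b,
        })
    return outputs


def split_output(merged):
    """Recover the two (institute, model) inputs behind a merged output."""
    inst_a, inst_b = merged["institute"].split(SEP)
    model_a, model_b = merged["model"].split(SEP)
    return (inst_a, model_a), (inst_b, model_b)


# Illustrative values only.
first = [("CSIRO-BOM", "ACCESS1-0"), ("MOHC", "HadGEM2-ES")]
second = [("NCAR", "CCSM4"), ("MOHC", "HadGEM2-ES")]

outputs = merged_outputs(first, second)
for out in outputs:
    print(out["model"], "<-", split_output(out))
```

A merged name built this way grows with every value joined into it, which is the same filename-length concern raised earlier in the thread.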
@captainceramic Cool, thanks. One more question: How do I make `vt_ensemble_agg.py` pass lots of files to the command line entry generator at once? (i.e. `cdo_ensemble_agg.sh` accepts infile1, infile2, ... infileN)
Overwriting the `model` and `institute` constraints should result in all of the files being passed in at once - if they aren't, you have found a bug! What output do you get at the moment?
Ok, so it is passing all the files at once.
The bug appears to be when I run a workflow where `in_dataset` has more than one experiment (e.g. `rcp45` and `rcp85`). VisTrails basically just freezes at the `Ensemble Aggregation` step and eventually stops the incomplete process with no error message or anything. It works fine if there is only one experiment, but fails as soon as there is more than one.
I'll merge this PR now (as I don't think the problem is in my wrapper) and log an issue.
@captainceramic Here are some simple modules that use multiple input files. Before merging this PR I need your advice on:
- Should `vt_timcor.py` and `vt_fldcor.py` take one `in_dataset` or two?
- `vt_ensemble_agg.py` should take just one `in_dataset`, but I wasn't sure what to do with the `%model%` and `%institution%` constraints (you'll see there's a `FIXME` in the code)?