CWSL / cwsl-mas

VisTrails plugin for Climate Model Analysis
Apache License 2.0

Added fldcor, timcor and ensemble aggregation wrappers #35

Closed: DamienIrving closed this 9 years ago

DamienIrving commented 9 years ago

@captainceramic Here are some simple modules that use multiple input files. Before merging this PR I need your advice on:

captainceramic commented 9 years ago

Sure - vt_ensemble_agg.py should take one DataSet as input, and any Constraint names you want to aggregate over should be overwritten in the ProcessUnit. So add a Constraint('institute', 'ensemble') and a Constraint('model', 'ensemble'). (Alternatively, call in_dataset.get_constraint('model') and mash the values together, but I think that might be too long for a filename.)
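To illustrate why overwriting the constraints collapses the inputs, here is a minimal sketch in plain Python. It does not use the cwsl API: the file metadata, the `overrides` dictionary and the `output_key` and `group_inputs` helpers are all hypothetical stand-ins. The idea is that inputs whose constraints become identical after the overwrite end up feeding the same output.

```python
from collections import defaultdict

# Hypothetical input files, each described by its constraint values.
in_files = [
    {"institute": "CSIRO-BOM", "model": "ACCESS1-0", "experiment": "rcp45", "path": "a.nc"},
    {"institute": "MOHC", "model": "HadGEM2-ES", "experiment": "rcp45", "path": "b.nc"},
    {"institute": "NCAR", "model": "CCSM4", "experiment": "rcp45", "path": "c.nc"},
]

# Constraints to aggregate over: their values are overwritten for the output.
overrides = {"institute": "ensemble", "model": "ensemble"}


def output_key(attrs, overrides):
    """Return the output constraints for one input file as a hashable key."""
    merged = {name: value for name, value in attrs.items() if name != "path"}
    merged.update(overrides)
    return tuple(sorted(merged.items()))


def group_inputs(files, overrides):
    """Group input paths by the output they contribute to."""
    groups = defaultdict(list)
    for attrs in files:
        groups[output_key(attrs, overrides)].append(attrs["path"])
    return dict(groups)


groups = group_inputs(in_files, overrides)
for key, paths in groups.items():
    print(dict(key), paths)
```

In this sketch the three files share one output key once institute and model are overwritten, so they would all be handed to a single command together. Any constraint that is not overwritten (here, experiment) would still split the inputs into separate groups.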

vt_timcor.py and vt_fldcor.py should take two DataSets as input, one for each input file. Use the new merge_output keyword in the ProcessUnit to merge the institute and model constraints together. However, this won't work properly until we can get the test_model_correllation_3 test to pass: the case where you have two Constraints, each with multiple values, and the desired behavior is to create every possible combination of values is not working yet.

It's a pretty hard one to solve too, as every combination of models is linked with a combination of institute values. This is one situation where not using the same pattern for input and output makes things a lot simpler.

I have a couple of ideas for how to do it (involving combining the merged constraints for the output, then splitting them apart again to find the correct matching files for the input).
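The merge-then-split idea can be sketched roughly as follows. This is plain Python rather than cwsl code, and the separator `SEP`, the example pairs and the `merge_pair` and `split_merged` helpers are all hypothetical. It treats each (institute, model) pair as one linked unit, builds every combination of pairs across the two inputs, and recovers the original pairs from the merged output values.

```python
from itertools import product

# Hypothetical separator, assumed never to appear inside a constraint value.
SEP = "+"

# Hypothetical linked (institute, model) pairs present in each input DataSet.
pairs_a = [("CSIRO-BOM", "ACCESS1-0"), ("MOHC", "HadGEM2-ES")]
pairs_b = [("CSIRO-BOM", "ACCESS1-0"), ("MOHC", "HadGEM2-ES")]


def merge_pair(first, second):
    """Merge two (institute, model) pairs into output constraint values."""
    return {
        "institute": first[0] + SEP + second[0],
        "model": first[1] + SEP + second[1],
    }


def split_merged(merged):
    """Recover the input (institute, model) pairs from merged output values."""
    institutes = merged["institute"].split(SEP)
    models = merged["model"].split(SEP)
    return list(zip(institutes, models))


# Combine whole pairs, so a model is never matched with the wrong institute.
outputs = [merge_pair(a, b) for a, b in product(pairs_a, pairs_b)]

for merged in outputs:
    print(merged, "<-", split_merged(merged))
```

Taking the product over pairs, instead of over institute values and model values independently, avoids generating combinations such as MOHC with ACCESS1-0 that match no input file.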

DamienIrving commented 9 years ago

@captainceramic Cool, thanks. One more question: How do I make vt_ensemble_agg.py pass lots of files to the command line entry generator at once? (i.e. cdo_ensemble_agg.sh accepts infile1, infile2, ... infileN)

captainceramic commented 9 years ago

Overwriting the model and institute constraints should result in all of the files being passed in at once - if they aren't, you have found a bug! What output do you get at the moment?

DamienIrving commented 9 years ago

Ok, so it is passing all the files at once.

The bug appears to be when I run a workflow where in_dataset has more than one experiment (e.g. rcp45 and rcp85). VisTrails basically just freezes at the Ensemble Aggregation step and eventually stops the incomplete process with no error message or anything. It works fine if there is only one experiment, but fails as soon as there is more than one.

I'll merge this PR now (as I don't think the problem is in my wrapper) and log an issue.