djarecka opened 5 years ago
The AFNI team wants to develop their tests in tcsh. I have attempted to create a specification that would allow overlap with testkraken.
We should discuss...
https://docs.google.com/document/d/13P4S6ZhF6K0Ho7wTqeKXHp-hged8EXTbElFQthW3Fg0/edit?usp=sharing
@leej3 - seems like coming up with a set of working examples prior to the November hack may be a good target, and then discussing them with the broader AFNI crowd at the hack.
When you say working examples, do you mean a selection of yaml files executed in test-kraken?
Also, I'm wondering: can we tentatively agree on a schema? Even if it's a straw-man, I'd like to target that as I fill in the various examples that span our usage needs (or indeed, do you already have a schema/object structure in mind?). Specific details I was wondering about for the yaml specification for testkraken (quoted and answered below):
> do you mean a selection of yaml files executed in test-kraken?

yes, or some shim between yaml and testkraken

> can we tentatively agree on a schema: even if it's a straw-man

sure - go for it. whatever you and @djarecka find reasonable.
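For concreteness, here is a minimal straw-man sketch of what one such yaml file could look like; every key name (`env`, `data`, `tests`, `outputs`) is an illustrative placeholder, not the actual testkraken schema:

```yaml
# Straw-man spec for a single AFNI test case; all field names are hypothetical.
env:
  # environment matrix, loosely in the spirit of a neurodocker-style spec
  base: [debian:stretch]
  afni: [latest]
data:
  location: data/afni_example              # hypothetical path to input data
tests:
  - name: basic_3dcalc
    command: tcsh tests/test_3dcalc.tcsh   # hypothetical tcsh test script
    outputs:
      - file: out/result.nii.gz
        check: hash                        # e.g. compare a checksum against a reference
```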
> how might one specify dependencies between tests

this would be closer to a dataflow framework that specifies how outputs from a prior test go into a subsequent test. this would be equivalent to a workflow specification. one possibility is to consider CWL; the other is simply to answer, in our own spec, how to specify dependencies.
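As a sketch of the second option, a per-test `depends_on` key could express this directly in the yaml (key names are hypothetical, not an existing testkraken feature):

```yaml
# Hypothetical sketch: test-level dependencies inside one spec file.
tests:
  - name: preprocess
    command: tcsh tests/preprocess.tcsh
  - name: registration
    command: tcsh tests/registration.tcsh
    depends_on: [preprocess]      # skip this test if `preprocess` fails
    inputs:
      - from: preprocess          # reuse an output produced by the prior test
        file: out/preproc.nii.gz
```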
> the best way to specify environment variables
i would use that as part of the environment specification (software/libraries + environment variables). for now i think we are using the neurodocker spec, which does have a way to add environment variables. now, if these are specific to the test side rather than the container side, we can override or add environment variables on the test side.
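A sketch of how that split could look, with container-side variables in the environment spec and test-side overrides next to the test (key names are placeholders, not the current neurodocker/testkraken keys):

```yaml
# Hypothetical sketch: container-side vs. test-side environment variables.
env:
  base: [ubuntu:20.04]
  vars:                        # baked into the container at build time
    AFNI_NIFTI_TYPE_WARN: "NO"
tests:
  - name: basic_3dcalc
    command: tcsh tests/test_3dcalc.tcsh
    vars:                      # overrides/additions applied only when running this test
      OMP_NUM_THREADS: "1"
```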
> You mentioned a tree in the spec doc. Would this be a tree object to represent collections of tests within Python? Would this facilitate inheriting environments/variables etc. from closer to the root of the tree?
we can use the inheritance principle when feasible, but it could be that all tests run in all environments unless otherwise restricted (just like we restrict certain versions of libraries in python setuptools)
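A sketch of that inheritance/restriction idea, with defaults near the root of the tree and individual tests opting out (again, all keys hypothetical):

```yaml
# Hypothetical sketch: defaults inherited by all tests, with per-test restrictions.
defaults:
  env: [afni_latest, afni_stable]    # every test runs in all listed environments...
  vars:
    OMP_NUM_THREADS: "1"
tests:
  - name: quick_smoke_test
    command: tcsh tests/smoke.tcsh   # inherits the environments and defaults above
  - name: restricted_test
    command: tcsh tests/new_feature.tcsh
    env: [afni_latest]               # ...unless otherwise restricted, like here
```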
I will come back to this tomorrow, but I have some questions:
I'm not completely sure about the "dependencies between tests". I was thinking that we should have `scientific_workflow -> tests`. Is it necessary to have `scientific_workflow -> test1 -> test2`?
We are not using exactly the neurodocker spec, but something similar. And we don't use the container for testing right now.
Ok, sounds like a good goal.
> this would be closer to a dataflow framework

Yes. I'm happy to explore the use of pydra for this. A lot of our tests will take a while to run. I think specifying dependencies between tests, and skipping tests when their dependencies fail, would be very useful. It will require a cost-benefit analysis though. Having some clear examples of this being used will aid discussion. Overall, it would be nice to create a list at some point (perhaps at the hack) of the advantages of such an approach over:
sounds good regarding environment variables and tree inheritance.
> Is it necessary to have scientific_workflow -> test1 -> test2?
I think it might just be a terminology thing. We can explore what terms we want to use. Using your terms I think it would be more along the lines of `scientific_workflow_1 -> scientific_workflow_2`.
And scientific_workflow_2 would only run if the former succeeded (with success possibly assessed by some extra commands run to check the output, not just the exit status).
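For example, such a chained pair with an explicit output check might be sketched as follows (all keys hypothetical):

```yaml
# Hypothetical sketch: scientific_workflow_2 runs only if scientific_workflow_1 "succeeds",
# where success is defined by an extra check command, not just the exit status.
workflows:
  - name: scientific_workflow_1
    command: tcsh workflows/preproc.tcsh
    success_check: python checks/compare_preproc.py out/preproc.nii.gz
  - name: scientific_workflow_2
    command: tcsh workflows/analysis.tcsh
    depends_on: [scientific_workflow_1]   # skipped if the check above fails
```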
The summary of the discussion with @leej3 about the spec that would be more useful for AFNI testing: https://docs.google.com/document/d/17DHlnNzKl6rAhC-NlLRi0IKAbUodR3wtATcOXvwvmiU/edit?usp=sharing
Please comment here or in the doc.