Open effigies opened 3 years ago
i think it would be good to discuss which ones and if there are alternative approaches in pydra. and then we can discuss where.
for example, i think rename is being built into the spec and we will want to put datasink also into the spec. identityinterface is no longer required.
Data grabbers and data sinks were the main things I was thinking about. But here's a list:
XNAT, BIDS, etc make sense not to put directly in pydra of course.
thanks. these should be relatively easy to move over, since most are just python functions. we should decide where they should go: pydra.tasks.core.io/utility
so core is a package that only pydra provides.
I'm debugging a pydra workflow from pydra-glm-example and I'm thinking about the nipype interface SelectFiles. I believe we should discourage using Nipype1Task and suggest just creating a FunctionTask until we create pydra.SelectFiles, as suggested here.
I should create some examples. Am I right that SelectFiles is mostly used as a connection from infosource with iterables?
I think it could be easily hooked up with iterables, but I don't know that it's "mostly used" that way. I haven't really used it, so I don't know for sure how others use it, and I've generally avoided iterables, so take that for what it's worth.
conceptually selectfiles is just a simple interface to getting data; whether that is connected to infosource or not is up to each workflow creator. the reason why infosource/inputnode (both are identityinterfaces) are used is for dataflow purposes, which should not be required in the context of pydra's design (which makes a workflow a task, and splits can be applied to any inputs).
but do we want to create pydra.SelectFiles? It's very easy to create a python function and just add a splitter
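To make "a python function plus a splitter" concrete, here is a minimal sketch in plain Python. The names `select_files` and `select_for_each`, and the `{subject}` template field, are hypothetical and not part of pydra or nipype. The loop in `select_for_each` only stands in for what a splitter over one input would do; wrapping the function as a pydra task is left out on purpose.

```python
from pathlib import Path


def select_files(base_dir, template, **fields):
    """Return the sorted paths under base_dir that match the filled-in template.

    `template` is a glob pattern with str.format placeholders,
    e.g. "sub-{subject}/anat/*_T1w.nii.gz" (hypothetical layout).
    """
    pattern = template.format(**fields)
    return sorted(str(p) for p in Path(base_dir).glob(pattern))


def select_for_each(base_dir, template, field, values):
    """Plain-Python stand-in for a splitter: one select_files call per value."""
    return {
        value: select_files(base_dir, template, **{field: value})
        for value in values
    }
```

A call such as `select_for_each(data_dir, "sub-{subject}/anat/*_T1w.nii.gz", "subject", ["01", "02"])` would return one file list per subject, with an empty list for subjects that have no matching files.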
i think we could have a set of utility functions that are general purpose across many use cases, but only if they are clear and prevent recreating the same code in many different workflows. if you put it in pydra, i would label the tasks as experimental in the sense that they could be moved out.
selectfiles is generally a non-cacheable function since it involves taking a look at a folder that could have changed between runs, and we may not necessarily want to hash the input directory. i think we need to be able to at least indicate that, even if we end up not creating the function. so hashability should also be taken into consideration. perhaps think about how users would create/use such a function and see if it's better to provide one that reduces some of these complications.
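One way to picture the hashability concern: a cache key built only from the directory path stays the same when files are added or removed. The sketch below is an assumption about how a user could work around that, not pydra's hashing mechanism. It fingerprints a directory from file metadata (relative path, size, modification time) without reading file contents. The name `directory_fingerprint` is hypothetical.

```python
import hashlib
import os


def directory_fingerprint(root):
    """Hash the relative path, size, and mtime of every file under root.

    File contents are not read, so this is cheap but will miss edits
    that preserve both size and modification time.
    """
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            stat = os.stat(path)
            rel = os.path.relpath(path, root)
            record = f"{rel}\0{stat.st_size}\0{stat.st_mtime_ns}\n"
            digest.update(record.encode())
    return digest.hexdigest()
```

Passing such a fingerprint as an extra input alongside the directory path would let a cached result be invalidated when the folder changes. Whether that trade-off is acceptable depends on how large the directory is and how often it is rescanned.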
A lot of the things in nipype.interfaces.io and nipype.interfaces.utility would be useful to have around. Should we be making a task package for that or bundling directly into pydra.tasks?