nipype / nipype2pydra

Scripts for importing workflows written in Nipype to the Pydra dataflow engine syntax

Creating conversion spec stubs for all Nipype interfaces #8

Open tclose opened 12 months ago

tclose commented 12 months ago

I have scraped the Nipype interfaces package to generate a list of all the available Nipype interfaces, which I have stored in this repo as a YAML file, nipype-interfaces-to-import.yaml. After scraping the interface names, I lightly curated the list to remove base classes and generic interfaces (e.g. split, merge, etc.) that I didn't think were worth importing.
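The scraping step can be sketched in a package-agnostic way with only the standard library. The helper below is an illustration of the technique, not the actual script used; the nipype-specific names appear only in the trailing comment so the snippet doesn't require nipype to be installed:

```python
# Illustrative sketch of scraping a package for interface classes.
# Filtering out Base* names mirrors the curation described above.
import importlib
import inspect
import pkgutil

def collect_subclasses(package, base_class):
    """Map each module in *package* to the *base_class* subclasses it defines."""
    found = {}
    for info in pkgutil.walk_packages(package.__path__, prefix=package.__name__ + "."):
        try:
            mod = importlib.import_module(info.name)
        except ImportError:
            continue  # tool-specific dependencies may be missing
        for name, obj in inspect.getmembers(mod, inspect.isclass):
            if (
                issubclass(obj, base_class)
                and obj.__module__ == info.name   # defined here, not re-exported
                and not name.startswith("Base")   # drop base classes
            ):
                found.setdefault(info.name, set()).add(name)
    return {mod_name: sorted(names) for mod_name, names in found.items()}

# With nipype installed, this would be applied roughly as:
#   import nipype.interfaces
#   from nipype.interfaces.base import BaseInterface
#   specs = collect_subclasses(nipype.interfaces, BaseInterface)
# and *specs* dumped with yaml.safe_dump to produce the interfaces YAML.
```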

Next, I plan to create stub conversion specs (see example-specs/ants_registration_Registration.yaml for an example) for all of these interfaces (in a separate repo for each tool) as a starting point for the migration process. The idea is that a nipype2pydra port process would be set up in each repo to attempt to generate Pydra tasks for each Nipype interface under a pydra.tasks.<pkg>.auto package.

Obviously, most of these interfaces would be unusable out of the box and their conversion specs would need manual editing before they would be useful. But I figure having something there as a starting point will make it much more likely someone will come along and do the manual steps. Once the conversion spec has been edited to a usable point, the idea would be to import the automatically generated interface from pydra.tasks.<pkg>.auto into pydra.tasks.<pkg>.v1 or pydra.tasks.<pkg>.latest to signify that it is ready for use.
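To make the promotion convention concrete, here is a minimal sketch of the re-export pattern. The pydra.tasks.ants.auto/.v1 names follow the scheme described above, but the modules are faked in-memory with types.ModuleType purely for demonstration, so nothing here assumes a real task package is installed:

```python
# In-memory sketch of the auto -> v1 promotion convention (module names
# follow the issue's scheme; the modules themselves are fabricated here).
import sys
import types

# Stand-in for the auto-generated module pydra.tasks.ants.auto
auto = types.ModuleType("pydra.tasks.ants.auto")
auto.Registration = type("Registration", (), {})  # placeholder task class
sys.modules["pydra.tasks.ants.auto"] = auto

# pydra/tasks/ants/v1/__init__.py would contain only curated re-exports:
v1_source = "from pydra.tasks.ants.auto import Registration\n__all__ = ['Registration']\n"
v1 = types.ModuleType("pydra.tasks.ants.v1")
exec(v1_source, v1.__dict__)
sys.modules["pydra.tasks.ants.v1"] = v1

# Downstream users import the curated name, never .auto directly:
from pydra.tasks.ants.v1 import Registration
assert Registration is auto.Registration
```

The point of the indirection is that everything stays importable from .auto during conversion, while the .v1 namespace only ever exposes interfaces whose specs have been manually reviewed.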

So my question is: are there any packages/interfaces in that list that are just not worth bothering with at all (i.e. because the underlying tool is unsupported or broken)? If so, could you suggest which ones to delete from the list?

effigies commented 12 months ago

I think it could be worth checking in with Python packages about whether they'd prefer to host and maintain their own task packages.

A useful thing could be to write a template GitHub action that would install pydra (with/without --pre) and run pytest on pydra.tasks.<pkg>, to minimize the barrier to including task packages as part of another project.

tclose commented 12 months ago

> I think it could be worth checking in with Python packages about whether they'd prefer to host and maintain their own task packages.
>
> • I control quickshear, and would be fine either providing a pydra task or docs on how to trivially create one.
>
> • Nipype's dipy interfaces are generated from dipy functions, and we've had to bug @skoudoro to update them, when it would probably be easier for him to keep them up-to-date in dipy itself.

It would be great to have the interfaces maintained in the package repos where possible.

In those cases you mentioned, do you expect it would be easier just to create new Pydra interfaces from scratch, or would the first step still be to do the auto-convert?

If it is auto-convert, perhaps we could start off by having a separate repo while we are still ironing out the converter tool (and potentially the interface syntax), and then approach the tool maintainers to include them in their repos when they are stable (i.e. no longer auto-generated).

> • nipy is probably not worth pulling forward, as it is basically unmaintained. We had to remove it from installation so that nipype tests could pass on CI.
>
> • I would be curious about @arokem's thoughts on nitime.

> A useful thing could be to write a template GitHub action that would install pydra (with/without --pre) and run pytest on pydra.tasks.<pkg>, to minimize the barrier to including task packages as part of another project.

I'm not sure I follow

skoudoro commented 12 months ago

> Nipype's dipy interfaces are generated from dipy functions, and we've had to bug @skoudoro to update them, when it would probably be easier for him to keep them up-to-date in dipy itself.

I agree

> A useful thing could be to write a template GitHub action that would install pydra (with/without --pre) and run pytest on pydra.tasks.<pkg>, to minimize the barrier to including task packages as part of another project.

I do not understand. Can you elaborate on the idea?

effigies commented 12 months ago

> In those cases you mentioned, do you expect it would be easier just to create new Pydra interfaces from scratch, or would the first step still be to do the auto-convert?

I suppose if we have it as a template to start from, authors might find it useful.

> A useful thing could be to write a template GitHub action that would install pydra (with/without --pre) and run pytest on pydra.tasks.<pkg>, to minimize the barrier to including task packages as part of another project.

By this I meant we could create a test-pydra-tasks.yml that authors could drop into their project with minimal changes in order to test their Pydra tasks. I think it would be important to make sure that it tests against the latest release as well as any pre-releases, so that we can get feedback about breaking changes.
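For illustration, such a workflow might look like the following sketch; mypkg, the Python version, and the pinned action versions are placeholders, not details of an actual template:

```yaml
# Illustrative sketch of a drop-in test-pydra-tasks.yml; "mypkg" stands
# in for the project's task package name.
name: test-pydra-tasks
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        pip-flags: ["", "--pre"]  # latest release vs. pre-releases of pydra
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.x"
      - run: python -m pip install pytest
      - run: python -m pip install ${{ matrix.pip-flags }} pydra
      - run: python -m pip install .
      - run: pytest pydra/tasks/mypkg  # tasks only; the project's own tests run elsewhere
```

The --pre leg of the matrix is what would surface breaking changes in pydra pre-releases early, without affecting the project's main CI.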

tclose commented 12 months ago

> By this I meant we could create a test-pydra-tasks.yml that authors could drop into their project with minimal changes in order to test their Pydra tasks. I think it would be important to make sure that it tests against the latest release as well as any pre-releases, so that we can get feedback about breaking changes.

Generating automatic tests is pretty hard to do properly, isn't it? I'm trying to do this with MRtrix, but that is only possible because I'm able to hook into their argument-parsing source code.

effigies commented 12 months ago

I have a model in my head where the tools are adding pydra.tasks.<pkg> into their own source directories, but not adding pydra as a dependency. Their current tests stay the same, pytest [OPTIONS] <pkg> or whatever. Then we give them a workflow that does whatever setup we think makes sense, and then runs pytest [OPTIONS] pydra/tasks/<pkg>, and maybe mypy.

They'll still need to write the tests that make sense for their tasks, but won't need to worry about figuring out the CI. For people already doing exactly this, it won't be a huge benefit, though it will be one thing less for them to do. But a lot of projects still have a mix of AppVeyor, CircleCI, Travis, etc, and might not be running pre-release tests. For them, it could make the prospect of supporting pydra tasks less of a burden.

arokem commented 12 months ago

> I would be curious about @arokem's thoughts on nitime.

I think this would not be too difficult to implement within nitime, and I don't anticipate it being a big maintenance burden, because nitime's API/functionality is not likely to change much (based on the fact that it hasn't changed much in the last 10 years 😄), so I am open to this.

tclose commented 11 months ago

> I have a model in my head where the tools are adding pydra.tasks.<pkg> into their own source directories, but not adding pydra as a dependency. Their current tests stay the same, pytest [OPTIONS] <pkg> or whatever. Then we give them a workflow that does whatever setup we think makes sense, and then runs pytest [OPTIONS] pydra/tasks/<pkg>, and maybe mypy.
>
> They'll still need to write the tests that make sense for their tasks, but won't need to worry about figuring out the CI. For people already doing exactly this, it won't be a huge benefit, though it will be one thing less for them to do. But a lot of projects still have a mix of AppVeyor, CircleCI, Travis, etc, and might not be running pre-release tests. For them, it could make the prospect of supporting pydra tasks less of a burden.

They should be able to use the pythonpackage.yaml GHA workflow in the tasks template repo with only a few modifications to do that (I have switched it over to use hatchling so we can be flexible with the VCS root).

Re auto-generating tests, what are your thoughts on a testing framework that would work similarly to pytest-timeout, but pass, instead of fail, after the timeout, in order to check whether the command survives the setup phase of the tool? I'm thinking it should be possible to supply arbitrary test data based on the data/file types and their value bounds.