Closed rcannood closed 2 years ago
Going over all directives to determine how they should be managed.
This format in Nextflow DSL:
process foo_process {
<nextflow dsl>
}
is equivalent to the following in the viash config:
platforms:
- type: nextflow
directives:
<viash config>
and is also equivalent to the following in viash + nextflow DSL:
foo_process(
directives: [
<viash + nextflow dsl>
]
)
Note: Should closures in the Viash + Nextflow DSL be interpreted? E.g. directives: [ "cache": { ... }, "label": "foo" ]
The order in which directives get resolved (in order of decreasing priority):
- foo_process(directives: ...)
- { type: nextflow, directives: ... }
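The priority order can be illustrated with a small sketch. It is written in Python purely for illustration, and the function name `resolve_directives` and the example values are hypothetical. It assumes directives are merged key by key, so a value passed at the call site replaces the config value for the same key while keys set only in the config are kept. The issue does not state whether merging happens per key.

```python
def resolve_directives(config_directives, call_directives):
    """Merge two directive maps; call-site values take priority.

    config_directives: directives from `{ type: nextflow, directives: ... }`
    call_directives:   directives from `foo_process(directives: ...)`
    """
    resolved = dict(config_directives)  # start from the lower-priority values
    resolved.update(call_directives)    # higher-priority values overwrite them
    return resolved


# Hypothetical example values.
config_directives = {"cpus": 8, "cache": "deep"}
call_directives = {"cpus": 2, "label": "foo"}

resolved = resolve_directives(config_directives, call_directives)
print(resolved)
```

Here `cpus` comes from the call site, `cache` from the config, and `label` is added by the call site.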
type | code |
---|---|
Nextflow DSL | accelerator 4, type: 'nvidia-tesla-k80' |
Viash config | accelerator: "4, type: 'nvidia-tesla-k80'" |
Viash + Nextflow DSL | "accelerator": "4, type: 'nvidia-tesla-k80'" |
type | code |
---|---|
Nextflow DSL | afterScript "source /foo/bar/script" |
Viash config | afterScript: "source /foo/bar/script" |
Viash + Nextflow DSL | "afterScript": "source /foo/bar/script" |
type | code |
---|---|
Nextflow DSL | beforeScript "source /foo/bar/script" |
Viash config | beforeScript: "source /foo/bar/script" |
Viash + Nextflow DSL | "beforeScript": "source /foo/bar/script" |
type | code |
---|---|
Nextflow DSL | cache false |
Viash config | cache: false |
Viash + Nextflow DSL | "cache": false |
-- | -- |
Nextflow DSL | cache "deep" |
Viash config | cache: deep |
Viash + Nextflow DSL | "cache": "deep" |
Possible values: false / true / "deep" / "lenient"
Note that Viash might need to convert yaml booleans into strings during parsing.
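A minimal sketch of what such a conversion could look like, written in Python for illustration only. The helper name `normalise_cache` is hypothetical, and the choice to represent booleans as the strings "true" and "false" is an assumption; the note above only says that a conversion might be needed.

```python
ALLOWED_CACHE_STRINGS = {"true", "false", "deep", "lenient"}


def normalise_cache(value):
    """Return the cache setting as a string, or raise ValueError.

    YAML parsers typically yield real booleans for `cache: false`,
    but plain strings for `cache: deep`.
    """
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, str) and value in ALLOWED_CACHE_STRINGS:
        return value
    raise ValueError(f"unsupported cache value: {value!r}")


print(normalise_cache(False))
print(normalise_cache("deep"))
```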
Not supported at this stage. Contact the maintainers or create a new issue if support is ever needed.
Not supported, in favour of linking to other Viash platforms (e.g. native, docker).
Not supported, in favour of linking to other Viash platforms (e.g. native, docker).
type | code |
---|---|
Nextflow DSL | cpus 8 |
Viash config | cpus: 8 |
Viash + Nextflow DSL | "cpus": 8 |
type | code |
---|---|
Nextflow DSL | clusterOptions xxxx |
Viash config | clusterOptions: xxxx |
Viash + Nextflow DSL | "clusterOptions": "xxxx" |
type | code |
---|---|
Nextflow DSL | disk '2 GB' |
Viash config | disk: "2 GB" |
Viash + Nextflow DSL | "disk": "2 GB" |
Must match <decimal> [KMGT]?B
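The format check could be sketched as follows, in Python for illustration only. The helper name `is_valid_disk` is hypothetical. The sketch interprets `<decimal>` as a non-negative number with an optional fractional part and treats the space before the unit as optional; both are assumptions, and Nextflow's own parser may be stricter or more lenient.

```python
import re

# <decimal> [KMGT]?B, with the whitespace before the unit treated as optional.
DISK_PATTERN = re.compile(r"\d+(\.\d+)?\s*[KMGT]?B")


def is_valid_disk(value):
    """Return True if the whole string matches the assumed disk format."""
    return DISK_PATTERN.fullmatch(value) is not None


for candidate in ["2 GB", "512B", "1.5 TB", "2 gigabytes", "GB"]:
    print(candidate, is_valid_disk(candidate))
```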
type | code |
---|---|
Nextflow DSL | echo true |
Viash config | echo: true |
Viash + Nextflow DSL | "echo": true |
type | code |
---|---|
Nextflow DSL | errorStrategy "terminate" |
Viash config | errorStrategy: terminate |
Viash + Nextflow DSL | "errorStrategy": "terminate" |
Possible values are 'terminate', 'finish', 'ignore', 'retry'
@tverbeiren Did I forget something? I'm going to use the content of this issue to make a blog post on viash.io.
This functionality was released in 0.5.11 :partying_face:
I created a repository, viash_nxf_poc, to work out a POC for a NextFlowPlatform rewrite. I'm proposing this rewrite to fix some of my annoyances with the current way of working, but also to reduce the code complexity and add more checks in order to avoid bugs (which currently occur quite regularly).

Channel Interface
A Viash+Nextflow module generated by Viash has the interface:
These fields are defined as follows:
- id (String) is a unique identifier for the event in the Channel.
- inputs (Map[String, Object] or File) is a named map containing the component's input parameters. If you only want to specify a single input file, you can simply pass a File instead of a Map[String, Object]. Examples of the class types associated with different Viash component arguments:

Viash argument | Example value | Class |
---|---|---|
{ name: foo, type: string, direction: input } | [ foo: "bar" ] | String |
{ name: int, type: integer, direction: input, multiple: true } | [ int: [ 1, 2, 3 ] ] | List[Integer] |
{ name: bool, type: boolean, direction: input, required: false } | [ bool: null ] | Boolean or null |
{ name: bool, type: boolean, direction: input, required: false } | [ bool: true ] | Boolean or null |
{ name: in, type: file, direction: input } | [ in: file("in.h5ad") ] | File |
{ name: out, type: file, direction: output } | [ out: "proposed_path.h5ad" ] | String |
{ name: out, type: file, direction: output, multiple: true } | [ out: "proposed_path_*.h5ad" ] | String |

- ...passthrough... (Object*) are objects that simply get passed through to the output. This is a practical solution for reading a bunch of parameters in at the start of a workflow and putting them into the inputs slot whenever they need to be consumed. This means that an event in the channel can be of length N where N >= 2.
- outputs (Map[String, File] or File) is a named map containing the component's output files. If the component outputs only a single File, the outputs will be a File rather than a named map.

Module usage
Given a Viash component named poc (src/poc/config.vsh.yaml), importing the module yields a Nextflow Workflow which can be used as follows:

Viash+Nextflow modules are flexible
The strength of the new Viash+Nextflow modules lies in the flexibility they offer in how you use the module.
- directives: One-to-one mapping with the Nextflow process directives. NOTE: You can pass closures, but they need to be quoted, as in the publishDir example. Examples:
  - container: "bash:4.2"
  - label: ["bigmem", "bigcpu"]
  - publishDir: [ path: "output/", mode: "copy", saveAs: "{ \"prefix_\" + it }" ] (← saveAs is a quoted closure)
- auto: Helper arguments provided by Viash.
  - simplifyInput: If true, an input tuple containing only a single File (e.g. ["foo", file("in.h5ad")]) is automatically transformed to a map (i.e. ["foo", [ input: file("in.h5ad") ] ]).
  - simplifyOutput: If true, an output tuple containing a map with a single File (e.g. ["foo", [ output: file("out.h5ad") ] ]) is automatically simplified to the File itself (i.e. ["foo", file("out.h5ad")]).
  - publish: If true, the module's outputs are automatically published to params.publishDir. Will throw an error if params.publishDir is not defined.
  - transcript: If true, the module's transcripts are automatically published to params.transcriptDir. If not defined, params.publishDir + "/_transcripts" will be used. Will throw an error if neither is defined.

Chaining multiple modules
If each module only has one input file and output file:
If the modules have multiple input / output files per step:
- map: Apply a map over the incoming tuple. Example: { tup -> [ tup[0], [input: tup[1].output], tup[2] ] }
- mapId: Apply a map over the ID element of a tuple (i.e. the first element). Example: { id -> id + "_foo" }
- mapData: Apply a map over the data element of a tuple (i.e. the second element). Example: { data -> [ input: data.output ] }
- mapPassthrough: Apply a map over the passthrough elements of a tuple (i.e. the tuple excl. the first two elements). Example: { pt -> pt.drop(1) }
- renameKeys: Rename keys in the data field of the tuple (i.e. the second element). Example: [ "new_key": "old_key" ]
- debug: Whether or not to print debug messages. Example: true
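
To make the effect of these options concrete, here is a rough sketch of how each one could act on a channel event of the form [id, data, ...passthrough...]. It is written in Python for illustration only; the function names are hypothetical stand-ins and not the module's actual implementation. The order in which the real module applies these options is not stated above, so the chaining order here is arbitrary. `rename_keys` assumes every old key is present in the data map.

```python
def map_id(event, fn):
    """Apply fn to the first element (the ID)."""
    return [fn(event[0])] + list(event[1:])


def map_data(event, fn):
    """Apply fn to the second element (the data map)."""
    return [event[0], fn(event[1])] + list(event[2:])


def map_passthrough(event, fn):
    """Apply fn to everything after the first two elements."""
    return list(event[:2]) + list(fn(list(event[2:])))


def rename_keys(event, renames):
    """Rename keys in the data map; renames maps new_key -> old_key."""
    data = dict(event[1])  # copy so the original event is not modified
    for new_key, old_key in renames.items():
        data[new_key] = data.pop(old_key)
    return [event[0], data] + list(event[2:])


# Hypothetical event with two passthrough elements.
event = ["sample1", {"output": "out.h5ad"}, "extra1", "extra2"]

step1 = map_id(event, lambda id_: id_ + "_foo")
step2 = rename_keys(step1, {"input": "output"})
step3 = map_passthrough(step2, lambda pt: pt[1:])  # like pt.drop(1)
print(step3)
```

In this sketch, `rename_keys` with `{"input": "output"}` has the same effect as the mapData example above, which is what makes it convenient when chaining one module's output into the next module's input.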
Reuse same module
You can run the same component multiple times, as long as you specify a unique key each time the module is used.