genericworkflownodes / GenericKnimeNodes

Base package for GenericKnimeNodes
https://github.com/genericworkflownodes/GenericKnimeNodes

It's impossible to use "Input file" nodes without input files #116

Open iprotsyuk opened 8 years ago

iprotsyuk commented 8 years ago

In my workflows, I use several nodes of this kind, and the logic of the workflow implies that some input datasets can be empty. However, the "Input files" node seems to be unusable if I don't select any file.

I realize that it might be a low-priority feature for you guys to implement, so I can do it myself and make a pull request if you confirm that this feature makes sense for other users as well.

jpfeuffer commented 8 years ago

That is a good question. It might be possible to emulate this behaviour somehow with already present tools. Can you elaborate a bit more on your use case? For example: Do the datasets actually have to be empty "physical" files on your disk or can they be URIs of files that would be created in a later process?

iprotsyuk commented 8 years ago

I'd be glad if there's a simple bypass of this restriction.

In my particular case, I built a workflow for LC-MS data processing. And those "Input files" nodes are used to supply files with samples to the workflow. The thing is there're at least two groups of samples: normal ones and blank ones, and I use two separate input nodes for them. The presence of blanks isn't necessary, so I'm looking for a possibility to just not select any file for that node.

The issue is totally silly, but using "Input files" nodes is very convenient and natural for other people who use that workflow. So, my current workaround for the situation when no input files are provided is to select a file with a predefined name as the input file, which is then excluded from further processing. The main downside of this approach is that I have to explain it to each and every user of the workflow, because they are mainly biologists who tend not to read any documentation.
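To make the workaround concrete, here is a minimal sketch of the exclusion step, written as plain Python rather than as KNIME nodes. The placeholder name `NO_BLANKS.mzML` and the function `drop_placeholder` are hypothetical and only illustrate the idea; the actual predefined name is not given here.

```python
from pathlib import PurePath

# Hypothetical placeholder name (an assumption for illustration only).
PLACEHOLDER_NAME = "NO_BLANKS.mzML"


def drop_placeholder(paths, placeholder=PLACEHOLDER_NAME):
    """Return the selected paths without the placeholder file.

    Only the file name is compared, so the placeholder may live in any folder.
    """
    return [p for p in paths if PurePath(p).name != placeholder]


# A user without blanks selects only the placeholder, so nothing is left
# to process on the blank branch.
blanks = drop_placeholder(["/data/NO_BLANKS.mzML"])
print(blanks)
```

The downside stays the same: every user has to know that the placeholder file exists and has to select it.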

jpfeuffer commented 8 years ago

Ok, I see your problem. It is actually possible to build a small workaround with the KNIME Workflow Control Nodes (i.e. a "Boolean Input" where the user states whether they have blanks, and an "If Switch" node that deactivates the branch with the "Input file(s)" node if this boolean is false). However, there might be a problem, because you eventually have to merge your results to compare them (I guess). I am not sure which of the OpenMS nodes can handle empty input file ports (if any). You can create empty File Ports e.g. by using the "Create Table" node with an empty String column and then converting it to a URI Port via "String To URI" and "URI to Port".

Nonetheless, I see that additional nodes (which you could and should collapse into a Metanode) might scare users of your pipeline, and I will discuss with the other GKN/OpenMS/SeqAn developers whether an empty "File Input" could at least be configured successfully.

iprotsyuk commented 8 years ago

Yeah, the last option would be perfect.

On top of that, we're going to describe some workflows in a scientific paper, and chances are we won't pass review with any kind of kludge in them.

iprotsyuk commented 8 years ago

Hi @jpfeuffer, I wonder if there are any updates on the issue.

jpfeuffer commented 8 years ago

We wondered how you would want to handle the empty URI Port in downstream nodes. That is the thing we are a bit wary of: it might also be confusing for beginners if a downstream node suddenly fails with an uncaught error because a port is empty.

iprotsyuk commented 8 years ago

Normally, I use a "Port to URI" -> "URI to String" pair to convert the paths of input files to a KNIME table and process it afterwards. I'm not sure how these nodes react to an empty input. Perhaps they need to be fixed for this particular case along with the "Input File(s)" nodes.

Indeed, it can be confusing for some users that downstream nodes fail with empty input, but I don't think it's incorrect behaviour. Some nodes can handle empty input, others can't. It should be kept in mind by workflow developers. Does it make sense?
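To illustrate the distinction, here is a rough sketch in plain Python (not KNIME or GKN code). Both function names are hypothetical: `pair_with_blanks` stands for a step that tolerates an empty input, and `require_inputs` for a step that cannot work without files but fails with a readable message instead of an uncaught error.

```python
def pair_with_blanks(sample_paths, blank_paths):
    """Tolerates empty input: each sample is paired with a possibly empty blank list."""
    return [(sample, list(blank_paths)) for sample in sample_paths]


def require_inputs(paths, node_name):
    """Cannot handle empty input: fail early and name the step that needs files."""
    if not paths:
        raise ValueError(f"{node_name}: no input files were provided")
    return list(paths)


# No blanks selected: the tolerant step still produces a result.
print(pair_with_blanks(["sample_01.mzML"], []))

# The strict step reports a clear error that a workflow developer can act on.
try:
    require_inputs([], "Blank subtraction")
except ValueError as err:
    print(err)
```

With this split, an empty input stays a legal value, and it is up to each downstream step to either accept it or reject it explicitly.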