Open jiqicn opened 2 years ago
The Table validator nodes seem very useful and close to what we need. However, there seems to be no compatibility with the CSV on the Web (CSVW) specification, which is a more formalized and interoperable way of describing a CSV table than a reference table or the node configuration.
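To make concrete what "describing the CSV table" with CSVW could look like, here is a rough Python sketch. The metadata uses a few real CSVW terms (`tableSchema`, `columns`, `name`, `datatype`, `required`), but the file name, the column names, and the `validate_csv` helper are made up for illustration. It only checks the header, required cells, and two datatypes, so it is nowhere near a conforming CSVW validator.

```python
import csv
import io

# Hand-written metadata using a small subset of the CSVW vocabulary.
# The file name and the columns are hypothetical.
METADATA = {
    "@context": "http://www.w3.org/ns/csvw",
    "url": "people.csv",
    "tableSchema": {
        "columns": [
            {"name": "id", "datatype": "integer", "required": True},
            {"name": "name", "datatype": "string", "required": True},
            {"name": "height", "datatype": "number"},
        ]
    },
}


def _matches_datatype(value, datatype):
    """Check a non-empty cell against the few datatypes this sketch knows."""
    try:
        if datatype == "integer":
            int(value)
        elif datatype == "number":
            float(value)
    except ValueError:
        return False
    return True  # every other datatype is treated as a plain string


def validate_csv(text, metadata):
    """Return a list of error messages; an empty list means no errors found."""
    columns = metadata["tableSchema"]["columns"]
    expected = [column["name"] for column in columns]
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header != expected:
        return [f"header {header} does not match schema {expected}"]

    errors = []
    for row_number, row in enumerate(reader, start=2):
        if len(row) != len(columns):
            errors.append(
                f"row {row_number}: expected {len(columns)} cells, got {len(row)}"
            )
            continue
        for column, cell in zip(columns, row):
            if cell == "":
                if column.get("required", False):
                    errors.append(f"row {row_number}: '{column['name']}' is required")
                continue
            datatype = column.get("datatype", "string")
            if not _matches_datatype(cell, datatype):
                errors.append(
                    f"row {row_number}: '{cell}' is not a valid {datatype} "
                    f"for column '{column['name']}'"
                )
    return errors
```

The point of the sketch is that the schema lives in a separate, shareable metadata document rather than in a reference table or in the node configuration. A real node would delegate to one of the existing validators instead of reimplementing the checks.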
Here is an existing csv-on-the-web validator in Ruby that could be used for a customized node.
There is also a Python implementation (not well documented, and it seems to be Python 2) and an R implementation.
If we want to use the Ruby implementation, here is a Ruby wrapper node for KNIME. It works with JRuby, a Java implementation of Ruby. It's not clear to me how this wrapper node handles the Ruby environment and dependencies; I think JRuby and all dependencies already need to be installed on the system. See also the documentation of the JRuby node.
I also found a CSV on the Web validator in Java that might be useful for developing the customized node.
At the bottom of this note, you can find implementations of CSVW validators in different languages (Python, Ruby, JavaScript, Web, and R).
Note that the csvlint (ruby) implementation also has a webservice with an API: http://csvlint.io/documentation
Another Python implementation, built at Clariah: https://github.com/clariah/cow
As this note says, there are two ways of sharing the extension.
One is to build a local update site for the extension. Ideally we would become a contributor to the KNIME community, but that takes considerable effort (see this link). It's also possible to share the local update site in other ways (e.g. GitHub), but in that case a dropin is more convenient than a local update site.
The second way is to package the extension as a dropin, which is simply a .jar file. To deploy the extension, users only need to put the dropin file in the dropins folder of the local KNIME installation and restart KNIME.
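For reference, the dropin deployment step amounts to a single file copy. Below is a small Python sketch of it; the `install_dropin` helper and the paths in the usage comment are hypothetical, and it assumes the KNIME installation directory contains a `dropins` folder as described above.

```python
import shutil
from pathlib import Path


def install_dropin(jar_path, knime_home):
    """Copy an extension .jar into <knime_home>/dropins and return the new path.

    Both arguments are placeholders for whatever the local setup uses.
    KNIME still has to be restarted afterwards to pick up the extension.
    """
    jar = Path(jar_path)
    dropins = Path(knime_home) / "dropins"
    if jar.suffix != ".jar":
        raise ValueError(f"{jar} is not a .jar file")
    if not dropins.is_dir():
        raise FileNotFoundError(f"no dropins folder found at {dropins}")
    target = dropins / jar.name
    shutil.copy2(jar, target)  # overwrites an older copy with the same name
    return target


# Example call (hypothetical paths):
# install_dropin("build/csvw-validator.jar", "/opt/knime")
```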
Investigate possible ways of validating input data against the ontology.