Open claus-zinn opened 7 years ago
I see the following potential problems:
We assume the repositories should be able to send multiple inputs to the Switchboard. Should we allow mimetype/language to be specified for each of the inputs? This implies changing the Switchboard API.
Should the Switchboard allow the user somehow to get one input from one repository and another input from another repository?
Should the user be allowed to add their own input to one that comes from a repository, turning a single input into a multiple input? How much UI complexity would this add?
The service API must also be changed: tools must be allowed to describe multiple inputs.
If we ever get batch processing, how would it work with multiple inputs? Is a tool that supports batched processing the same thing as a tool that takes an unlimited number of inputs of the same type?
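To make the per-input metadata question above more concrete, here is a minimal sketch of a request body in which every input carries its own media type and language. All names (`InputResource`, `to_payload`, the `origin` field) are hypothetical and only illustrate the idea; this is not the current Switchboard API.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class InputResource:
    """One input handed to the Switchboard (hypothetical shape)."""
    url: str
    mimetype: Optional[str] = None   # None: leave detection to the Switchboard
    language: Optional[str] = None   # e.g. an ISO 639-3 code such as "deu"
    origin: str = "repository"       # "repository" or "user-upload"


def to_payload(inputs: List[InputResource]) -> dict:
    """Serialise several inputs into a single request body."""
    return {"inputs": [asdict(i) for i in inputs]}


# One input from a repository plus one added by the user.
payload = to_payload([
    InputResource("https://example.org/a.txt", "text/plain", "deu"),
    InputResource("https://example.org/a.wav", "audio/x-wav", origin="user-upload"),
])
```

Keeping `mimetype` and `language` optional per input would let a repository send what it knows and leave the rest to detection.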
CLAM also supports multiple inputs, so currently not all CLAM services would be expressible for the Switchboard, only the simpler ones.
The CLAM webservice specification is expressive enough to accommodate this (multiple input templates) and may provide some inspiration if you guys want to go this way, though larger initiatives such as openapis.org may also be worth checking out and can offer the same.
> If we ever get batch processing, how would it work with multiple inputs? Is a tool that supports batched processing the same thing as a tool that takes an unlimited number of inputs of the same type?
I'd say, if the 'unlimited' number of inputs is specified in a single invocation, then that would indeed be a tool that supports batch processing. CLAM does that.
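A rough sketch of that reading: if each declared tool input has a cardinality, then "batch" is just an unbounded upper limit within a single invocation. The `InputSlot` class and its fields are hypothetical and are not CLAM's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InputSlot:
    """A tool's declared input (hypothetical descriptor)."""
    name: str
    mimetype: str
    min_count: int = 1
    max_count: Optional[int] = 1   # None means unbounded

    def is_batch(self) -> bool:
        # Unbounded inputs of one type in a single invocation = batch processing.
        return self.max_count is None

    def accepts(self, n_files: int) -> bool:
        upper_ok = self.max_count is None or n_files <= self.max_count
        return n_files >= self.min_count and upper_ok


single = InputSlot("text", "text/plain")
batch = InputSlot("texts", "text/plain", max_count=None)
```

Under this model, a tool with two fixed slots (say, text and audio) and a batch tool are described with the same vocabulary, which may help keep the two features from diverging.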
When it comes to multiple inputs, a distinction should also be made between multiple inputs at the same time (which is mostly what this issue is about), and multiple independent routes through a webservice (which is also addressed in #65).
To be considered in #4
Voyant seems to support uploading multiple files through repeated input parameters to create a multi-file corpus.
Example:
Note: we also send the media type as a parameter, which may or may not be possible for multiple inputs; in any case, it doesn't appear to be mandatory (see example above)
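For illustration, a repeated-parameter URL of this kind could be built as follows. This is only a sketch: the parameter name `input`, the base URL, and the example document URLs are assumptions, and no media type is passed.

```python
from urllib.parse import urlencode, parse_qs, urlsplit


def build_multi_input_url(base_url, input_urls):
    """Append one `input` query parameter per resource URL."""
    # A list of (key, value) pairs keeps the order and repeats the key.
    query = urlencode([("input", u) for u in input_urls])
    return f"{base_url}?{query}"


url = build_multi_input_url(
    "https://voyant-tools.org/",   # assumed base URL
    ["https://example.org/doc1.txt", "https://example.org/doc2.txt"],
)
```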
There are tools that require more than a single resource, e.g., tools that align audio with text. See, for example, this call to the BAS runMAUSBasic service:
curl -v -X POST -H 'content-type: multipart/form-data' -F LANGUAGE=deu-DE -F TEXT=@<filename> -F SIGNAL=@<filename> 'https://clarin.phonetik.uni-muenchen.de/BASWebServices/services/runMAUSBasic'
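To show what such a request looks like from the Switchboard's side, here is a minimal sketch that assembles a multipart body with the same three fields (`LANGUAGE`, `TEXT`, `SIGNAL`) using only the standard library. The helper name `encode_multipart`, the filenames, and the dummy file contents are made up, and nothing is actually sent to the service.

```python
import uuid


def encode_multipart(fields, files):
    """Build a multipart/form-data body.

    fields: dict of name -> str
    files:  dict of name -> (filename, bytes)
    Returns (content_type_header, body_bytes).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    for name, (filename, data) in files.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            f'Content-Type: application/octet-stream\r\n\r\n'.encode("utf-8")
            + data + b"\r\n"
        )
    # Closing delimiter marks the end of the body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)


content_type, body = encode_multipart(
    {"LANGUAGE": "deu-DE"},
    {"TEXT": ("utterance.txt", b"hallo welt"),
     "SIGNAL": ("utterance.wav", b"RIFF")},
)
```

The point is that each file travels under its own named field, so a tool description would need to say which input goes into which field.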
In the LRS standalone version, the UI must allow users to upload two (or more) files. The matcher algorithm must be extended to cope with the additional complexity.
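As a rough idea of what the extended matcher might have to do, the sketch below assigns each uploaded file to one declared input slot of a tool by media type. The function `match_inputs`, the tuple shapes, and the rule that leftover files make a tool inapplicable are all assumptions for illustration, not the existing matcher.

```python
from typing import Dict, List, Optional, Tuple

Slot = Tuple[str, str]      # (slot name, required mimetype)
Upload = Tuple[str, str]    # (filename, detected mimetype)


def match_inputs(slots: List[Slot], uploads: List[Upload]) -> Optional[Dict[str, str]]:
    """Assign one upload to each slot; return None if the tool is not applicable."""
    remaining = list(uploads)
    assignment = {}
    for slot_name, wanted in slots:
        hit = next((u for u in remaining if u[1] == wanted), None)
        if hit is None:
            return None          # a required input is missing
        assignment[slot_name] = hit[0]
        remaining.remove(hit)
    # Leftover uploads mean the tool does not consume everything the user gave.
    return assignment if not remaining else None


aligner = [("TEXT", "text/plain"), ("SIGNAL", "audio/x-wav")]
uploads = [("talk.wav", "audio/x-wav"), ("talk.txt", "text/plain")]
result = match_inputs(aligner, uploads)
```

Here `result` maps `TEXT` to `talk.txt` and `SIGNAL` to `talk.wav`, regardless of upload order. A real matcher would also need to consider language and optional inputs.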
The issue is more complex from the VLO perspective. For the time being, only a single file can be transferred to the LRS (but this could be an archive containing the files). Any solution must take the batch processing issue into account.