clarin-eric / switchboard-tool-registry

The Switchboard Tool Registry
GNU General Public License v3.0

improved task description #81

Closed dietervu closed 3 years ago

dietervu commented 3 years ago

Fixes a typo and brings this task closer to the section "Named Entity Recognition"

andmor- commented 3 years ago

Tasks are a controlled vocabulary. See error and current possible values: https://github.com/clarin-eric/switchboard-tool-registry/runs/1478817180?check_suite_focus=true#step:5:102

emanueldima commented 3 years ago

I originally thought that making tasks a controlled vocabulary was a good idea. But because we change them so often, I now see it as a mistake. In version 2 of the tool format I took that out, so any string is allowed as a task.

andmor- commented 3 years ago

Wouldn't this potentially lead to a proliferation of tasks? Even a typo would place the tool in a different task category. I am not sure, but in principle I think that a maintained controlled vocabulary is better for this.
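To make the typo concern concrete, here is a minimal sketch of a check against a task vocabulary. The `KNOWN_TASKS` list and the `check_task` helper are hypothetical and exist only for illustration; the real vocabulary lives in the registry's schema file, and this is not the registry's actual validation code.

```python
from difflib import get_close_matches

# Hypothetical vocabulary, for illustration only.
KNOWN_TASKS = ["Named Entity Recognition", "Tokenization", "Lemmatization"]


def check_task(task, known=KNOWN_TASKS):
    """Return (is_known, suggestions) for a task string.

    is_known is True only on an exact match; suggestions lists
    vocabulary entries that look like the intended task.
    """
    if task in known:
        return True, []
    # A high cutoff keeps suggestions limited to likely typos.
    return False, get_close_matches(task, known, n=3, cutoff=0.8)


# An exact match passes; a misspelling is rejected, with a suggestion.
ok, _ = check_task("Named Entity Recognition")
typo_ok, typo_hint = check_task("Named Entity Recogntion")
```

With free-form task strings, the misspelled value would silently become its own category. A check like this could run either as schema validation or as a reviewer aid.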

emanueldima commented 3 years ago

I would say it's easier to check at review time than to have the hassle of editing the schema file for every other tool, to add a new task. I personally had to do it multiple times, and Dieter now has to do it again. It's a tradeoff. 🤷

andmor- commented 3 years ago

I understand. Still, another argument from the original discussions: this provides a strong incentive to try to place the tool in an existing category. Without it, the editor will probably just go the easy way and treat every category as a new category (apart from typos). This can then lead to problems like those in the component registry, where people sometimes do not remember or agree with the name of the existing item and create a new one. E.g. personally I would not have realized these 3 should be the same: "lexical analysis" vs "lexing" vs "tokenization", reference