Open odashi opened 2 years ago
@neubig @pfliu-nlp RFC
Hi @odashi, thanks for the proposal! Overall it looks very nice in terms of both code maintenance and interoperability. Just sharing some comments:
plugin feature
"Achieves better separation of interests: the "core" library can focus on changes for only core parts, and the "plugin" can focus on extensible parts, such as one task."
If we define the separation by task, wouldn't we suffer from the 2nd and 3rd issues that you listed at the beginning of this proposal?
I agree it would be nice if we could define a task as a class with member functions such as loaders and processors.
(Then finally, we will have several powerful and well-defined classes: datasets, tasks, metrics, which almost define NLP.)
I'm fine with better alternatives to the registries.
@pfliu-nlp Yes, we would need to work on 2, 3, then 1.
datasets, tasks, metrics, which almost define NLP.
This is a good point, but we also need to keep in mind that the "task" in this repository may confuse the users because it is not the actual process of the task itself (e.g., translating source to target in machine translation tasks).
Maybe we also need to determine "protocols" (shared data format) between datasets, tasks, and metrics, to achieve better separation.
For example,
The dataset defines "available datatype" $D_a$ that the dataset can provide:
class FooDataset(Dataset):
    @classmethod
    def get_available_datatype(cls):
        return [("ref", list[list[str]]), ("hyp", list[str])]  # list of (name, type)
The task defines "required datatype" $D_r$ that the task requires:
class BarTask(Task):
    @classmethod
    def get_required_datatype(cls):
        return [("ref", list[list[str]]), ("hyp", list[str])]  # list of (name, type)
The main routine compares both datatypes (explainaboard --dataset Foo --task Bar). If $D_r \subseteq D_a$, the program determines that they can be connected.
This approach can loosen the strict coupling between the dataset and the task, and could allow users to add a new dataset or task while knowing nothing more than the "datatype" it provides or requires. A similar approach could perhaps be introduced between task and metric.
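The compatibility check described above could be sketched roughly as follows. This is only an illustration under assumptions: the `Dataset` and `Task` base classes are omitted so the snippet runs on its own, the extra "src" field is made up, and `is_compatible` is a hypothetical helper name, not something that exists in the repository.

```python
from typing import Any

# A datatype is a list of (name, type) pairs, as in the examples above.
Datatype = list[tuple[str, Any]]


class FooDataset:
    @classmethod
    def get_available_datatype(cls) -> Datatype:
        # "src" is an extra, made-up field that the task below does not need.
        return [("ref", list[list[str]]), ("hyp", list[str]), ("src", list[str])]


class BarTask:
    @classmethod
    def get_required_datatype(cls) -> Datatype:
        return [("ref", list[list[str]]), ("hyp", list[str])]


def is_compatible(dataset: type, task: type) -> bool:
    """Return True if every (name, type) the task requires is offered by the dataset."""
    available = dict(dataset.get_available_datatype())
    required = task.get_required_datatype()
    # D_r is a subset of D_a when each required name exists with the same type.
    return all(available.get(name) == typ for name, typ in required)


if __name__ == "__main__":
    print(is_compatible(FooDataset, BarTask))
```

Comparing by name and exact type is the simplest possible rule; a real protocol might also want to accept compatible subtypes.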
I'm looking through old issues on GitHub, and just wanted to say that I like this direction a lot and I think we're moving towards it gradually. A stricter definition of "required datatypes" would also be useful in other places in the code, such as where we define feature functions.
Background
This repository hosts all task/metric definitions that ExplainaBoard handles, and we seem to face several problems in maintaining it under the current development style:
Since the objective of ExplainaBoard is to host as many tasks as possible, the problems above will become more serious as the number of tasks (and metrics) grows.
Proposals
Introducing plugin feature
It's time to consider introducing a plugin feature. Python can list installed packages programmatically, and some libraries (e.g., flask, pytest) use this ability to import additional functionality from other packages.
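One way such discovery can work is by naming convention, using the standard library's `pkgutil.iter_modules`. The sketch below is only an illustration: the prefix `explainaboard_plugin_` and the function name `discover_plugins` are assumptions made for this example, not part of the proposal. Entry points declared in package metadata are another common mechanism.

```python
import importlib
import pkgutil
from types import ModuleType

# Hypothetical naming convention: plugin packages start with this prefix.
PLUGIN_PREFIX = "explainaboard_plugin_"


def discover_plugins(prefix: str = PLUGIN_PREFIX) -> dict[str, ModuleType]:
    """Import every top-level module on sys.path whose name starts with `prefix`."""
    return {
        info.name: importlib.import_module(info.name)
        for info in pkgutil.iter_modules()
        if info.name.startswith(prefix)
    }


if __name__ == "__main__":
    # Prints the names of any matching packages installed in this environment.
    print(sorted(discover_plugins()))
```

With this kind of convention, installing a plugin package is enough to make it visible to the core library; no change to the core repository is needed.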
We can use this functionality to move the "extensible" parts into separate repositories. This brings us several advantages in terms of maintenance:
Changing the structure around "task"
To achieve this change, we also need to standardize the definition of "task". The current repository has the Task class, but it holds only a description associated with a name and has no relation to other parts of the repository. Most information related to the "task" is actually categorized by TaskType, and its definition is distributed across multiple parts of the repository (e.g., multiple registries that take TaskType). If the Task class represented enough information about the "task", we could consolidate these definitions. Specifically, I am considering changing the structure around the task from:
to:
Here I also removed TaskType since managing the list of tasks as Enum has issues for extensibility (enums have strict typing and should not be used as a variable collection).
Avoid registries
The current repository relies on several implementations of a global registry. A registry generally brings many technical disadvantages and should be avoided unless it is really required:
In most cases, a registry is not necessary to achieve the same behavior. There are mainly two use cases of the registry in this repository:
If we need to collect functionalities that are implemented on the same interface, we can introduce a common syntax for declaring the list of functionalities. For example, we can introduce the following syntax in __init__.py of the plugin package:
and we can collect these definitions programmatically:
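As a rough illustration of this idea, the sketch below assumes each plugin's `__init__.py` exposes a module-level list of task classes. The attribute name `EXPLAINABOARD_TASKS`, the function `collect_tasks`, and the stand-in plugin built in the usage section are all hypothetical names chosen for this example.

```python
from types import ModuleType

# Hypothetical attribute that a plugin's __init__.py would define, e.g.
#     EXPLAINABOARD_TASKS = [FooTask, BarTask]
TASKS_ATTRIBUTE = "EXPLAINABOARD_TASKS"


def collect_tasks(plugins: list[ModuleType]) -> dict[str, type]:
    """Gather the task classes declared by each plugin module, keyed by class name."""
    tasks: dict[str, type] = {}
    for plugin in plugins:
        # Plugins that do not declare the attribute simply contribute nothing.
        for task_cls in getattr(plugin, TASKS_ATTRIBUTE, []):
            name = task_cls.__name__
            if name in tasks:
                raise ValueError(f"Task {name!r} is declared by more than one plugin")
            tasks[name] = task_cls
    return tasks


if __name__ == "__main__":
    # Stand-in for an imported plugin package, built in memory for the demo.
    class FooTask:
        pass

    fake_plugin = ModuleType("explainaboard_plugin_demo")
    setattr(fake_plugin, TASKS_ATTRIBUTE, [FooTask])
    print(collect_tasks([fake_plugin]))
```

Unlike a global registry, nothing here happens as a side effect of importing a module: the collection is an explicit function call over an explicit list of plugins, and the result is an ordinary local value.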