Closed GabrielKP closed 2 years ago
@leonhardhennig 3 Options:
You're right, and I'm beginning to think Christoph has redone all this in his newish repo (which he and Arne may publish soon). It has a "Task" class that more or less replaces the feature_converter and does all of this, including the handling of special tokens as in issue #42.
I'd lean towards option 1 for now, since I think this is lower priority. I'd rather have the relation classification tested and running so that we can pre-label the Businesswire/PLASS corpus as soon as possible.
I agree. Although not optimal, this is workable, so I will leave it as it is.
The function get_additional_tokens extracts additional tokens depending on the task. It is currently implemented in DatasetReader, but DatasetReader should be task-agnostic. It would make more sense to call get_additional_tokens when a feature_converter is initialized, since each feature_converter is built for exactly one task.
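To make the proposal concrete, here is a rough sketch of what moving get_additional_tokens into the converter's initialization could look like. Everything except the names DatasetReader and get_additional_tokens is an assumption for illustration: the class RelationClassificationConverter, the entity_types argument, and the [HEAD=...] / [TAIL=...] marker format are hypothetical and not taken from the actual code base.

```python
from typing import Dict, List


class DatasetReader:
    """Task-agnostic reader: it only stores and returns raw examples."""

    def __init__(self, examples: List[Dict[str, str]]) -> None:
        self.examples = examples

    def get_examples(self) -> List[Dict[str, str]]:
        # No task-specific logic lives here any more.
        return self.examples


class RelationClassificationConverter:
    """Hypothetical feature converter that serves exactly one task."""

    def __init__(self, entity_types: List[str]) -> None:
        # The task-specific tokens are computed once, at initialization.
        self.additional_tokens = self.get_additional_tokens(entity_types)

    @staticmethod
    def get_additional_tokens(entity_types: List[str]) -> List[str]:
        # Assumed marker format: one head and one tail marker per entity type.
        tokens: List[str] = []
        for entity_type in sorted(set(entity_types)):
            tokens.append(f"[HEAD={entity_type}]")
            tokens.append(f"[TAIL={entity_type}]")
        return tokens


reader = DatasetReader([{"text": "Acme hired Jane Doe.", "label": "employer"}])
converter = RelationClassificationConverter(entity_types=["PER", "ORG"])
```

With this layout the reader never needs to know which task it feeds, and a converter for a different task can define its own get_additional_tokens without touching DatasetReader.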