Single-label and multi-label datasets can be annotated by passing an id2label mapping or a list of labels. However, we should use labels in natural language rather than label IDs, which means transforming all label IDs to their corresponding natural language form. Once the dataset is flattened, a function that does this for you would be very useful. We also need a reverse function that converts these labels back to their IDs. Further, we can expand the relatively short label abbreviations into more expressive descriptions.
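As a rough illustration of the label conversion described above, here is a minimal sketch in plain Python. The function names (`build_id2label`, `ids_to_text`, `text_to_ids`) and the `descriptions` argument for expanding abbreviations are hypothetical and only meant to show the idea, not the actual API.

```python
from typing import Dict, List, Optional, Sequence, Union


def build_id2label(labels: Union[Dict[int, str], Sequence[str]]) -> Dict[int, str]:
    """Accept either an id2label mapping or a plain list of labels."""
    if isinstance(labels, dict):
        return dict(labels)
    return {idx: name for idx, name in enumerate(labels)}


def ids_to_text(
    label_ids: Union[int, Sequence[int]],
    id2label: Dict[int, str],
    descriptions: Optional[Dict[str, str]] = None,
) -> Union[str, List[str]]:
    """Map one ID (single-label) or several IDs (multi-label) to text.

    `descriptions` optionally expands short labels, e.g. "pos" -> "positive".
    """
    descriptions = descriptions or {}

    def to_text(label_id: int) -> str:
        name = id2label[label_id]
        return descriptions.get(name, name)

    if isinstance(label_ids, int):
        return to_text(label_ids)
    return [to_text(label_id) for label_id in label_ids]


def text_to_ids(
    labels: Union[str, Sequence[str]],
    id2label: Dict[int, str],
    descriptions: Optional[Dict[str, str]] = None,
) -> Union[int, List[int]]:
    """Reverse of `ids_to_text`: map natural language labels back to IDs."""
    descriptions = descriptions or {}
    text2id = {descriptions.get(name, name): idx for idx, name in id2label.items()}
    if isinstance(labels, str):
        return text2id[labels]
    return [text2id[label] for label in labels]


id2label = build_id2label(["neg", "pos"])
descriptions = {"neg": "negative", "pos": "positive"}
single = ids_to_text(1, id2label, descriptions)
multi = ids_to_text([0, 1], id2label, descriptions)
restored = text_to_ids(multi, id2label, descriptions)
```

Keeping the expanded descriptions in a separate mapping means the reverse lookup can be built from the same two inputs, so the round trip does not depend on any extra state.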
[x] Function to convert dataset with label IDs to their natural language form.
[x] Function to postprocess a generated dataset, converting natural language labels back to their IDs.
[x] Function to convert flattened NER datasets which are split into tokens and tags into strings and spans.
[x] Function to convert strings and spans back into tokens and tags.
[x] Function to calculate offsets in strings for postprocessing, also relevant for QA.
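
The NER conversions in the checklist above could look roughly like the following sketch. It makes several assumptions that are not stated in the original: tags are IOB2 strings such as `B-PER`, `I-PER` and `O`, tokens contain no spaces and are joined with a single space, and spans are `(start, end, label)` character offsets with an exclusive end. The names `tokens_to_spans` and `spans_to_tokens` are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start_char, end_char, label), end is exclusive


def tokens_to_spans(tokens: List[str], tags: List[str]) -> Tuple[str, List[Span]]:
    """Join tokens into a string and merge IOB2 tags into character spans."""
    spans: List[Span] = []
    current = None  # [start, end, label] of the entity currently being built
    offset = 0
    for token, tag in zip(tokens, tags):
        start, end = offset, offset + len(token)
        if tag.startswith("I-") and current is not None and current[2] == tag[2:]:
            current[1] = end  # extend the open entity
        else:
            if current is not None:
                spans.append((current[0], current[1], current[2]))
                current = None
            if tag != "O":
                current = [start, end, tag[2:]]  # open a new entity
        offset = end + 1  # +1 for the joining space
    if current is not None:
        spans.append((current[0], current[1], current[2]))
    return " ".join(tokens), spans


def spans_to_tokens(text: str, spans: List[Span]) -> Tuple[List[str], List[str]]:
    """Split the string on single spaces and rebuild IOB2 tags from the spans."""
    tokens: List[str] = []
    tags: List[str] = []
    offset = 0
    for token in text.split(" "):
        start, end = offset, offset + len(token)
        tag = "O"
        for span_start, span_end, label in spans:
            if start >= span_start and end <= span_end:
                tag = ("B-" if start == span_start else "I-") + label
                break
        tokens.append(token)
        tags.append(tag)
        offset = end + 1
    return tokens, tags


tokens = ["Ada", "Lovelace", "lived", "in", "London"]
tags = ["B-PER", "I-PER", "O", "O", "B-LOC"]
text, spans = tokens_to_spans(tokens, tags)
round_trip = spans_to_tokens(text, spans)
```

Because both directions compute offsets the same way (token length plus one joining space), the round trip restores the original tokens and tags under the assumptions above. The same offset bookkeeping could be reused for answer positions in QA.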