A function is needed to convert the TargetTextCollection into a CONLL formatted file. This function can have two options:
1. Just targets, where the format will be BIO with no labels.
2. Labels, where the format will be BIO and include labels, e.g. B-POS, I-POS, B-NEG, I-NEG, and O.
Furthermore, for both of these options it would be good to be able to include predictions, and any number of them, e.g. if you ran the same type of model multiple times to take random seeds into account.
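To make the two options concrete, here is a minimal sketch of how the BIO tags could be produced with and without labels. The function name `bio_tags`, the token-index span representation, and the example sentence are all hypothetical and only for illustration; they are not the library's existing API.

```python
from typing import List, Optional, Tuple


def bio_tags(num_tokens: int,
             target_spans: List[Tuple[int, int]],
             labels: Optional[List[str]] = None) -> List[str]:
    """Return one BIO tag per token.

    target_spans holds (start, end) token indices with end exclusive
    (an assumed representation). When labels is given, each B/I tag is
    suffixed with the label of the target it belongs to.
    """
    tags = ['O'] * num_tokens
    for index, (start, end) in enumerate(target_spans):
        suffix = f'-{labels[index]}' if labels is not None else ''
        tags[start] = f'B{suffix}'
        for token_index in range(start + 1, end):
            tags[token_index] = f'I{suffix}'
    return tags


tokens = ['The', 'battery', 'life', 'is', 'great', 'but',
          'the', 'screen', 'is', 'dim']
spans = [(1, 3), (7, 8)]  # "battery life" and "screen"

# Option 1: just targets
targets_only = bio_tags(len(tokens), spans)
# Option 2: targets with labels
with_labels = bio_tags(len(tokens), spans, ['POS', 'NEG'])
```

Here `targets_only` only uses B, I, and O, whereas `with_labels` uses B-POS, I-POS, B-NEG, and O for the same tokens.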
The format of the CONLL file will be the following:
TOKEN#GOLD LABEL#PREDICTION 1#PREDICTION 2
Where the number of predictions can go up to N.
The signature of the function will be the following:
to_conll(self, conll_fp: Path, gold_label_key: str, prediction_keys: Optional[List[str]] = None) -> None
By defining the gold_label_key, this in effect allows the user to choose whether the output is targets, labels, or any other sequence labelling task, as this is determined by the value stored under gold_label_key in each TargetText within the TargetTextCollection.
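A rough sketch of how this could behave is below. It is written as a standalone function rather than a method, and it assumes each TargetText can be treated as a dictionary with a `tokenized_text` key and one list of tags per label or prediction key. Those assumptions, the blank line between texts, and the example keys `gold`, `run_1`, and `run_2` are illustrative only and not a statement of how the library stores its data.

```python
from pathlib import Path
from typing import Dict, List, Optional
import tempfile


def to_conll(collection: List[Dict[str, List[str]]], conll_fp: Path,
             gold_label_key: str,
             prediction_keys: Optional[List[str]] = None) -> None:
    """Write TOKEN#GOLD LABEL#PREDICTION 1#...#PREDICTION N lines."""
    prediction_keys = prediction_keys or []
    with conll_fp.open('w', encoding='utf-8') as conll_file:
        for target_text in collection:
            # 'tokenized_text' is an assumed key name for the tokens
            tokens = target_text['tokenized_text']
            columns = [tokens, target_text[gold_label_key]]
            columns.extend(target_text[key] for key in prediction_keys)
            if any(len(column) != len(tokens) for column in columns):
                raise ValueError('Every label sequence must have one '
                                 'entry per token')
            for row in zip(*columns):
                conll_file.write('#'.join(row) + '\n')
            # Assumed convention: a blank line separates each text
            conll_file.write('\n')


example = [{'tokenized_text': ['Great', 'battery', 'life'],
            'gold': ['O', 'B-POS', 'I-POS'],
            'run_1': ['O', 'B-POS', 'I-POS'],
            'run_2': ['O', 'B-POS', 'O']}]

with tempfile.TemporaryDirectory() as temp_dir:
    conll_fp = Path(temp_dir, 'example.conll')
    to_conll(example, conll_fp, 'gold', ['run_1', 'run_2'])
    written = conll_fp.read_text(encoding='utf-8')
```

One thing this sketch does not handle is a token that itself contains `#`, which would make a line ambiguous to parse, so escaping or a different separator may be worth considering.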