Hi, I am loading the JNLPBA dataset through the Flair library, and I would like to keep only the protein mentions and rename them to "Gene". All labels other than 'protein' should be removed from the dataset, because I want to use it to train my gene NER model. I have gone through the Flair documentation but cannot find a way to do this, and all my attempts have failed. Here is an example of the code I wrote.
`from flair.data import Sentence
def rename_and_remove_labels(sentence: Sentence):
    new_labels = []
    for label in sentence.get_labels():
        if label.value == 'protein':
            # Record a new 'Gene' label for each 'protein' label
            new_labels.append((label.data_point.start_position, label.data_point.end_position, 'Gene'))
    sentence.remove_labels([label.value for label in sentence.get_labels()])
    for start_pos, end_pos, new_label in new_labels:
        span = sentence[start_pos:end_pos]
        span.add_label(new_label)
    return sentence
sentence = Sentence("IL-2 gene expression and NF-kappa B activation through CD28 requires reactive oxygen production by 5-lipoxygenase.")
sentence[0:2].add_label('ner', 'DNA')
sentence[4:6].add_label('ner', 'protein')
sentence[8:9].add_label('ner', 'protein')
sentence[14:15].add_label('ner', 'protein')
print("Before:")
for label in sentence.get_labels():
    print(label)
print(sentence)
sentence = rename_and_remove_labels(sentence)
print("\nAfter:")
for label in sentence.get_labels():
    print(label)
`
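For reference, here is a minimal sketch of the behaviour I am after. It is not a confirmed Flair recipe. The names `select_and_rename` and `relabel_sentence` are hypothetical helpers of my own, and the sketch assumes a Flair version in which `sentence[a:b]` returns a span, `Token.idx` is 1-based, `remove_labels` takes the label type name (for example `'ner'`), and `add_label` takes the label type followed by the value. It also works with token indices, because `start_position` and `end_position`, which the attempt above uses for slicing, appear to be character offsets. The filtering step itself is plain Python and runs without Flair installed.

```python
from typing import Iterable, List, Tuple

# (start token index, end token index exclusive, label value)
SpanTriple = Tuple[int, int, str]


def select_and_rename(spans: Iterable[SpanTriple],
                      keep: str = "protein",
                      new_value: str = "Gene") -> List[SpanTriple]:
    """Keep only spans labelled `keep` and relabel them as `new_value`."""
    return [(start, end, new_value) for start, end, value in spans if value == keep]


def relabel_sentence(sentence, label_type: str = "ner",
                     keep: str = "protein", new_value: str = "Gene"):
    """Hypothetical helper: apply select_and_rename to a Flair Sentence in place."""
    # Assumption: Token.idx is 1-based, so subtract 1 to get a 0-based slice start.
    current = [
        (span.tokens[0].idx - 1, span.tokens[-1].idx, span.get_label(label_type).value)
        for span in sentence.get_spans(label_type)
    ]
    # Assumption: remove_labels expects the label type name, not a list of values.
    sentence.remove_labels(label_type)
    for start, end, value in select_and_rename(current, keep, new_value):
        # Assumption: add_label takes the label type first, then the new value.
        sentence[start:end].add_label(label_type, value)
    return sentence


# Token-level spans matching the example sentence above.
example = [(0, 2, "DNA"), (4, 6, "protein"), (8, 9, "protein"), (14, 15, "protein")]
print(select_and_rename(example))
```

On the example sentence, `relabel_sentence(sentence)` should leave three 'Gene' spans and drop the 'DNA' one. For the whole dataset, the same helper would presumably be applied to every sentence in the train, dev and test splits before training, assuming the sentences are held in memory.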