While writing tests for abbreviation-expansion detection, I noticed a few cases that the system currently does not handle. We could add these to the regression tests once we swap out the spaCy model for a more advanced BERT-based model.
Case 1: Abbreviation not in parentheses
text = "We use a Convolutional Neural Network, known as CNN, based architecture in this model, which is an improvement over state-of-the-art"
gold = [(Abbreviation : 'CNN', Expansion : 'Convolutional Neural Network')]
Case 2: Expansion in parentheses
text = "GANs (Generative Adversarial Networks) outperform most generative models in the novel human face generation task."
gold = [(Abbreviation : 'GANs', Expansion : 'Generative Adversarial Networks')]
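
One way these two cases could be kept around until the model swap is as plain data plus a small checker. This is only a sketch: `UNHANDLED_CASES`, `failing_cases`, and the `detect` callable are hypothetical names, and it assumes the detector can be wrapped as a function that takes a string and returns a list of `(abbreviation, expansion)` tuples, which may not match the project's real interface.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (abbreviation, expansion)

# The two cases from this issue, stored as data (hypothetical layout).
UNHANDLED_CASES = [
    {
        "name": "abbreviation_not_in_parentheses",
        "text": (
            "We use a Convolutional Neural Network, known as CNN, based "
            "architecture in this model, which is an improvement over "
            "state-of-the-art"
        ),
        "gold": [("CNN", "Convolutional Neural Network")],
    },
    {
        "name": "expansion_in_parentheses",
        "text": (
            "GANs (Generative Adversarial Networks) outperform most "
            "generative models in the novel human face generation task."
        ),
        "gold": [("GANs", "Generative Adversarial Networks")],
    },
]


def failing_cases(detect: Callable[[str], List[Pair]]) -> List[str]:
    """Return the names of cases where detect(text) differs from gold.

    Order of the returned pairs is ignored; only the set of pairs matters.
    """
    failed = []
    for case in UNHANDLED_CASES:
        if sorted(detect(case["text"])) != sorted(case["gold"]):
            failed.append(case["name"])
    return failed


if __name__ == "__main__":
    # Placeholder detector that finds nothing; replace with the real one.
    print(failing_cases(lambda text: []))
```

With the placeholder detector both case names are reported as failing. Once the new model is in place, passing the real detector and asserting that `failing_cases(...)` returns an empty list would turn this into a regression test.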