Open kparocki opened 1 year ago
We should use the same model architecture, as far as possible, across the three approaches we are testing, so that we can compare the approaches themselves rather than differences in architecture.
The three model "pipelines" are done; now we have to decide on the architecture. Importing a pre-trained model from HuggingFace and integrating it into our pipeline seems complicated, so it might be better to test other approaches that rely solely on our data.
Ideas to implement:
After evaluating the base predictions, we should choose the underlying architecture for the three model "pipelines" using the same methodology as the base prediction model: predicting only a class or none, without two models in succession and without gating. Once we see which architecture performs best, we use it to test the pipeline approaches.
Right now our only approach is a fairly simple neural net with word embeddings. Performance might improve with a transformer-based architecture, contextualized embeddings, model fine-tuning, etc.
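A minimal sketch of how the architecture selection could look, assuming each candidate architecture is wrapped as a function that maps a text to a class label or `None` (for "no class"). Everything here is hypothetical and for illustration only: the names `select_architecture` and `macro_f1`, the two toy predictors, the toy texts and labels, and the choice of macro-averaged F1 as the metric. None of it comes from our actual code or data.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

Label = Optional[str]              # None stands for "no class"
Predictor = Callable[[str], Label]  # hypothetical wrapper around one architecture


def macro_f1(gold: Sequence[Label], pred: Sequence[Label]) -> float:
    """Macro-averaged F1 over the real classes; None is not scored as a class."""
    classes = sorted(
        {g for g in gold if g is not None} | {p for p in pred if p is not None}
    )
    if not classes:
        return 0.0
    per_class = []
    for c in classes:
        tp = sum(1 for g, p in zip(gold, pred) if g == c and p == c)
        fp = sum(1 for g, p in zip(gold, pred) if g != c and p == c)
        fn = sum(1 for g, p in zip(gold, pred) if g == c and p != c)
        denom = 2 * tp + fp + fn
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


def select_architecture(
    candidates: Dict[str, Predictor],
    texts: Sequence[str],
    gold: Sequence[Label],
) -> Tuple[str, Dict[str, float]]:
    """Score every candidate on the same data and return the best one.

    Each candidate is evaluated as a single model that predicts a class or
    None, i.e. without two models in succession and without gating.
    """
    scores = {
        name: macro_f1(gold, [predict(text) for text in texts])
        for name, predict in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy stand-ins for real architectures (assumptions, not our models).
def keyword_model(text: str) -> Label:
    if "refund" in text:
        return "refund"
    if "cancel" in text:
        return "cancel"
    return None


def always_refund(text: str) -> Label:
    return "refund"


# Toy evaluation data (made up for the example).
toy_texts = ["refund please", "hello", "cancel order", "thanks"]
toy_gold = ["refund", None, "cancel", None]

best_name, all_scores = select_architecture(
    {"keyword_model": keyword_model, "always_refund": always_refund},
    toy_texts,
    toy_gold,
)
```

The point of the design is that every candidate sees the same evaluation data and the same metric, so the winner can then be plugged into all three pipelines without the architecture being a confounding factor. The metric is easy to swap out if we settle on something other than macro F1.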