Test advanced model architectures

kparocki commented 1 year ago

Right now our only approach is a fairly simple neural net with word embeddings. Performance might improve if we try a transformer-based architecture, contextualized embeddings, model fine-tuning, etc.

Malthehave commented 1 year ago

We should keep the model architecture as similar as possible across the three approaches we are testing, so that we can compare them fairly.

kparocki commented 1 year ago

We have the three model "pipelines" done; now we have to decide on the architecture. Importing a pre-trained model from HuggingFace and integrating it into our pipeline seems complicated, so it might be better to test other approaches that are trained solely on our data.

Ideas to implement:

[ ] Vanilla RNNs
[ ] Contextualised Embeddings
[ ] LSTMs / Bi-LSTMs
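
To make the Bi-LSTM idea concrete, here is a minimal sketch of what such a classifier could look like. It assumes PyTorch, which this thread does not confirm we use; the class name `BiLSTMClassifier`, the layer sizes, and the vocabulary size are all hypothetical placeholders. It also ignores variable-length padding (no packed sequences), so treat it as a starting point rather than our actual model.

```python
import torch
import torch.nn as nn


class BiLSTMClassifier(nn.Module):
    """Hypothetical sentence classifier: embeddings -> Bi-LSTM -> linear layer."""

    def __init__(self, vocab_size, num_classes, embed_dim=64, hidden_dim=128):
        super().__init__()
        # Embeddings are trained from scratch on our own data (index 0 = padding).
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(
            embed_dim, hidden_dim, batch_first=True, bidirectional=True
        )
        # Forward and backward final states are concatenated, hence 2 * hidden_dim.
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) tensor of integer token indices
        embedded = self.embedding(token_ids)
        _, (h_n, _) = self.lstm(embedded)
        # h_n: (2, batch, hidden_dim) for a single-layer bidirectional LSTM;
        # h_n[0] is the forward direction, h_n[1] the backward direction.
        sentence = torch.cat([h_n[0], h_n[1]], dim=1)
        return self.classifier(sentence)  # unnormalised class scores (logits)


# Smoke test with random token ids; num_classes would include the "none" class.
model = BiLSTMClassifier(vocab_size=1000, num_classes=4)
logits = model(torch.randint(1, 1000, (8, 20)))
```

Setting `bidirectional=False` and using only `h_n[0]` (with `hidden_dim` inputs to the linear layer) would give the plain LSTM variant, and swapping `nn.LSTM` for `nn.RNN` (which returns a single hidden state instead of a tuple) would give the vanilla RNN.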

After evaluating the base predictions, we should choose the underlying architecture for the three model "pipelines" using the same setup as the base prediction model: a single model that predicts either a class or none, with no second model in succession and no gating. Once we see which architecture performs best, we use it to test the pipeline approaches.
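
The selection step could be sketched roughly as follows. This assumes we compare architectures by macro-averaged F1 over all labels including "none"; the metric has not been agreed on in this thread, so that choice is an assumption, as are the function names and the toy labels (which are placeholders, not project results).

```python
def macro_f1(gold, pred):
    """Macro-averaged F1 over every label seen in gold or pred (incl. "none")."""
    labels = sorted(set(gold) | set(pred))
    per_label = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        per_label.append(2 * tp / denom if denom else 0.0)
    return sum(per_label) / len(per_label)


def best_architecture(gold, predictions_by_arch):
    """Score each architecture's single-model predictions and return the best."""
    scores = {
        name: macro_f1(gold, pred) for name, pred in predictions_by_arch.items()
    }
    return max(scores, key=scores.get), scores


# Toy placeholder labels, only to show the call shape.
gold = ["A", "none", "B", "A"]
predictions_by_arch = {
    "vanilla_rnn": ["A", "none", "A", "A"],
    "bilstm": ["A", "none", "B", "A"],
}
best, scores = best_architecture(gold, predictions_by_arch)
```

Each architecture is scored on the same gold labels with the same one-model setup, so the winner can then be plugged into all three pipelines.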

Hetling / NLP-second-year-project
