We previously ruled out attention: the model reached 99% accuracy without it, and we wanted to minimize model complexity. However, we've since learned that accuracy is a misleading metric for our dataset. If one class dominates, a model can score near-perfect accuracy by mostly predicting that class, while F1 exposes the precision/recall trade-off on the minority class. We should therefore test whether an attention mechanism yields a better F1 score.
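To make the metric gap concrete, here is a minimal sketch, assuming scikit-learn and a binary task; `y_true`, `preds_baseline`, and `preds_attention` are hypothetical placeholders, not our real model outputs:

```python
# Minimal sketch: accuracy vs. F1 on toy imbalanced data (99% negatives).
# A majority-class predictor gets 99% accuracy but F1 = 0, which is
# exactly how accuracy misled us.
from sklearn.metrics import accuracy_score, f1_score

def report(name, y_true, y_pred):
    acc = accuracy_score(y_true, y_pred)
    # F1 on the positive (minority) class; zero_division=0 avoids a
    # warning when a model never predicts the positive class.
    f1 = f1_score(y_true, y_pred, zero_division=0)
    print(f"{name}: accuracy={acc:.3f}, f1={f1:.3f}")

y_true = [0] * 99 + [1]
preds_baseline = [0] * 100        # always predicts the majority class
preds_attention = [0] * 99 + [1]  # hypothetical improved predictions

report("baseline ", y_true, preds_baseline)   # accuracy=0.990, f1=0.000
report("attention", y_true, preds_attention)  # accuracy=1.000, f1=1.000
```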
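As one candidate for the attention variant, here is a minimal sketch of an attention-pooling layer, assuming a PyTorch sequence model; the layer, names, and dimensions are illustrative assumptions, not our actual architecture:

```python
# Hypothetical attention variant to test: pool sequence hidden states
# into one vector via learned attention weights, rather than taking
# the last state or a plain mean.
import torch
import torch.nn as nn

class AttentionPooling(nn.Module):
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.score = nn.Linear(hidden_dim, 1)  # one scalar score per step

    def forward(self, h):  # h: (batch, seq_len, hidden_dim)
        weights = torch.softmax(self.score(h), dim=1)  # (batch, seq_len, 1)
        return (weights * h).sum(dim=1)                # (batch, hidden_dim)

# Example: pool a batch of 8 sequences of length 20 with 64-dim states.
pool = AttentionPooling(hidden_dim=64)
pooled = pool(torch.randn(8, 20, 64))
print(pooled.shape)  # torch.Size([8, 64])
```

If this pooled representation feeds the existing classifier head, the comparison against the baseline stays apples-to-apples: same data, same head, only the pooling changes.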