Closed ayush055 closed 5 months ago
Hi, I was wondering whether there have been any attempts to replace the Bi-LSTM with a Transformer, or to add attention to the network, to potentially improve results.