Closed lalalune closed 1 month ago
Review our teacher forcing strategy.
One idea that might be interesting is to set the teacher forcing ratio to 1 - loss. So we force until our loss is below 0, then start to back off. By the end the model shouldn't care about sequence order.
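
A minimal sketch of the `1 - loss` idea, read literally. The function names `teacher_forcing_ratio` and `use_teacher_forcing` are hypothetical, and clamping to [0, 1] is an assumption so the value can be used as a probability; none of this is taken from the existing training code.

```python
import random


def teacher_forcing_ratio(loss: float) -> float:
    """Literal reading of the proposal: ratio = 1 - loss.

    Clamped to [0, 1] (an assumption) so it can be used as a probability.
    """
    return max(0.0, min(1.0, 1.0 - loss))


def use_teacher_forcing(loss: float, rng: random.Random) -> bool:
    """Decide per step (or per batch) whether to feed the ground-truth token.

    Returns True with probability teacher_forcing_ratio(loss).
    """
    return rng.random() < teacher_forcing_ratio(loss)


if __name__ == "__main__":
    rng = random.Random(0)
    for loss in (2.0, 1.0, 0.5, 0.1):
        ratio = teacher_forcing_ratio(loss)
        print(f"loss={loss:.2f} ratio={ratio:.2f} force={use_teacher_forcing(loss, rng)}")
```

Read this way, the ratio stays at 0 while the loss is at or above 1 and rises toward 1 as the loss approaches 0. If the intent is instead to force while the loss is high and back off as it falls, `min(1.0, loss)` would be the mirror-image schedule; which direction is wanted is worth settling as part of the review.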