Closed: lalalune closed this issue 3 months ago
Instead of trying to predict all of the tokens at once, we should predict some, keep the ones with high confidence, and try again with a modified attention mask.
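A rough sketch of what that loop could look like, to make the proposal concrete. Everything here is hypothetical: `iterative_decode`, `predict_fn`, the `threshold` and `max_rounds` parameters, and the toy predictor are illustrative names, not anything from this repo. It assumes the model can be wrapped as a function that takes the current tokens plus a boolean mask of fixed positions and returns a `(token, confidence)` pair per position.

```python
from typing import Callable, List, Optional, Tuple

Prediction = Tuple[int, float]  # (token id, confidence in [0, 1])


def iterative_decode(
    predict_fn: Callable[[List[Optional[int]], List[bool]], List[Prediction]],
    length: int,
    threshold: float = 0.9,  # assumed cutoff for "high confidence"
    max_rounds: int = 10,    # assumed cap on refinement rounds
) -> List[Optional[int]]:
    """Predict, keep confident tokens, then re-predict the rest."""
    tokens: List[Optional[int]] = [None] * length
    # Attention mask: True means the position is fixed and may be attended to.
    attend = [False] * length

    for _ in range(max_rounds):
        open_positions = [i for i in range(length) if not attend[i]]
        if not open_positions:
            break
        preds = predict_fn(tokens, attend)
        accepted = [i for i in open_positions if preds[i][1] >= threshold]
        if not accepted:
            # Guarantee progress: accept the single most confident position.
            accepted = [max(open_positions, key=lambda i: preds[i][1])]
        for i in accepted:
            tokens[i] = preds[i][0]
            attend[i] = True  # modify the mask for the next round

    if not all(attend):
        # Out of rounds: fill what is left with the current best guess.
        preds = predict_fn(tokens, attend)
        for i in range(length):
            if not attend[i]:
                tokens[i] = preds[i][0]
    return tokens


# Toy stand-in for a model: a position is only predicted confidently
# once its left neighbour has been fixed.
TARGET = [5, 6, 7, 8]


def toy_predict(tokens: List[Optional[int]], attend: List[bool]) -> List[Prediction]:
    preds = []
    for i in range(len(TARGET)):
        confident = i == 0 or attend[i - 1]
        preds.append((TARGET[i], 0.95 if confident else 0.5))
    return preds


result = iterative_decode(toy_predict, len(TARGET))
```

With the toy predictor, one position is fixed per round, so the mask grows left to right. A real model would likely fix several positions per round, and the threshold would need tuning.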