Open whitejeep600 opened 10 months ago
The reason is that `generate` is for inference only; training requires custom sampling logic.
Thank you for the quick reply.
However, what if I would like to reuse the existing sampling algorithms from the `generate` function? Sampling from the model during training is a valid strategy, and it would be convenient to have access to the ready-made code for that purpose.
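For context, the kind of training loop this refers to can be sketched as a manual sampling loop that keeps the autograd graph alive. This is a minimal illustration, not the library's implementation: `TinyLM` is a hypothetical stand-in for a real language model, and the loss is a REINFORCE-style example objective.

```python
import torch

class TinyLM(torch.nn.Module):
    """Toy stand-in for a language model (assumption, not HF code)."""
    def __init__(self, vocab=10, dim=8):
        super().__init__()
        self.emb = torch.nn.Embedding(vocab, dim)
        self.head = torch.nn.Linear(dim, vocab)

    def forward(self, ids):
        # Returns (batch, seq, vocab) logits.
        return self.head(self.emb(ids))

model = TinyLM()
ids = torch.tensor([[1, 2]])  # prompt token ids
log_probs = []

# Sample 3 tokens while keeping the computation graph,
# which is exactly what @torch.no_grad() on generate() prevents.
for _ in range(3):
    logits = model(ids)[:, -1, :]
    probs = torch.softmax(logits, dim=-1)
    next_id = torch.multinomial(probs, 1)           # sampling step
    log_probs.append(torch.log(probs.gather(1, next_id)))
    ids = torch.cat([ids, next_id], dim=-1)

loss = -torch.stack(log_probs).sum()
loss.backward()  # gradients flow back through the sampled sequence
```

If `generate` computed gradients optionally, the loop above could be replaced by a single call while still supporting `loss.backward()`.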
Feature request
The `generate` function is decorated with `@torch.no_grad()` and thus cannot be used for model training. It would be better to make gradient calculation optional rather than impossible, so that the function can be used for tuning. The simplest solution is to remove the decorator altogether; users who want inference-only behavior can wrap the call in `torch.no_grad()` themselves. Are there reasons to disallow such usage?
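The key point is that the decorator cannot be overridden from the call site: wrapping the call in `torch.enable_grad()` has no effect, because the decorator re-disables autograd inside the function. A minimal sketch with a dummy function (standing in for `generate`, which it is not) demonstrates this:

```python
import torch

@torch.no_grad()  # same pattern as transformers' generate()
def generate_like(x):
    # Any computation inside runs with autograd disabled.
    return x * 2

x = torch.ones(1, requires_grad=True)

y = generate_like(x)
# y.requires_grad is False: no graph was built.

with torch.enable_grad():  # caller tries to opt back in
    z = generate_like(x)
# z.requires_grad is still False: the inner decorator wins,
# so gradients are impossible rather than optional.
```

This is why removing the decorator (or gating it on a flag) is the only way to let callers choose.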
Motivation
Allow using generate for tuning
Your contribution
Removing the decorator is a very simple change. I can submit a PR.