Open Myson850 opened 1 month ago
Set batch_size to e.g. 32 (a multiple of 8 and a power of 2 is encouraged) for better utilization. Sorry, but your request does not make much sense. You also have computation graphs in torch.compile that make your proposed feature very unattractive.
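The power-of-two advice above can be sketched as a small helper; the function name is illustrative and not part of any library:

```python
def round_batch_size(requested: int) -> int:
    """Round a requested batch size down to the nearest power of two
    (minimum 8), as suggested for better GPU utilization."""
    size = 8
    while size * 2 <= requested:
        size *= 2
    return size

print(round_batch_size(100))  # 64
print(round_batch_size(32))   # 32
```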
Feature request
Release gpu memory after a certain number of calls
Motivation
After setting the --batch-size of the embedding model to 100, I first called it with a batch size of 80, which succeeded. I then made many calls with a batch size of 10, but the GPU memory usage stayed at its peak and was never released or reduced.
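Something close to the requested behavior can already be approximated in user code with a wrapper that frees the CUDA cache every N calls. This is a minimal sketch assuming a PyTorch backend; the `release_every` decorator, the threshold, and the `embed` placeholder are illustrative, not part of the server:

```python
import functools


def release_every(n_calls):
    """Decorator: after every n_calls invocations, try to release
    cached GPU memory via torch.cuda.empty_cache(). Safely no-ops
    when torch or CUDA is unavailable."""
    def decorator(fn):
        count = 0

        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            nonlocal count
            result = fn(*args, **kwargs)
            count += 1
            if count % n_calls == 0:
                try:
                    import torch
                    if torch.cuda.is_available():
                        torch.cuda.empty_cache()
                except ImportError:
                    pass  # torch not installed; nothing to release
            return result
        return wrapper
    return decorator


@release_every(10)
def embed(batch):
    # placeholder for the real embedding call
    return [len(x) for x in batch]
```

Note that `empty_cache()` only returns cached blocks to the driver; PyTorch's allocator will otherwise keep the peak allocation reserved, which is why repeated small-batch calls do not lower reported GPU memory.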
Your contribution
.