Closed by hello-eternity 2 years ago
Do you mean training or inference? For training, you can reduce the patch size and the sequence length to lower memory consumption. For inference, you can split the sequence into shorter segments and process them one at a time. Setting max_seq_len may help.
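For the inference case, the splitting could look roughly like the sketch below. It is not code from this repository: `segment_ranges` is a hypothetical helper, its `max_seq_len` argument only borrows the name of the option mentioned above, and `overlap` is an assumed extra for keeping some shared context between neighbouring segments. It only computes frame index ranges, so each segment can be loaded and run separately instead of holding the whole video in memory.

```python
from typing import Iterator, Tuple


def segment_ranges(
    num_frames: int, max_seq_len: int, overlap: int = 0
) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) frame index pairs, end exclusive.

    Hypothetical helper for illustration; not part of the project's API.
    Each segment holds at most `max_seq_len` frames, and consecutive
    segments share `overlap` frames.
    """
    if max_seq_len <= 0:
        raise ValueError("max_seq_len must be positive")
    if not 0 <= overlap < max_seq_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_seq_len")

    step = max_seq_len - overlap
    start = 0
    while start < num_frames:
        end = min(start + max_seq_len, num_frames)
        yield start, end
        if end == num_frames:
            break  # the last segment reached the end of the video
        start += step


# A 1-minute clip at 25 fps has 60 * 25 = 1500 frames.
for start, end in segment_ranges(1500, max_seq_len=400):
    print(f"process frames [{start}, {end})")
```

A smaller `max_seq_len` lowers peak memory per segment at the cost of more passes. Whether results near segment boundaries degrade, and how much overlap would help, depends on the model and is not something this sketch can answer.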
Thanks a lot, your answer helps. I meant inference.
When training on video material, I always run out of RAM and GPU memory. Is there a parameter that solves this? With a roughly 1-minute .mp4 (25 fps, 3 MB) on a machine with 12 GB of RAM and 12 GB of VRAM, I run out of resources. Alternatively, how should I preprocess the input video so that the code runs more easily?