[Help] OOM during long-text inference

Wohoholo commented 1 year ago

Is there an existing issue for this?

[X] I have searched the existing issues

Current Behavior

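Running half-precision inference on a single A100 80G GPU, I get an OOM once the text length reaches 25000. Has anyone else run into this? I'm looking for ways to optimize long-text processing.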

Expected Behavior

No response

Steps To Reproduce

model = Model.from_pretrained()
model.generate(input, **kwargs)

Environment

- OS:Red Hat
- Python:python3.8
- Transformers:4.28.0
- PyTorch:1.13.1
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :True

Anything else?

No response

HongyuJiang commented 10 months ago

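Try splitting the input into multiple chunks, summarizing each chunk, then concatenating all the summaries and feeding the result to the model. That roundabout approach might get you there.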
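
A minimal sketch of the chunk-then-summarize idea above, not a definitive implementation. The names `split_into_chunks`, `summarize_long_text`, `summarize_chunk`, and `run_model` are hypothetical: the two callables stand in for whatever model calls you actually use, and chunking here is by characters rather than tokens, which is an assumption made for simplicity.

```python
from typing import Callable, List


def split_into_chunks(text: str, chunk_size: int) -> List[str]:
    """Split text into consecutive pieces of at most chunk_size characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def summarize_long_text(
    text: str,
    summarize_chunk: Callable[[str], str],  # hypothetical: summarizes one chunk
    run_model: Callable[[str], str],        # hypothetical: final model call
    chunk_size: int = 4000,                 # assumed value, tune for your model
) -> str:
    """Summarize each chunk, join the summaries, then run the model once."""
    chunks = split_into_chunks(text, chunk_size)
    summaries = [summarize_chunk(chunk) for chunk in chunks]
    combined = "\n".join(summaries)
    return run_model(combined)
```

Each model call then only sees one chunk or the joined summaries, so peak memory depends on `chunk_size` rather than on the full input length. The trade-off is that details dropped by a chunk summary are not available to the final call.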

Wohoholo commented 10 months ago

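Thanks, I have already tried this MapReduce-like approach. Another workaround is to process one long text at a time, run torch GC afterwards, and wait for the fragmented GPU memory to be released before processing the next text.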
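
A rough sketch of the process-one-then-clean-up workaround described above, under stated assumptions. `release_gpu_memory`, `process_sequentially`, and `run_one` are hypothetical names; `run_one` stands in for your own inference call. The cleanup uses `gc.collect()` and `torch.cuda.empty_cache()`, which return cached blocks that are no longer referenced. They do not free tensors that are still alive, so results should be moved off the GPU or dropped before cleanup.

```python
import gc
from typing import Callable, Iterable, List, TypeVar

try:
    import torch
except ImportError:  # lets the sketch run on machines without PyTorch
    torch = None

T = TypeVar("T")


def release_gpu_memory() -> None:
    """Drop unreachable Python objects, then release cached CUDA blocks."""
    gc.collect()
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()


def process_sequentially(
    texts: Iterable[str],
    run_one: Callable[[str], T],  # hypothetical: one inference call per text
) -> List[T]:
    """Run inference on one text at a time, cleaning up after each one."""
    results = []
    for text in texts:
        results.append(run_one(text))
        release_gpu_memory()
    return results
```

This only helps when the OOM comes from memory left over between requests. If a single 25000-length input does not fit on its own, cleanup between inputs will not change that.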