Closed Kaka23333 closed 1 year ago
This may be useful: https://github.com/PaddlePaddle/Knover/issues/159#issuecomment-1265071448
Thanks for your timely reply!
This solution works for me. The title of this issue is not very clear so I didn't pay attention to it before. 🥲
Hi, thanks for your impressive work.
I'm currently trying to deploy the PLATO-XL service on an RTX 3090. The deployment succeeds, but I can only input at most 3 rounds of dialogue, since the RTX 3090 has only 24 GB of memory. I also tried the arg `--mem_efficient true` and reducing the embedding size to 512, but neither helps much.
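For context, a rough back-of-envelope estimate shows why a single 24 GB card is tight for PLATO-XL. This is a sketch assuming ~11B parameters stored in fp16 and counting weights only (it ignores activations, attention caches, and framework overhead, which is why multi-round dialogue still runs out even when the weights fit):

```python
# Hypothetical weights-only memory estimate for PLATO-XL (~11B params, fp16).
# Real usage is higher: activations and KV caches grow with dialogue length.
PARAMS = 11e9          # approximate parameter count
BYTES_PER_PARAM = 2    # fp16

for n_gpus in (1, 2, 3, 4):
    per_gpu_gb = PARAMS * BYTES_PER_PARAM / n_gpus / 1024**3
    print(f"{n_gpus} GPU(s): ~{per_gpu_gb:.1f} GB of weights per GPU")
```

On this estimate, 2-way model parallelism leaves roughly 14 GB of headroom per 24 GB card, while 3-way leaves more, which is why splitting across more GPUs is attractive here.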
Is there any way to run PLATO-XL on 3 or more GPUs instead of 2? I noticed that when I set the CUDA visible devices to 3 GPUs in the config file (e.g., `interact.conf`), the script splits the checkpoint into 3 parts. However, I then get an error while running `interact.sh`. Is there any way to solve this?
Looking forward to your reply.