Open alucard001 opened 1 year ago
Same issue here.
I could run it on Google Colab Pro+ with High-memory and an A100 GPU, but as you can see, it's pretty slow:
> initializing model parallel with size 1
> initializing ddp with size 1
> initializing pipeline with size 1
Loaded in 401.99 seconds
I believe the meaning of life is
> to be happy. I believe we are all born with the potential to be happy. The meaning of life is to be happy, but the way to get there is not always easy.
The meaning of life is to be happy. It is not always easy to be happy, but it is possible. I believe that
==================================
Simply put, the theory of relativity states that
> 1) time, space, and mass are relative, and 2) the speed of light is constant, regardless of the relative motion of the observer.
Let’s look at the first point first.
Relative Time and Space
The theory of relativity is built on the idea that time and space are relative
==================================
A brief message congratulating the team on the launch:
Hi everyone,
I just
> wanted to say a big congratulations to the team on the launch of the new website.
I think it looks fantastic and I'm sure it'll be a huge success.
Please let me know if you need anything else from me.
Best,
==================================
Translate English to French:
sea otter => loutre de mer
peppermint => menthe poivrée
plush girafe => girafe peluche
cheese =>
> fromage
fish => poisson
giraffe => girafe
elephant => éléphant
cat => chat
giraffe => girafe
elephant => éléphant
cat => chat
giraffe => gira
==================================
Thanks. Would you mind sharing your Colab notebook and file structure? I think I got some of the configuration wrong and would like to know how you set yours up. Thank you.
It should help to use a sharded and quantised version of the model, such as: https://huggingface.co/Trelis/Llama-2-7b-chat-hf-sharded-bf16
There's a notebook there too for inference which includes quantisation.
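For anyone who wants to try this without opening the notebook, here is a minimal sketch of loading that sharded checkpoint with 4-bit quantisation applied at load time. It is not taken from the linked notebook. It assumes `transformers`, `accelerate` and `bitsandbytes` are installed and a CUDA GPU is available; the helper names `load_quantised` and `generate` are made up for illustration.

```python
# Repository linked in the comment above.
MODEL_ID = "Trelis/Llama-2-7b-chat-hf-sharded-bf16"


def load_quantised(model_id: str = MODEL_ID):
    """Load tokenizer and model with 4-bit weights (hypothetical helper)."""
    # Imported lazily so the heavy GPU stack is only needed when this is called.
    import torch
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    # Assumption: NF4 weights with bf16 compute; other settings may suit better.
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # let accelerate place the shards on the GPU
    )
    return tokenizer, model


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Run one completion; reloads the model each call, fine for a smoke test."""
    tokenizer, model = load_quantised()
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("I believe the meaning of life is")` would download the shards on first use and then print-ready text is returned as a string. Sharding mainly lowers peak CPU RAM while loading, and quantisation lowers GPU memory, so the two address different bottlenecks.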
Here is the Gist: https://gist.github.com/alucard001/ed115328a82865961d020d46387cfd47
As you can see, after installing PyTorch and running the example command, it runs for 3:30 and then the child process is stopped.
The GPU version is included in the Gist for reference.
Is it a memory problem? Any other insight is appreciated.
Thank you very much in advance, and for FBR's great work.
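On the memory question, a rough back-of-the-envelope estimate may help narrow things down. The sketch below only counts storage for the weights, and it assumes a nominal 7 billion parameters (taken from the "7B" in the model name, not measured). Activations, the KV cache and framework overhead come on top, so treat the numbers as a lower bound.

```python
def weight_memory_gib(n_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


# Assumption: nominal parameter count of a "7B" model.
N_PARAMS = 7e9

for label, bits in [("fp32", 32), ("fp16/bf16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label:>9}: {weight_memory_gib(N_PARAMS, bits):5.1f} GiB")
```

If the 16-bit figure (about 13 GiB) is close to or above what the GPU or the host RAM offers, a killed child process during loading would be consistent with running out of memory, although the Gist alone does not confirm that.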