kennethleungty / Llama-2-Open-Source-LLM-CPU-Inference

Running Llama 2 and other Open-Source LLMs on CPU Inference Locally for Document Q&A
https://towardsdatascience.com/running-llama-2-on-cpu-inference-for-document-q-a-3d636037a3d8
MIT License

can we use a gpu for increased speed and use of a bigger better llama2 model #7

Open stevedipaola opened 1 year ago

stevedipaola commented 1 year ago

Are there instructions for that? Most of us AI folks have good GPUs; it seems silly not to use them.

alior101 commented 1 year ago

Two changes needed. First, pass `gpu_layers` in the CTransformers config:

```python
def build_llm():
    # Local CTransformers model
    llm = CTransformers(model=cfg.MODEL_BIN_PATH,
                        model_type=cfg.MODEL_TYPE,
                        config={'max_new_tokens': cfg.MAX_NEW_TOKENS,
                                'temperature': cfg.TEMPERATURE,
                                'gpu_layers': 24})  # offload 24 layers to the GPU
    return llm
```

Second, uninstall `ctransformers` and reinstall it with cuBLAS enabled: `CT_CUBLAS=1 pip install ctransformers --no-binary ctransformers`
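Spelled out as a sketch of the reinstall sequence (the `-y` flag on the uninstall is just a convenience, not from the original comment):

```shell
# Remove the prebuilt CPU-only wheel
pip uninstall -y ctransformers
# Rebuild from source with cuBLAS (CUDA) support enabled
CT_CUBLAS=1 pip install ctransformers --no-binary ctransformers
```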

gabacode commented 1 year ago

> Two changes needed: add `'gpu_layers': 24` to the CTransformers config in `build_llm()`, then uninstall ctransformers and reinstall with `CT_CUBLAS=1 pip install ctransformers --no-binary ctransformers`.

Thanks @alior101, it worked! How would it integrate with poetry and pyproject.toml?
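One possible route (an assumption, not tested with this repo): since `CT_CUBLAS=1` is a build-time environment variable, it may be simplest to install the package with pip inside the Poetry-managed virtualenv rather than declaring it in `pyproject.toml`:

```shell
# Build ctransformers with cuBLAS inside the Poetry virtualenv
CT_CUBLAS=1 poetry run pip install ctransformers --no-binary ctransformers
```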

alexng88 commented 9 months ago

Hello, I made the changes to llm.py:

```python
llm = CTransformers(model=cfg.MODEL_BIN_PATH,
                    model_type=cfg.MODEL_TYPE,
                    config={'max_new_tokens': cfg.MAX_NEW_TOKENS,
                            'temperature': cfg.TEMPERATURE,
                            'gpu_layers': 24})
```

and reinstalled ctransformers (0.2.27),

but it still runs very slowly and never seems to use the GPU.
I already verified the GPU is ready with `python -c "import torch; print(torch.cuda.is_available())"`, which printed `True`.

Then I tried `python main.py "hello"` and it took 300+ seconds to answer. Please advise.
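A note on the check above: `torch.cuda.is_available()` only confirms that PyTorch can see the GPU; it says nothing about whether the installed `ctransformers` wheel was built with cuBLAS. One way to tell is to watch `nvidia-smi` while timing a generation. A minimal timing helper (hypothetical, not part of this repo) might look like:

```python
import time

def timed_generate(llm, prompt: str):
    """Run one generation and report wall-clock time.

    `llm` is any callable mapping a prompt string to generated text,
    e.g. a CTransformers / LangChain LLM instance.
    """
    start = time.perf_counter()
    output = llm(prompt)
    elapsed = time.perf_counter() - start
    print(f"Generated in {elapsed:.1f}s")
    return output, elapsed
```

If `nvidia-smi` shows no memory allocated to the Python process while this runs, the wheel was likely built CPU-only and needs the `CT_CUBLAS=1` reinstall described above.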

VIGHNESH1521 commented 5 months ago

Hi @alexng88 ,

I am also facing the same issue; did you find a solution?