marella / ctransformers

Python bindings for the Transformer models implemented in C/C++ using GGML library.
MIT License

jupyter notebook crashed when importing llama2 models #59

Open littlehifive opened 11 months ago

littlehifive commented 11 months ago

Hi, I tried the following code, but my kernel crashed and restarted. Let me know how I should fix this, thanks!

from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained("TheBloke/Llama-2-13B-chat-GGML", model_type="llama")

print(llm("AI is going to"))
marella commented 11 months ago

Hi, it will be hard to debug the error in a notebook. Can you try running the same code in a normal Python script and share the output?
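A hard kernel crash like this usually means the native GGML code aborted rather than raising a Python exception. When running as a plain script, the standard-library `faulthandler` module can dump a traceback on fatal signals, which makes the output worth sharing. A minimal sketch (the model name is the one from the issue; the `try`/`except` guard is only so the script degrades gracefully when `ctransformers` is not installed):

```python
import faulthandler

# Dump a Python traceback to stderr if the process receives a fatal
# native signal (SIGSEGV, SIGABRT, ...), e.g. from inside GGML code.
faulthandler.enable()

# The rest mirrors the issue's snippet; running it requires the
# ctransformers package and will download the GGML model weights.
try:
    from ctransformers import AutoModelForCausalLM
except ImportError:
    AutoModelForCausalLM = None  # package not installed in this environment

if AutoModelForCausalLM is not None:
    llm = AutoModelForCausalLM.from_pretrained(
        "TheBloke/Llama-2-13B-chat-GGML", model_type="llama"
    )
    print(llm("AI is going to"))
```

Run with `python script.py 2> crash.log` so any native-crash traceback is captured.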

narfian commented 10 months ago

I have the same problem. I tested it on GCP.



Without `gpu_layers` (CPU only), it works well.
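The difference between the working and crashing configurations is the `gpu_layers` argument to `from_pretrained` (a real ctransformers parameter; omitting it, or passing 0, keeps inference on the CPU). A hedged sketch with a small hypothetical helper that builds the keyword arguments, so the CPU fallback is explicit:

```python
def load_kwargs(gpu_layers: int = 0) -> dict:
    """Build keyword arguments for AutoModelForCausalLM.from_pretrained.

    gpu_layers=0 keeps all layers on the CPU (the configuration
    reported as working above); a positive value offloads that many
    layers to the GPU, which is what crashed on the Tesla L4 here.
    """
    kwargs = {"model_type": "llama"}
    if gpu_layers > 0:
        kwargs["gpu_layers"] = gpu_layers
    return kwargs

# CPU-only configuration (works per the report above):
print(load_kwargs())                # {'model_type': 'llama'}
# GPU offload (crashed on the L4 in this report):
print(load_kwargs(gpu_layers=50))   # {'model_type': 'llama', 'gpu_layers': 50}
```

Passing the result via `AutoModelForCausalLM.from_pretrained(model, **load_kwargs())` makes it easy to flip between the two setups while debugging.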

narfian commented 10 months ago

After some experiments, I confirmed that it does not work on a Tesla L4 machine, but inference is possible on a Tesla T4 machine in the same environment. It seems to be a hardware-support issue on the L4 side.