skyxiaobai opened 1 week ago
Same issue with Llama-3.1-70B-Instruct
Hey @skyxiaobai & @vtemplier, I'm sorry about this issue. Can you please pull from main and run it again? It should now return an error message indicating what the issue is.
Hi, I tested the Gemma 2B/9B/27B models: only the 2B model works fine.
logs:
(app) ubuntu@ubuntu:/mnt/sda/app/hf-llm.rs$ cargo run --release -- -m "google/gemma-2-27b-it" -c
Finished release [optimized] target(s) in 0.04s
Running target/release/hf-llm -m google/gemma-2-27b-it -c
Starting chat mode. Type 'exit' to end the conversation.
You: hi
Error: "HTTP Error 403 Forbidden: {\"error\":\"The model google/gemma-2-27b-it is too large to be loaded automatically (54GB > 10GB). Please use Spaces (https://huggingface.co/spaces) or Inference Endpoints (https://huggingface.co/inference-endpoints).\"}"
(app) ubuntu@ubuntu:/mnt/sda/app/hf-llm.rs$ cargo run --release -- -m "google/gemma-2-9b-it" -c
Finished release [optimized] target(s) in 0.03s
Running target/release/hf-llm -m google/gemma-2-9b-it -c
Starting chat mode. Type 'exit' to end the conversation.
You: hi
Error: "HTTP Error 403 Forbidden: {\"error\":\"The model google/gemma-2-9b-it is too large to be loaded automatically (18GB > 10GB). Please use Spaces (https://huggingface.co/spaces) or Inference Endpoints (https://huggingface.co/inference-endpoints).\"}"
(app) ubuntu@ubuntu:/mnt/sda/app/hf-llm.rs$ cargo run --release -- -m "google/gemma-2-2b-it" -c
Finished release [optimized] target(s) in 0.03s
Running target/release/hf-llm -m google/gemma-2-2b-it -c
Starting chat mode. Type 'exit' to end the conversation.
You: hi
Hi! 👋
How can I help you today? 😄
You: who are you
I am Gemma, a large language model created by the Gemma team at Google DeepMind.
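The 403 bodies in the log above are JSON, so a client can surface the server's message instead of the raw wrapper. A minimal sketch of that unwrapping (in Python for illustration; it assumes the body looks like the `{"error": "..."}` payloads quoted above):

```python
import json

def extract_api_error(body: str) -> str:
    """Pull the human-readable message out of an Inference API error body.

    Falls back to the raw body if it isn't the expected {"error": "..."} JSON.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return body
    if isinstance(payload, dict):
        return payload.get("error", body)
    return body

# Example with a 403 body like the one quoted in the log above:
body = '{"error":"The model google/gemma-2-27b-it is too large to be loaded automatically (54GB > 10GB)."}'
print(extract_api_error(body))  # prints the server's message without the JSON wrapping
```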
Hi @Vaibhavs10
First of all, many thanks for your script. This is just great!
I did install your script on Windows 11.
Here is what is working and what is not (my main problem is with the meta-llama/Meta-Llama-3.1-8B-Instruct and meta-llama/Meta-Llama-3.1-70B-Instruct models).
google/gemma-2-2b-it: it works well!
C:\Users\Pierre\hf-llm.rs>cargo run --release -- -m "google/gemma-2-2b-it" -c
Finished `release` profile [optimized] target(s) in 0.26s
Running `target\release\hf-llm.exe -m google/gemma-2-2b-it -c`
Starting chat mode. Type 'exit' to end the conversation.
You: Hi
Hi! 👋 How can I help you today? 😄
You: Who are you?
I am Gemma, an created by the Gemma team. I'm here to help with various tasks like answering your questions, providing information, and even starting a fun conversation. 😊
What can I do for you today?
google/gemma-2-27b-it and google/gemma-2-9b-it: they are too large to be loaded automatically (xxGB > 10GB). How can I download these 2 models in the Windows Terminal?
C:\Users\Pierre\hf-llm.rs>cargo run --release -- -m "google/gemma-2-27b-it" -c
Finished `release` profile [optimized] target(s) in 0.32s
Running `target\release\hf-llm.exe -m google/gemma-2-27b-it -c`
Starting chat mode. Type 'exit' to end the conversation.
You: Hi
Error: "HTTP Error 403 Forbidden: {\"error\":\"The model google/gemma-2-27b-it is too large to be loaded automatically (54GB > 10GB). Please use Spaces (https://huggingface.co/spaces) or Inference Endpoints (https://huggingface.co/inference-endpoints).\"}"
error: process didn't exit successfully: `target\release\hf-llm.exe -m google/gemma-2-27b-it -c` (exit code: 1)
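On the download question above: as far as I can tell the script queries the hosted Inference API, so downloading the weights locally would not lift the 10 GB serverless limit shown in the 403. If you do want the files themselves, the Hub serves them from its `resolve` endpoint; a minimal sketch that just builds such a URL (the path layout follows how the Hub serves repo files, and `config.json` is only an example filename):

```python
def hub_file_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the direct download URL for one file in a Hugging Face model repo."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"

print(hub_file_url("google/gemma-2-9b-it", "config.json"))
```

You can pass such a URL to curl or wget, together with an `Authorization: Bearer <token>` header for gated repos.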
meta-llama/Meta-Llama-3.1-8B-Instruct and meta-llama/Meta-Llama-3.1-70B-Instruct: HTTP Error 400 Bad Request. I ran huggingface-cli login beforehand and entered an existing HF Access Token, but the script still requests this token (see the error message). Can you help solve this issue? Thank you.
C:\Users\Pierre\hf-llm.rs>cargo run --release -- -m "meta-llama/Meta-Llama-3.1-8B-Instruct" -c
Finished `release` profile [optimized] target(s) in 0.28s
Running `target\release\hf-llm.exe -m meta-llama/Meta-Llama-3.1-8B-Instruct -c`
Starting chat mode. Type 'exit' to end the conversation.
You: Hi
Error: "HTTP Error 400 Bad Request: {\"error\":\"Model requires a Pro subscription; check out hf.co/pricing to learn more. Make sure to include your HF token in your query.\"}"
error: process didn't exit successfully: `target\release\hf-llm.exe -m meta-llama/Meta-Llama-3.1-8B-Instruct -c` (exit code: 1)
Hey @piegu @skyxiaobai - this should work now.
Feel free to close if it works now! 🤗
Hi @Vaibhavs10.
Thank you, but it does not work with meta-llama/Meta-Llama-3.1-70B-Instruct.
Error message: "Model requires a Pro subscription" (?).
See below the commands I ran in a Windows Terminal (cmd).
Thank you in advance for your help.
Microsoft Windows [version 10.0.22631.4037]
(c) Microsoft Corporation. All rights reserved.
C:\Users\Pierre>cd hf-llm.rs
C:\Users\Pierre\hf-llm.rs> git pull
remote: Enumerating objects: 8, done.
remote: Counting objects: 100% (8/8), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 5 (delta 2), reused 4 (delta 2), pack-reused 0 (from 0)
Unpacking objects: 100% (5/5), 1.49 KiB | 15.00 KiB/s, done.
From https://github.com/vaibhavs10/hf-llm.rs
025394b..e4925ce main -> origin/main
Updating 6f8b379..e4925ce
Fast-forward
src/main.rs | 23 +++++++++++++++++++++--
1 file changed, 21 insertions(+), 2 deletions(-)
C:\Users\Pierre\hf-llm.rs>huggingface-cli login
_| _| _| _| _|_|_| _|_|_| _|_|_| _| _| _|_|_| _|_|_|_| _|_| _|_|_| _|_|_|_|
_| _| _| _| _| _| _| _|_| _| _| _| _| _| _| _|
_|_|_|_| _| _| _| _|_| _| _|_| _| _| _| _| _| _|_| _|_|_| _|_|_|_| _| _|_|_|
_| _| _| _| _| _| _| _| _| _| _|_| _| _| _| _| _| _| _|
_| _| _|_| _|_|_| _|_|_| _|_|_| _| _| _|_|_| _| _| _| _|_|_| _|_|_|_|
A token is already saved on your machine. Run `huggingface-cli whoami` to get more information or `huggingface-cli logout` if you want to log out.
Setting a new token will erase the existing one.
To login, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Token can be pasted using 'Right-Click'.
Enter your token (input will not be visible):
Add token as git credential? (Y/n) Y
Token is valid (permission: write).
Your token has been saved in your configured git credential helpers (manager).
Your token has been saved to C:\Users\Pierre\.cache\huggingface\token
Login successful
C:\Users\Pierre\hf-llm.rs>cargo run --release -- -m "meta-llama/Meta-Llama-3.1-70B-Instruct" -p "How to make a dangerously spicy ramen?"
Finished `release` profile [optimized] target(s) in 0.22s
Running `target\release\hf-llm.exe -m meta-llama/Meta-Llama-3.1-70B-Instruct -p "How to make a dangerously spicy ramen?"`
Error: "HTTP Error 400 Bad Request: {\"error\":\"Model requires a Pro subscription; check out hf.co/pricing to learn more. Make sure to include your HF token in your query.\"}"
error: process didn't exit successfully: `target\release\hf-llm.exe -m meta-llama/Meta-Llama-3.1-70B-Instruct -p "How to make a dangerously spicy ramen?"` (exit code: 1)
C:\Users\Pierre\hf-llm.rs>
Yes! The error message is correct - it is a Pro-only model, so you'd require a Pro subscription for it: https://huggingface.co/pricing#pro
Hi @Vaibhavs10
Yes! The error message is correct - it is a Pro-only model, so you'd require a Pro subscription for it: https://huggingface.co/pricing#pro
Oops! You're right: it is written at the top of your README.MD... but that is a real disappointment (bye, bye Llama 3.1 xxb). :-(
With google/gemma-2-9b-it and google/gemma-2-27b-it, I get no error message, but the model returns nothing (see log below). How can I solve this issue?
Microsoft Windows [version 10.0.22631.4037]
(c) Microsoft Corporation. All rights reserved.
C:\Users\Pierre>cd hf-llm.rs
C:\Users\Pierre\hf-llm.rs>huggingface-cli login
A token is already saved on your machine. Run `huggingface-cli whoami` to get more information or `huggingface-cli logout` if you want to log out.
Setting a new token will erase the existing one.
To login, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Token can be pasted using 'Right-Click'.
Enter your token (input will not be visible):
Add token as git credential? (Y/n) Y
Token is valid (permission: write).
Your token has been saved in your configured git credential helpers (manager).
Your token has been saved to C:\Users\Pierre\.cache\huggingface\token
Login successful
C:\Users\Pierre\hf-llm.rs>cargo run --release -- -m "google/gemma-2-9b-it" -c
Finished `release` profile [optimized] target(s) in 5.52s
Running `target\release\hf-llm.exe -m google/gemma-2-9b-it -c`
Starting chat mode. Type 'exit' to end the conversation.
You: bonjour
You: hi
You: nothing?
You: exit
C:\Users\Pierre\hf-llm.rs>
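One plausible cause of a silent chat like the one above is response-stream parsing that drops every event, so the prompt is sent but nothing is ever printed. A minimal sketch of the kind of parsing such a client does (assuming the endpoint streams OpenAI-style server-sent events; the payload shape here is illustrative, not taken from hf-llm.rs):

```python
import json

def extract_stream_text(raw_sse: str) -> str:
    """Collect the assistant text from an OpenAI-style SSE chat stream.

    Each event line looks like: data: {"choices":[{"delta":{"content":"..."}}]}
    and the stream ends with:   data: [DONE]
    """
    parts = []
    for line in raw_sse.splitlines():
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip keep-alives and blank separator lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        parts.append(delta.get("content", ""))
    return "".join(parts)

sample = (
    'data: {"choices":[{"delta":{"content":"Hi"}}]}\n'
    'data: {"choices":[{"delta":{"content":"!"}}]}\n'
    "data: [DONE]\n"
)
print(extract_stream_text(sample))  # → Hi!
```

If the server's event format differs from what the parser expects (or the connection is buffered oddly on Windows), every line falls through the `continue` branch and the chat appears empty, which matches the symptom above.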
Can you share a list of other free LLM generative AI models, other than gemma 2 models, that we can use with your code?
Thank you!
Hugging Face Hub login successful.
Used the Gemma 2 27B LLM for testing:
cargo run --release -- -m "google/gemma-2-27b-it" -c
Finished release [optimized] target(s) in 0.03s
Running target/release/hf-llm -m google/gemma-2-27b-it -c
Starting chat mode. Type 'exit' to end the conversation.
You: hi
You: hi
You: who are you