Vaibhavs10 / hf-llm.rs

MIT License

Can't get any LLM feedback? #3

Open skyxiaobai opened 1 week ago

skyxiaobai commented 1 week ago

Hugging Face Hub login successful

Used the gemma-2-27b model for testing:

cargo run --release -- -m "google/gemma-2-27b-it" -c
    Finished release [optimized] target(s) in 0.03s
     Running target/release/hf-llm -m google/gemma-2-27b-it -c
Starting chat mode. Type 'exit' to end the conversation.
You: hi

You: hi

You: who are you

vtemplier commented 1 week ago

Same issue with Llama-3.1-70B-Instruct

Vaibhavs10 commented 1 week ago

Hey @skyxiaobai & @vtemplier, I'm sorry about this issue. Can you pull from main again and re-run? It should now return an error message indicating what the issue is.

skyxiaobai commented 1 week ago

Hi, I tested the gemma-2 2b/9b/27b models: the 2b model works fine.

logs:

(app) ubuntu@ubuntu:/mnt/sda/app/hf-llm.rs$ cargo run --release -- -m "google/gemma-2-27b-it" -c
    Finished release [optimized] target(s) in 0.04s
     Running target/release/hf-llm -m google/gemma-2-27b-it -c
Starting chat mode. Type 'exit' to end the conversation.
You: hi
Error: "HTTP Error 403 Forbidden: {\"error\":\"The model google/gemma-2-27b-it is too large to be loaded automatically (54GB > 10GB). Please use Spaces (https://huggingface.co/spaces) or Inference Endpoints (https://huggingface.co/inference-endpoints).\"}"
(app) ubuntu@ubuntu:/mnt/sda/app/hf-llm.rs$ cargo run --release -- -m "google/gemma-2-9b-it" -c
    Finished release [optimized] target(s) in 0.03s
     Running target/release/hf-llm -m google/gemma-2-9b-it -c
Starting chat mode. Type 'exit' to end the conversation.
You: hi
Error: "HTTP Error 403 Forbidden: {\"error\":\"The model google/gemma-2-9b-it is too large to be loaded automatically (18GB > 10GB). Please use Spaces (https://huggingface.co/spaces) or Inference Endpoints (https://huggingface.co/inference-endpoints).\"}"
(app) ubuntu@ubuntu:/mnt/sda/app/hf-llm.rs$ cargo run --release -- -m "google/gemma-2-2b-it" -c
    Finished release [optimized] target(s) in 0.03s
     Running target/release/hf-llm -m google/gemma-2-2b-it -c
Starting chat mode. Type 'exit' to end the conversation.
You: hi
Hi! 👋

How can I help you today? 😄

You: who are you
I am Gemma, a large language model created by the Gemma team at Google DeepMind.

piegu commented 1 week ago

Hi @Vaibhavs10

First of all, many thanks for your script. This is just great!

I installed your script on Windows 11.

Here is what is working and what is not working (my main problem is with the meta-llama/Meta-Llama-3.1-8B-Instruct and meta-llama/Meta-Llama-3.1-70B-Instruct models).

  1. With google/gemma-2-2b-it: it works well!
C:\Users\Pierre\hf-llm.rs>cargo run --release -- -m "google/gemma-2-2b-it" -c
    Finished `release` profile [optimized] target(s) in 0.26s
    Running `target\release\hf-llm.exe -m google/gemma-2-2b-it -c`
Starting chat mode. Type 'exit' to end the conversation.
You: Hi
Hi! 👋  How can I help you today? 😄

You: Who are you?
I am Gemma, an created by the Gemma team.  I'm here to help with various tasks like answering your questions, providing information, and even starting a fun conversation. 😊

What can I do for you today?
  2. With google/gemma-2-27b-it and google/gemma-2-9b-it: they are too large to be loaded automatically (xxGB > 10GB)

How can I download these two models in the Windows Terminal?

C:\Users\Pierre\hf-llm.rs>cargo run --release -- -m "google/gemma-2-27b-it" -c
    Finished `release` profile [optimized] target(s) in 0.32s
     Running `target\release\hf-llm.exe -m google/gemma-2-27b-it -c`
Starting chat mode. Type 'exit' to end the conversation.
You: Hi
Error: "HTTP Error 403 Forbidden: {\"error\":\"The model google/gemma-2-27b-it is too large to be loaded automatically (54GB > 10GB). Please use Spaces (https://huggingface.co/spaces) or Inference Endpoints (https://huggingface.co/inference-endpoints).\"}"
error: process didn't exit successfully: `target\release\hf-llm.exe -m google/gemma-2-27b-it -c` (exit code: 1)
  3. With meta-llama/Meta-Llama-3.1-8B-Instruct and meta-llama/Meta-Llama-3.1-70B-Instruct: HTTP Error 400 Bad Request:

I ran huggingface-cli login beforehand and entered an existing HF access token, but the script still requests this token (see the error message). Can you help solve this issue? Thank you.

C:\Users\Pierre\hf-llm.rs>cargo run --release -- -m "meta-llama/Meta-Llama-3.1-8B-Instruct" -c
    Finished `release` profile [optimized] target(s) in 0.28s
     Running `target\release\hf-llm.exe -m meta-llama/Meta-Llama-3.1-8B-Instruct -c`
Starting chat mode. Type 'exit' to end the conversation.
You: Hi
Error: "HTTP Error 400 Bad Request: {\"error\":\"Model requires a Pro subscription; check out hf.co/pricing to learn more. Make sure to include your HF token in your query.\"}"
error: process didn't exit successfully: `target\release\hf-llm.exe -m meta-llama/Meta-Llama-3.1-8B-Instruct -c` (exit code: 1)
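For reference, a minimal sketch of how the saved token is typically attached to a request (assuming hf-llm.rs calls the Hugging Face serverless Inference API; the helper name is hypothetical, and the token path is the one shown in the login output below):

```python
from pathlib import Path

# Hedged sketch: the Hub expects the token in an "Authorization: Bearer ..."
# header on every request. If the 400 persists after logging in, checking
# that this header is actually built from the saved token is a first step.
def build_headers(token: str) -> dict[str, str]:
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }

# huggingface-cli login stores the token here (per the login log in this thread):
token_path = Path.home() / ".cache" / "huggingface" / "token"
if token_path.exists():
    headers = build_headers(token_path.read_text().strip())
    print("Authorization header present:", "Authorization" in headers)
```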
Vaibhavs10 commented 6 days ago

Hey @piegu @skyxiaobai - this should work now.

Feel free to close if it works now! 🤗

piegu commented 6 days ago

Hi @Vaibhavs10.

Thank you, but it does not work with meta-llama/Meta-Llama-3.1-70B-Instruct. Error message: "Model requires a Pro subscription" (?).

See below the commands I ran in a Windows Terminal (via cmd).

Thank you in advance for your help.

Microsoft Windows [Version 10.0.22631.4037]
(c) Microsoft Corporation. All rights reserved.

C:\Users\Pierre>cd hf-llm.rs

C:\Users\Pierre\hf-llm.rs> git pull
remote: Enumerating objects: 8, done.
remote: Counting objects: 100% (8/8), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 5 (delta 2), reused 4 (delta 2), pack-reused 0 (from 0)
Unpacking objects: 100% (5/5), 1.49 KiB | 15.00 KiB/s, done.
From https://github.com/vaibhavs10/hf-llm.rs
   025394b..e4925ce  main       -> origin/main
Updating 6f8b379..e4925ce
Fast-forward
 src/main.rs | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

C:\Users\Pierre\hf-llm.rs>huggingface-cli login

    _|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
    _|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
    _|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
    _|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
    _|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|

    A token is already saved on your machine. Run `huggingface-cli whoami` to get more information or `huggingface-cli logout` if you want to log out.
    Setting a new token will erase the existing one.
    To login, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Token can be pasted using 'Right-Click'.
Enter your token (input will not be visible):
Add token as git credential? (Y/n) Y
Token is valid (permission: write).
Your token has been saved in your configured git credential helpers (manager).
Your token has been saved to C:\Users\Pierre\.cache\huggingface\token
Login successful

C:\Users\Pierre\hf-llm.rs>cargo run --release -- -m "meta-llama/Meta-Llama-3.1-70B-Instruct" -p "How to make a dangerously spicy ramen?"
    Finished `release` profile [optimized] target(s) in 0.22s
     Running `target\release\hf-llm.exe -m meta-llama/Meta-Llama-3.1-70B-Instruct -p "How to make a dangerously spicy ramen?"`
Error: "HTTP Error 400 Bad Request: {\"error\":\"Model requires a Pro subscription; check out hf.co/pricing to learn more. Make sure to include your HF token in your query.\"}"
error: process didn't exit successfully: `target\release\hf-llm.exe -m meta-llama/Meta-Llama-3.1-70B-Instruct -p "How to make a dangerously spicy ramen?"` (exit code: 1)

C:\Users\Pierre\hf-llm.rs>
Vaibhavs10 commented 6 days ago

Yes! The error message is correct - it is a Pro-only model, so you'd need a Pro subscription for it: https://huggingface.co/pricing#pro

piegu commented 6 days ago

Hi @Vaibhavs10

> Yes! the error message is correct - it is a Pro only model, so you'd require a Pro subscription for it: https://huggingface.co/pricing#pro

About Llama 3.1 models

Oops! You're right: it is written at the top of your README.md... but that is a real disappointment (bye bye, Llama 3.1 xxb). :-(

About gemma 2 models

With google/gemma-2-9b-it and google/gemma-2-27b-it, I get no error message, but the model returns nothing (see the log below). How can I solve this issue?

Microsoft Windows [Version 10.0.22631.4037]
(c) Microsoft Corporation. All rights reserved.

C:\Users\Pierre>cd hf-llm.rs

C:\Users\Pierre\hf-llm.rs>huggingface-cli login

Login successful

C:\Users\Pierre\hf-llm.rs>cargo run --release -- -m "google/gemma-2-9b-it" -c
    Finished `release` profile [optimized] target(s) in 5.52s
     Running `target\release\hf-llm.exe -m google/gemma-2-9b-it -c`
Starting chat mode. Type 'exit' to end the conversation.
You: bonjour

You: hi

You: nothing?

You: exit

C:\Users\Pierre\hf-llm.rs>

Last but not least

Can you share a list of other free generative LLMs, besides the Gemma 2 models, that we can use with your code?
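One way to find candidates is the public Hub API, sketched below (the query parameters are real Hub API options, but whether a listed model actually works with hf-llm.rs - served, small enough for the 10GB limit, non-gated - still has to be tried case by case):

```python
from urllib.parse import urlencode

# Hedged sketch: build a query against the public Hub API that lists
# text-generation models by download count. No token is needed for this
# endpoint; fetch the printed URL with curl or a browser.
params = {"pipeline_tag": "text-generation", "sort": "downloads", "limit": 10}
url = "https://huggingface.co/api/models?" + urlencode(params)
print(url)
```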

Thank you!