huggingface / candle

Minimalist ML framework for Rust

How to run the quantized Solar model? #1624

Open 555cider opened 7 months ago

555cider commented 7 months ago

I am trying to run the Solar model, but I am constantly failing. Here are my attempts:

  1. [quantized] example (modified) with the Quantized Solar model (local) : Failed. It only outputs nonsense that is unrelated to the question.
  2. [llama] example with the Quantized Solar model (local) : Failed. The process was Killed. Possible causes: ① the model is quantized, or ② my PC is low-spec (16 GB of RAM, etc.).
  3. [llama] example with the Solar model : Failed. The process was Killed. The most likely cause is my low-spec PC.
  4. oobabooga with the Quantized Solar model (local) : Success. Confirmed that my PC can run the Quantized Solar model.
  5. oobabooga with the Solar model : Failed. The process was Killed. Confirmed that my PC cannot run the Solar model.

Conclusion: Is there any way to run the Quantized Solar model with candle? I only listed 5 attempts, but I actually tried several different variations of the code in step 1. I also downloaded the model several times over my slow internet connection.

ealmloff commented 7 months ago

> [quantized] example (modified) with the Quantized Solar model (local): Failed. It only outputs nonsense that is unrelated to the question.

If you are getting incoherent output, you might have the wrong tokenizer set. I have used the solar models with the quantized implementation of llama in candle with these settings:

  - model id: TheBloke/SOLAR-10.7B-v1.0-GGUF
  - revision: main
  - gguf file within the repo: solar-10.7b-v1.0.Q4_K_M.gguf
  - tokenizer repo: upstage/SOLAR-10.7B-v1.0
  - tokenizer file: tokenizer.json
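If you would rather download the two files by hand and point the example at local paths, the settings above map onto Hugging Face Hub download URLs roughly as sketched below. This is only an illustration: `HubFile`, `SOLAR_WEIGHTS`, and `SOLAR_TOKENIZER` are hypothetical names, and the `main` revision for the tokenizer repo is an assumption (only the model revision is given above).

```rust
/// Location of a single file on the Hugging Face Hub.
/// (Hypothetical helper type, not part of candle or hf-hub.)
struct HubFile<'a> {
    repo: &'a str,
    revision: &'a str,
    file: &'a str,
}

impl HubFile<'_> {
    /// Direct download URL, following the Hub's `resolve` URL scheme.
    fn url(&self) -> String {
        format!(
            "https://huggingface.co/{}/resolve/{}/{}",
            self.repo, self.revision, self.file
        )
    }
}

/// Quantized weights, taken from the settings listed above.
const SOLAR_WEIGHTS: HubFile<'static> = HubFile {
    repo: "TheBloke/SOLAR-10.7B-v1.0-GGUF",
    revision: "main",
    file: "solar-10.7b-v1.0.Q4_K_M.gguf",
};

/// Matching tokenizer; the `main` revision here is an assumption.
const SOLAR_TOKENIZER: HubFile<'static> = HubFile {
    repo: "upstage/SOLAR-10.7B-v1.0",
    revision: "main",
    file: "tokenizer.json",
};

fn main() {
    // Print both URLs so they can be fetched with any download tool.
    println!("{}", SOLAR_WEIGHTS.url());
    println!("{}", SOLAR_TOKENIZER.url());
}
```

The point of keeping the two entries side by side is that the weights and the tokenizer come from different repos, so it is easy to end up pairing the GGUF file with a tokenizer from another model.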

With the wrong tokenizer, you can get gibberish output. Here is the output I got when trying to use the Solar model with the Llama tokenizer:

капи кате cheap versree слоrio UnjetQL voiceseg listGE Here Jas SozialExt prod arr press віціled solemAnchor fields ár_+PLLouis searchedQu profileslickiedasterund v damalsisko timing rings authorizationтельной pochodontql tableView'];equationroom

Loading the Solar model with the matching tokenizer results in much more coherent output. I used the model through a slightly higher-level interface to candle transformers that I am working on called [Kalosm](https://github.com/floneum/floneum/tree/c0b4de1227f691f8c53c632ca3ed5f2e1dd3257a/interfaces/kalosm), but you could also use the tokenizer and model I listed above with the candle quantized example directly:

```rust
use kalosm::language::*;
use std::io::Write;

#[tokio::main]
async fn main() {
    // Build the model from Kalosm's Solar 10.7B source.
    let model = Llama::builder()
        .with_source(LlamaSource::solar_10_7b())
        .build()
        .unwrap();
    let prompt = "# About Machine Learning\n";
    // Stream the generated text, with the max length set to 1000.
    let mut result = model
        .stream_text(prompt)
        .with_max_length(1000)
        .await
        .unwrap();

    print!("{prompt}");
    while let Some(token) = result.next().await {
        print!("{token}");
        // Flush so each token is shown as soon as it arrives.
        std::io::stdout().flush().unwrap();
    }
}
```

Output:

```text
# About Machine Learning
This is a list of resources that I use to learn about machine learning. The topic covers both the theory as well as the implementation in practice, using Python and R mostly (but also some other tools such as Matlab or Julia). This page will be updated regularly with new content so check back often! If you have any suggestions for additions please let me know via email at [email protected]

# Machine Learning Books
The following books are great resources to learn about machine learning. They cover both the theory and implementation in practice, using Python or R mostly (but also some other tools such as Matlab). This page will be updated regularly with new content so check back often! If you have any suggestions for additions please let me know via email at [email protected]

# Machine Learning Papers
The following papers are great resources to learn about machine learning. They cover both the theory and implementation in practice, using Python or R mostly (but also some other tools such as Matlab). This page will be updated regularly with new content so check back often! If you have any suggestions for additions please let me know via email at [email protected]

# Machine Learning Blogs
The following blogs are great resources to learn about machine learning. They cover both the theory and implementation in practice, using Python or R mostly (but also some other tools such as Matlab). This page will be updated regularly with new content so check back often! If you have any suggestions for additions please let me know via email at [email protected]
```