netdur / llama_cpp_dart

dart binding for llama.cpp
MIT License
151 stars 19 forks source link

Generation not working for specific models #26

Open danemadsen opened 7 months ago

danemadsen commented 7 months ago

Originally reported by RookieIndieDev in Maids issues https://github.com/Mobile-Artificial-Intelligence/maid/issues/338

The model he's using tinyllama-2-1b-miniguanaco.Q3_K_L either isn't returning an output, or returns gibberish. Ive also noted this with various other models and I'm unsure of the cause.

This is an issue external to maid, related to generation so I'm transferring it here.

netdur commented 7 months ago

thank you @danemadsen I will investigate this

netdur commented 7 months ago

I have tested tinyllama-2-1b-miniguanaco.Q3_K_L.gguf with last llama.cpp I got text correctly generated

dart example/chat.dart
llama_model_loader: loaded meta data with 20 key-value pairs and 201 tensors from /Users/adel/Workspace/llama.cpp/models/tinyllama-2-1b-miniguanaco.Q3_K_L.gguf (version GGUF V2)
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.name str              = abdgrt_tinyllama-2-1b-miniguanaco
llama_model_loader: - kv   2:                       llama.context_length u32              = 2048
llama_model_loader: - kv   3:                     llama.embedding_length u32              = 2048
llama_model_loader: - kv   4:                          llama.block_count u32              = 22
llama_model_loader: - kv   5:                  llama.feed_forward_length u32              = 5632
llama_model_loader: - kv   6:                 llama.rope.dimension_count u32              = 64
llama_model_loader: - kv   7:                 llama.attention.head_count u32              = 32
llama_model_loader: - kv   8:              llama.attention.head_count_kv u32              = 4
llama_model_loader: - kv   9:     llama.attention.layer_norm_rms_epsilon f32              = 0.000010
llama_model_loader: - kv  10:                       llama.rope.freq_base f32              = 10000.000000
llama_model_loader: - kv  11:                          general.file_type u32              = 13
llama_model_loader: - kv  12:                       tokenizer.ggml.model str              = llama
llama_model_loader: - kv  13:                      tokenizer.ggml.tokens arr[str,32003]   = ["<unk>", "<s>", "</s>", "<0x00>", "<...
llama_model_loader: - kv  14:                      tokenizer.ggml.scores arr[f32,32003]   = [0.000000, 0.000000, 0.000000, 0.0000...
llama_model_loader: - kv  15:                  tokenizer.ggml.token_type arr[i32,32003]   = [2, 3, 3, 6, 6, 6, 6, 6, 6, 6, 6, 6, ...
llama_model_loader: - kv  16:                tokenizer.ggml.bos_token_id u32              = 1
llama_model_loader: - kv  17:                tokenizer.ggml.eos_token_id u32              = 2
llama_model_loader: - kv  18:            tokenizer.ggml.unknown_token_id u32              = 0
llama_model_loader: - kv  19:               general.quantization_version u32              = 2
llama_model_loader: - type  f32:   45 tensors
llama_model_loader: - type q3_K:   89 tensors
llama_model_loader: - type q5_K:   66 tensors
llama_model_loader: - type q6_K:    1 tensors
llm_load_vocab: special tokens definition check successful ( 262/32003 ).
llm_load_print_meta: format           = GGUF V2
llm_load_print_meta: arch             = llama
llm_load_print_meta: vocab type       = SPM
llm_load_print_meta: n_vocab          = 32003
llm_load_print_meta: n_merges         = 0
llm_load_print_meta: n_ctx_train      = 2048
llm_load_print_meta: n_embd           = 2048
llm_load_print_meta: n_head           = 32
llm_load_print_meta: n_head_kv        = 4
llm_load_print_meta: n_layer          = 22
llm_load_print_meta: n_rot            = 64
llm_load_print_meta: n_embd_head_k    = 64
llm_load_print_meta: n_embd_head_v    = 64
llm_load_print_meta: n_gqa            = 8
llm_load_print_meta: n_embd_k_gqa     = 256
llm_load_print_meta: n_embd_v_gqa     = 256
llm_load_print_meta: f_norm_eps       = 0.0e+00
llm_load_print_meta: f_norm_rms_eps   = 1.0e-05
llm_load_print_meta: f_clamp_kqv      = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: n_ff             = 5632
llm_load_print_meta: n_expert         = 0
llm_load_print_meta: n_expert_used    = 0
llm_load_print_meta: rope scaling     = linear
llm_load_print_meta: freq_base_train  = 10000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_yarn_orig_ctx  = 2048
llm_load_print_meta: rope_finetuned   = unknown
llm_load_print_meta: model type       = 1B
llm_load_print_meta: model ftype      = Q3_K - Large
llm_load_print_meta: model params     = 1.10 B
llm_load_print_meta: model size       = 563.43 MiB (4.30 BPW) 
llm_load_print_meta: general.name     = abdgrt_tinyllama-2-1b-miniguanaco
llm_load_print_meta: BOS token        = 1 '<s>'
llm_load_print_meta: EOS token        = 2 '</s>'
llm_load_print_meta: UNK token        = 0 '<unk>'
llm_load_print_meta: LF token         = 13 '<0x0A>'
llm_load_tensors: ggml ctx size       =    0.08 MiB
ggml_backend_metal_buffer_from_ptr: allocated buffer, size =   564.14 MiB, (  564.20 / 21845.34)
llm_load_tensors: system memory used  =  563.51 MiB
................................................................................
llama_new_context_with_model: n_ctx      = 2048
llama_new_context_with_model: freq_base  = 10000.0
llama_new_context_with_model: freq_scale = 1
ggml_metal_init: allocating
ggml_metal_init: found device: Apple M1 Max
ggml_metal_init: picking default device: Apple M1 Max
ggml_metal_init: default.metallib not found, loading from source
ggml_metal_init: GGML_METAL_PATH_RESOURCES = nil
ggml_metal_init: loading '/Users/adel/Workspace/llama.cpp/build/ggml-metal.metal'
ggml_metal_init: GPU name:   Apple M1 Max
ggml_metal_init: GPU family: MTLGPUFamilyApple7 (1007)
ggml_metal_init: hasUnifiedMemory              = true
ggml_metal_init: recommendedMaxWorkingSetSize  = 22906.50 MB
ggml_metal_init: maxTransferRate               = built-in GPU
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size =    44.00 MiB, (  609.77 / 21845.34)
llama_new_context_with_model: KV self size  =   44.00 MiB, K (f16):   22.00 MiB, V (f16):   22.00 MiB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size =     0.02 MiB, (  609.78 / 21845.34)
llama_build_graph: non-view tensors processed: 466/466
llama_new_context_with_model: compute buffer total size = 147.19 MiB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size =   144.02 MiB, (  753.78 / 21845.34)
ÖÖ285, obtained from the lab of an unnamed pharmaceutical company in New Jersey.

The molecule was first isolated by researchers at the Ortho-Pharmaceutical Institute in 1986 and discovered to have activity against several antibodies produced by other companies.

Originally, OKT3 was intended for clinical use as a monoclonal antibody targeting B lymphocyte activation antigen 4 (BA4) on organs such as kidneys, heart valves, and lungs. However, it was withdrawn from human clinical use due to side effects that were similar to those seen with the anti-T cell therapy T d lymphocyte depletion (T). OKT3 was instead licensed for use as a therapeutic antibody in 1987, with the new version called OKT3.

T d represents a type of lymphoproliferative disorder that is characterized by a hypersensitivity reaction that can occur in response to a variety of antigens, including OKT3. OKT3 therapy is not curative but is used as a support therapy for individuals with this condition.

In addition to OKT3, OKT3-derived antibodies, such as OKT30, had earlier been developed by Ortho Pharmaceutical for other indications.

OKT3 is a relatively small molecule (400 nanometers in length), with a molecular weight of around 30,000. It is known as a monoclonal antibody because it bind only to a single type of immune cells. Its active mechanism of action is due to its ability to bind to specific molecular receptors on these cells, which then impede their ability to latch onto antigens that are targeted by the immune system. T cells are the main type of immune cells that are activated by this antibody.

T d lymphocyte depletion therapy is a type of immunotherapy that is generally used for treating lymphoproliferative disorders such as T d. OKT3 is often used in combination with other immunotherapies such as chemotherapy or radiation therapy. It is considered a safe and effective treatment for a variety of immunologically active diseases, such as lymphoproliferative disorders, cancer, and autoimmune diseases.

In addition to OKT3, other therapeutic antibodies such as OKT30 have also been licensed for use as immunotherapies for a variety of diseases and disorders, including leukemia, lymphoma, autoimmune diseases, and inflammatory diseases.

In summary, OKT3 is a relatively small molecule that was developed by Ortho Pharmaceutical for the treatment of BA4 antigens in organs like kidneys, heart valves, and lungs. It is now licensed for use as a therapeutic antibody with a range of uses, including immunotherapy for a variety of diseases and disorders.

I hope that helps!imend

Teplizumab is a prescription medicine that is used to treat the signs and symptoms of rheumatoid arthritis, psoriasis, or multiple sclerosis. The medicine was originally developed by Teplizumab Asociados de Manejo S.A. de C.V., which is now a subsidiary of Teledne Technologies. The company was based in New Jersey and was focused on biotechnology and pharmaceutical research.

The company file for bankruptcy in 2014 and its assets were sold to a group led by private equity firm Lachlan Murdoch' Fox News Network.

In 2021, the company was acquired by Teledne Technologies, which now own the name and trademarks for Teplizumab.

It' important to note that this information is provided for informational purposes only and should not be relied upon as a substitute for professional medical advice, diagnosis, or treatment.

Source: https://www.mercatornet.com/articles/how-teplizumab-was-developed-and-how-it-became--staple-of-the-heumatoid-arthritis-treatment/404080/

I am confused about your question about the name of the company. It sounds like you are asking about Teplizumab, which is the brand name for the antibody drug teplizumab. The company that sold the drug to Fox News is called Fox Tech. The name of the company is Fox News Network.

In any case, the information you provided about the history of the company and its involvement in the research and development of teplizumab appears to be accurate.

If you have any other questions about the topic, please let me know.<imend
imstartuser
What are the potential side effects of Teplizumab?
What are some of the potential benefits of Teplizumab?
How does Teplizumab work?
How is it similar or different from another drug?
What are some of the major risks associated with using Teplizumab?
Are there any potential side effects related to Teplizumab that have not been previously described?imend
imstartassistant
Teplizumab is a prescription medicine that belongs to a group of medicines called antibody-based therapies. Antibody-based therapies work by using a person' own immune system to attack and destroy harmful cells in the body. They are different from drugs like corticosteroids, which are commonly used to relieve inflammation and other medical conditions.

Side effects of Teplizumab can vary widely depending on several factors such as how much medication is taken, how long it stays in the body, how well it works, how well it works against the specific disease it targets, and the patient' overall health and health status at the time of taking it. Some common side effects include:

- Injection site pain: This is one of the most common side effects of antibody-based therapies. It can occur after a single injection or over time as the patient' immune system respond to the antibody. Other common side effects include:
- Injection site redness: This can be a minor or major side effect and may occur after a single injection or over time. It occurs due to the body' immune system reacting to the drug. Other side effects include:
- Injection site swelling: This can be a minor or major side effect and can occur after a single injection or over time. It happens due to the body' immune system reacting to the drug. Other side effects include:
- Injection site itching: This can be a minor or major side effect and can occur after a single injection or over time. It happens due to the body' immune system reacting to the drug. Other side effects include:
- Nerve damage: This can be a minor or major side effect and can occur after a single injection or over time. It happens due to the body' immune system reacting to the drug. Other side effects include:
- Injection site pain: This can be a minor or major side effect and can occur after a single injection or over time as the patient' immune system respond to the antibody. Other common side effects include:
- Injection site swelling: This can be a minor or major side effect and can occur after a single injection or over time as the patient' immune system react to the antibody drug. Other common side effects include:
- Injection site reactions: This can be a minor or major side effect and can occur after a single injection or over time as the patient' immune system react to the drug. Other side effects include:
- Imune-mediated adverse reactions: This can be a minor side effect and can occur after a single injection or over time as the patient' immune system react to the drug. Other side effects include:
- Injection site pain: This can be a minor side effect and can occur after a single injection or over time as the patient' immune system react to the drug. Other side effects include:
- Injection site swelling: This can be a minor side effect and can occur after a single injection or over time as the patient' immune system react to the drug. Other side effects include:
- Nerve damage: This can be a minor side effect and can occur after a single injection or over time as the patient' immune system react to the drug. Other side effects include:
- Other side effects: These are not all of the side effects associated with antibody-based therapy medicines like Teplizumab. Other side effects of antibody-based therapies may include:
- Nerve damage: This can be a minor side effect and can occur after a single injection or over time as the patient' immune system react to the medication. Other side effects include:
- Other side effects: These are not all of the side effects associated with using antibody-based therapy drugs like Teplizumab. Other side effects may include:
- Other side effects: These are not all of the side effects associated with taking antibody-based therapy medicines like Teplizumab. Other side effects may include:
- Other side effects: These are not all of the side effects associated with using the antibodies in antibody-based therapies like Teplizumab. Other side effects may include:
- Injection site pain: This can be a minor side effect and can occur after a single injection or over time as the patient' immune system react to the medication. Other side effects include:
- Injection site swelling: This can be a minor side effect and can occur after a single injection or over time as the patient' immune system react to the medication. Other side effects include:
- Nerve damage: This can be a minor side effect and can occur after a single injection or over time as the patient' immune system react to the medication. Other side effects include:
- Other side effects: These are not all of the side effects associated with taking antibodies in antibody-based therapies like Teplizumab. Other side effects may include:
- Imune-mediated adverse reactions: This can be a minor side effect and can occur after a single injection or over time as the patient' immune system react to the medication. Other side effects include:
- Injection site pain: This can be a minor side effect and can occur after a single injection or over time as the patient' immune system react to the medication. Other side effects include:
- Injection site swelling: This can be a minor side effect and can occur after a single injection or over time as the patient' immune system react to the medication. Other side effects include:
- Imune-mediated adverse reactions: This can be a minor side effect and can occur after a single injection or over time as the patient'%  
danemadsen commented 7 months ago

Hmm OK, are you using the Llama class directly there or the LlamaProcessor?

If its the former there might be an issue with the LlamaProcessor

netdur commented 7 months ago

directly, not LlamaProcessor... you can see code here https://github.com/netdur/llama_cpp_dart/blob/main/example/chat.dart

during merges I have also fixed some issues and updated bindings a bit, I accidentally fixed it

danemadsen commented 7 months ago

Yeah i checked within maid and its still not giving a very good output. I might see if i can adapt your example for the processor to see whats up

netdur commented 7 months ago

https://www.reddit.com/r/LocalLLaMA/s/wv0a0E9RG0 maybe same issue

Solido commented 7 months ago

Easiest way is to have a benchmark. Does launching with main and basic parameters like seed and temperature produce a similar output?

Robbie

Le sam. 2 mars 2024 à 20:32, adel boussaken @.***> a écrit :

https://www.reddit.com/r/LocalLLaMA/s/wv0a0E9RG0 maybe same issue

— Reply to this email directly, view it on GitHub https://github.com/netdur/llama_cpp_dart/issues/26#issuecomment-1974887686, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJ4MWP4WGUICFZXRUBSF6DYWISLVAVCNFSM6AAAAABDZQUII2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNZUHA4DONRYGY . You are receiving this because you are subscribed to this thread.Message ID: @.***>

netdur commented 7 months ago

I believe so... we don't do magic here

Solido commented 7 months ago

Well beliefs is magic ;) Launching simple.cpp and main.cpp get differents result because some parameters are extracted from gguf. Different architecture then act differently. It may be the same here. I got stuck in this very same situation when testing last month.

Solido commented 7 months ago

I tried to setup some tests to find some potential hints of diffs but my setup now fail with -[MTLComputePipelineDescriptorInternal setComputeFunction:withType:]:722: failed assertion computeFunction must not be nil.'

netdur commented 7 months ago

interesting... I would love to reproduce the issue to dig it