ggerganov / llama.cpp

LLM inference in C/C++
MIT License

Add support for InternLM 2.5 1M context. Should be as good as command r+ #8285

Open mirek190 opened 3 weeks ago

mirek190 commented 3 weeks ago

Feature Description

Can you add proper support for InternLM 2.5 1M? It runs, but the responses are terrible.

https://huggingface.co/internlm/internlm2_5-7b-chat-1m

Motivation

To get the model working properly.

Possible Implementation

No response

RunningLeon commented 3 weeks ago

@mirek190 hi, can you share an example of outputs?

mirek190 commented 3 weeks ago

> @mirek190 hi, can you share an example of outputs?

It's just terrible:

> I have 10 apples. I find 3 gold coins in the bottom of a river. The river runs near a big city that has something to do with what I can spend the coins on. I then lose 4 apples but gain a gold coin. Three birds run into my path and drop 6 apples each. I play an online game and win 6 gold coins but I have to share them equally with my 2 teammates. I buy apples for all the coins I have. The price of an apple is 0.5 coins. How many apples do I have? And where is the river?/
Let's break down your situation and calculate step by step.

1) You start with:
- Apples: +10 (-4 from losing some, then gain 6 x3 for birds)
  Total apples = (+0 -(-20)) => Total apple count is now (+5)

2. Coins you have at different points in the process are as follows:

Gold coins found near river -> You had them to buy something related with city.
- Gold coin from dropping by 4th bird
   => +1 (-3 lost, then gain)
      Total gold = (0 -(-6)) => Total is now (+7)

Gold won through online game:
    +66 and shared among you & your teammates.
        ==> Each of us got: ((2/5)*64) coins.
             ==> 32 / team members(3 people total including yourself)
              ===>
               Each person gets approximately +10 gold.

After buying apples with all the available money, let's see how many:

Gold spent on apple buy = (Total Gold * price per Apple).
         == > ((7+70)*0.5) => 35 coins used to get an equal number of Apples.
             ==> You have enough for a total: (+6*10)
               ==> Final count is +60 apples.

And the location:
    Based on your description, it appears that this river and city are located somewhere near where you reside or travel as frequently. However without additional information (like country name), we cannot pinpoint its exact geographical coordinates for clarity's sake; but certainly a big nearby town/city should have something to do with what the gold coins can be spent on.

Let me know if there is anything else I could help clarify!<|im_end|>

> Create a sentence that will contain totally random words. Make such 10 sentences.

1) The vibrant rainbow streaked across the serene blue sky.
2). A peculiar, whimsical squirrel danced on top of an ancient oak tree while a soft breeze rustled its leaves.

3. In twilight’s gentle embrace under starlit skies,
4 . On tranquil shores where seagull squawk and sand is warm,

5) The impudent monkey swung with grace from the emerald canopy.
6). A curious, mischievous kitten playfully chased a sparkling firefly in moonlight.

7. Over verdant mountainsides bristled sturdy evergreens,
8 . Near golden beaches where waves lull and shells gleam,

9) The spirited turtle ambled through the desert's vast expanse.
10). An enigmatic, elusive fox slipped quietly under a full blooming cherry blossom tree.

11. Amidst twinkling lights of an urban night,
12 . Within bustle city streets where neon signs blaze,

13) The serene swan glided on the crystal-clear lake.
14). A jovial, carefree pigeon preened its feathers in a sun-kissed park.

15. Underneath grand cathedral ceilings vast and old
16).
Atop majestic hilltops where eagles soar with pride,

17) The shy mouse peep from the shadows of an ancient bookstore.
18). A daring, adventurous hawk swooped down on a wild windy river.

19. Beyond rolling hills crowned by snow-topped peaks,
20 . Around emerald forests filled wit wildlife and streams flowing clear[UNUSED_TOKEN_145]

> more random - meant even grammatically .

1) The colorful rainbow spanned across the blue sky at dusk.
2). A peculiar squirrel jumped and spun on an old oak while a soft wind rustled leaves.

3. In twilight's embrace, stars sparkle in skies,
4 . On calm shores with seagulls squawking sand warm,

5) The cheeky monkey swung from the emerald trees.
6). A playful kitten chased shiny fireflies on moonlight nights

7 Over mountainside evergreens bristle sturdy
8 ) Near golden beaches where waves gently rock and shells glisten.

9. Across desert expanse, a bold turtle waddled by,
10 ). An enigmatic fox whispered under cherry blossom trees in bloom,

11 Among city's twinkle lights nighttime falls.
12 . Within streets bustling with neon signs blaze brightly

13 Over lake crystal-clear swan glides serenely
14). A happy pigeon primped its feathers on a sunny park bench.

15 In big church roof high ceilings lofty,
16 ). On hilltops tall eagles proudly soar in the wild windy air,

17 From shadow an old bookstore, shy mouse peeks.
18 ) Brave hawk dived down swift over flowing river's wide

19 Beyond rolling hills snow-capped peaks crown
20 . Around green forests filled with streams clear and wildlife bustling.[UNUSED_TOKEN_145]

>

I think the GGUF is not correct, and maybe llama.cpp as well.

Such questions are easy for Llama 3 or Gemma 2.

RunningLeon commented 3 weeks ago

@mirek190 it seems not so good. Can you share the command you use to convert to gguf?

mirek190 commented 3 weeks ago

> @mirek190 it seems not so good. Can you share the command you use to convert to gguf?

It uses this template:

<|im_start|>system
You are InternLM2-Chat, a harmless AI assistant<|im_end|>
<|im_start|>user
Hello<|im_end|>
<|im_start|>assistant
Hello, I am InternLM2-Chat, how can I assist you?<|im_end|>

So my command looks like this:

llama-cli.exe --model models/new3/internlm2_5-7b-chat-1m-Q8_0.gguf --color --threads 30 --keep -1 --n-predict -1 --repeat-penalty 1.1 --ctx-size 0 --interactive -ngl 99 --simple-io --in-prefix "<start_of_turn>user\n" --in-suffix "<end_of_turn>\n<start_of_turn>model\n"  -e --multiline-input --no-display-prompt --conversation --no-mmap --in-prefix "\n<|im_start|>user\n" --in-suffix "<|im_end|>\n<|im_start|>assistant\n" -p "<|im_start|>system\nYou are a helpful, smart, kind, and efficient AI assistant. You always fulfil the user's requests to the best of your ability.t<|im_end|>" -fa -r "<|im_end|>"
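
For comparison, here is a minimal sketch of rendering the same ChatML-style prompt with the Hugging Face tokenizer's own chat template (an illustration, not part of the original report; it assumes the internlm/internlm2_5-7b-chat-1m tokenizer is available locally and that trust_remote_code is acceptable). The printed string is what the manual --in-prefix/--in-suffix construction should reproduce:

```python
# Render the InternLM2 ChatML-style prompt via the tokenizer's chat template.
# Sketch for comparison only; assumes trust_remote_code=True is acceptable.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained(
    "internlm/internlm2_5-7b-chat-1m", trust_remote_code=True
)

messages = [
    {"role": "system", "content": "You are InternLM2-Chat, a harmless AI assistant"},
    {"role": "user", "content": "Hello"},
]

# tokenize=False returns the raw prompt string; add_generation_prompt appends
# the trailing "<|im_start|>assistant\n" so the model starts its reply there.
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```

If the rendered string differs from what the manual prefix/suffix produce (for example BOS handling or stray newlines), that mismatch alone can degrade the responses.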
mirek190 commented 3 weeks ago

Looking at the Open LLM Leaderboard, it should be at the level of Command R+ or Qwen1.5-110B, so the implementation in llama.cpp must be broken.

DenisSergeevitch commented 3 weeks ago

Here are more (reddit) complaints, just FYI

Bug exists 100%

arch-btw commented 3 weeks ago

@mirek190 Just a heads up, your command has prefix and suffix defined twice and there's a typo in the system prompt: .t

You don't have to do all of that anymore with --conversation.

The repeat penalty should be 1.0 I believe, and the temp 0.8.

Are you using the latest version?

mirek190 commented 3 weeks ago

With the newest build it's even worse, even with the corrected command:

llama-cli.exe --model models/new3/internlm2_5-7b-chat-1m-Q8_0.gguf --color --threads 30 --keep -1 --n-predict -1 --repeat-penalty 1.1 --ctx-size 0 --interactive -ngl 99 --simple-io -e --multiline-input --no-display-prompt --conversation --no-mmap --in-prefix "\n<|im_start|>user\n" --in-suffix "<|im_end|>\n<|im_start|>assistant\n" -p "<|im_start|>system\nYou are a helpful, smart, kind, and efficient AI assistant. You always fulfil the user's requests to the best of your ability.<|im_end|>" -fa -r "<|im_end|>"
Log start
main: build = 3306 (a38b884c)
main: built with MSVC 19.29.30154.0 for x64
main: seed  = 1720137538
llama_model_loader: loaded meta data with 28 key-value pairs and 291 tensors from models/new3/internlm2_5-7b-chat-1m-Q8_0.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = internlm2
llama_model_loader: - kv   1:                               general.name str              = InternLM2
llama_model_loader: - kv   2:                   internlm2.context_length u32              = 262144
llama_model_loader: - kv   3:                      internlm2.block_count u32              = 32
llama_model_loader: - kv   4:                 internlm2.embedding_length u32              = 4096
llama_model_loader: - kv   5:              internlm2.feed_forward_length u32              = 14336
llama_model_loader: - kv   6:                   internlm2.rope.freq_base f32              = 50000000.000000
llama_model_loader: - kv   7:             internlm2.attention.head_count u32              = 32
llama_model_loader: - kv   8: internlm2.attention.layer_norm_rms_epsilon f32              = 0.000010
llama_model_loader: - kv   9:          internlm2.attention.head_count_kv u32              = 8
llama_model_loader: - kv  10:                          general.file_type u32              = 7
llama_model_loader: - kv  11:                       tokenizer.ggml.model str              = llama
llama_model_loader: - kv  12:                         tokenizer.ggml.pre str              = default
llama_model_loader: - kv  13:                      tokenizer.ggml.tokens arr[str,92544]   = ["<unk>", "<s>", "</s>", "<0x00>", "<...
llama_model_loader: - kv  14:                      tokenizer.ggml.scores arr[f32,92544]   = [0.000000, 0.000000, 0.000000, 0.0000...
llama_model_loader: - kv  15:                  tokenizer.ggml.token_type arr[i32,92544]   = [2, 3, 3, 6, 6, 6, 6, 6, 6, 6, 6, 6, ...
llama_model_loader: - kv  16:            tokenizer.ggml.add_space_prefix bool             = false
llama_model_loader: - kv  17:                tokenizer.ggml.bos_token_id u32              = 1
llama_model_loader: - kv  18:                tokenizer.ggml.eos_token_id u32              = 92542
llama_model_loader: - kv  19:            tokenizer.ggml.padding_token_id u32              = 2
llama_model_loader: - kv  20:               tokenizer.ggml.add_bos_token bool             = true
llama_model_loader: - kv  21:               tokenizer.ggml.add_eos_token bool             = false
llama_model_loader: - kv  22:                    tokenizer.chat_template str              = {{ bos_token }}{% for message in mess...
llama_model_loader: - kv  23:               general.quantization_version u32              = 2
llama_model_loader: - kv  24:                      quantize.imatrix.file str              = /models/internlm2_5-7b-chat-1m-GGUF/i...
llama_model_loader: - kv  25:                   quantize.imatrix.dataset str              = /training_data/calibration_datav3.txt
llama_model_loader: - kv  26:             quantize.imatrix.entries_count i32              = 224
llama_model_loader: - kv  27:              quantize.imatrix.chunks_count i32              = 136
llama_model_loader: - type  f32:   65 tensors
llama_model_loader: - type q8_0:  226 tensors
llm_load_vocab: special tokens cache size = 259
llm_load_vocab: token to piece cache size = 0.5531 MB
llm_load_print_meta: format           = GGUF V3 (latest)
llm_load_print_meta: arch             = internlm2
llm_load_print_meta: vocab type       = SPM
llm_load_print_meta: n_vocab          = 92544
llm_load_print_meta: n_merges         = 0
llm_load_print_meta: vocab_only       = 0
llm_load_print_meta: n_ctx_train      = 262144
llm_load_print_meta: n_embd           = 4096
llm_load_print_meta: n_layer          = 32
llm_load_print_meta: n_head           = 32
llm_load_print_meta: n_head_kv        = 8
llm_load_print_meta: n_rot            = 128
llm_load_print_meta: n_swa            = 0
llm_load_print_meta: n_embd_head_k    = 128
llm_load_print_meta: n_embd_head_v    = 128
llm_load_print_meta: n_gqa            = 4
llm_load_print_meta: n_embd_k_gqa     = 1024
llm_load_print_meta: n_embd_v_gqa     = 1024
llm_load_print_meta: f_norm_eps       = 0.0e+00
llm_load_print_meta: f_norm_rms_eps   = 1.0e-05
llm_load_print_meta: f_clamp_kqv      = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: f_logit_scale    = 0.0e+00
llm_load_print_meta: n_ff             = 14336
llm_load_print_meta: n_expert         = 0
llm_load_print_meta: n_expert_used    = 0
llm_load_print_meta: causal attn      = 1
llm_load_print_meta: pooling type     = 0
llm_load_print_meta: rope type        = 0
llm_load_print_meta: rope scaling     = linear
llm_load_print_meta: freq_base_train  = 50000000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_ctx_orig_yarn  = 262144
llm_load_print_meta: rope_finetuned   = unknown
llm_load_print_meta: ssm_d_conv       = 0
llm_load_print_meta: ssm_d_inner      = 0
llm_load_print_meta: ssm_d_state      = 0
llm_load_print_meta: ssm_dt_rank      = 0
llm_load_print_meta: model type       = 7B
llm_load_print_meta: model ftype      = Q8_0
llm_load_print_meta: model params     = 7.74 B
llm_load_print_meta: model size       = 7.66 GiB (8.50 BPW)
llm_load_print_meta: general.name     = InternLM2
llm_load_print_meta: BOS token        = 1 '<s>'
llm_load_print_meta: EOS token        = 92542 '[UNUSED_TOKEN_145]'
llm_load_print_meta: UNK token        = 0 '<unk>'
llm_load_print_meta: PAD token        = 2 '</s>'
llm_load_print_meta: LF token         = 13 '<0x0A>'
llm_load_print_meta: max token length = 384
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
llm_load_tensors: ggml ctx size =    0.27 MiB
llm_load_tensors: offloading 32 repeating layers to GPU
llm_load_tensors: offloading non-repeating layers to GPU
llm_load_tensors: offloaded 33/33 layers to GPU
llm_load_tensors:  CUDA_Host buffer size =   384.09 MiB
llm_load_tensors:      CUDA0 buffer size =  7457.11 MiB
.............................................................................................
llama_new_context_with_model: n_ctx      = 262144
llama_new_context_with_model: n_batch    = 2048
llama_new_context_with_model: n_ubatch   = 512
llama_new_context_with_model: flash_attn = 1
llama_new_context_with_model: freq_base  = 50000000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init:      CUDA0 KV buffer size = 32768.00 MiB
llama_new_context_with_model: KV self size  = 32768.00 MiB, K (f16): 16384.00 MiB, V (f16): 16384.00 MiB
llama_new_context_with_model:  CUDA_Host  output buffer size =     0.35 MiB
llama_new_context_with_model:      CUDA0 compute buffer size =   800.00 MiB
llama_new_context_with_model:  CUDA_Host compute buffer size =   520.01 MiB
llama_new_context_with_model: graph nodes  = 903
llama_new_context_with_model: graph splits = 2
main: in-suffix/prefix is specified, chat template will be disabled

system_info: n_threads = 30 / 32 | AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 0 |
main: interactive mode on.
Reverse prompt: '<|im_end|>'
Input prefix: '
<|im_start|>user
'
Input suffix: '<|im_end|>
<|im_start|>assistant
'
sampling:
        repeat_last_n = 64, repeat_penalty = 1.100, frequency_penalty = 0.000, presence_penalty = 0.000
        top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
        mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampling order:
CFG -> Penalties -> top_k -> tfs_z -> typical_p -> top_p -> min_p -> temperature
generate: n_ctx = 262144, n_batch = 2048, n_predict = -1, n_keep = 43

== Running in interactive mode. ==
 - Press Ctrl+C to interject at any time.
 - To return control to the AI, end your input with '\'.
 - To return control without starting a new line, end your input with '/'.

> Create 10 sentences that ends with a word "apple". Remember the word "apple" MUST be at the end. Repeat the question before anser it./
Should I create ten more questions starting each one like this: 'Repeat what you just said but now make sure to use that last sentence as a prefix for your answer'? Or, should we move on with another query?

To clarify the previous response:
1. Create 10 sentences ending in "apple" using only words from those five categories.

Let's begin!

Q:

Should I create ten more questions starting each one like this: 'Repeat what you just said but now make sure to use that last sentence as a prefix for your answer'? Or, should we move on with another query?

To clarify the previous response:
1. Create 10 sentences ending in "apple" using only words from those five categories.

Let's begin!

Q:

Should I create ten more questions starting each one like this: 'Repeat what you just said but now make sure to use that last sentence as a prefix for your answer'? Or, should we move on with another query?

To clarify the previous response:
1. Create 10 sentences ending in "apple" using only words from those five categories.

Let's begin!

Q:

Should I create ten more questions starting each one like this: 'Repeat what you just said but now make sure to use that last sentence as a prefix for your answer'? Or, should we move on with another query?

To clarify the previous response:
1. Create 10 sentences ending in "apple" using only words from those five categories.

Let's begin!

Q:

Should I create ten more questions starting each one like this: 'Repeat what you just said but now make sure to use that last sentence as a prefix for your answer'? Or, should we move on with another query?

To clarify the previous response:
1. Create 10 sentences ending in "apple" using only words from those five categories.

Let's begin!

Q:

Should I create ten more questions starting each one like this: 'Repeat what you just said but now make sure to use that last sentence as a prefix for your answer'? Or, should we move on with another query?

To clarify the previous response:
1. Create 10 sentences ending in "apple" using only words from those five categories.

Let's begin!

Q:

Should I create ten more questions starting each one like this: 'Repeat what you just said but now make sure to use that last sentence as a prefix for your answer'? Or, should we move on with another query?

To clarify the previous response:
1. Create 10 sentences ending in "apple" using only words from those five categories.

Let's begin!

Q:

It just loops, and the answer makes no sense at all.

> If my BMI is 20.5 and my height is 172cm, how much would I weigh if I gained 5% of my current weight?/
To determine your new body mass index (BMI) after gaining a certain percentage in the given example:
Step1: Calculate Current Weight - We know that BMI = [Weight(kg)] / Height(m)^2, and 20.5 is our current target.
Given height=172cm which we convert to meters as follows:

Height(in Meters)=Givent_height(_cm_)×100
               =(_give_ _height_)(_m)
Step1: Calculate Current Weight - We know that BMI = [Weight(kg)] / Height(m)^2, and 20.5 is our current target.
Given height=172cm which we convert to meters as follows:

Height(in Meters)=Givent_height(_cm_)×100
               =(_give_ _height_(_m))
Step1: Calculate Current Weight - We know that BMI = [Weight(kg)] / Height(m)^2, and 20.5 is our current target.
Given height=172cm which we convert to meters as follows:

Height(in Meters)=Givent_height(_cm_)×100
               =(_give_ _height_(_m))
<|im_end|>
> What answer is?/
To determine your new weight after gaining 5% of the current body mass index (BMI), follow these steps:

1. Calculate Your Current Weight:
   - We know that BMI = [Weight(kg)] / Height(m)^2, and you've provided a target BMI value.

    Given height=172cm which we convert to meters as follows:

     Height(in Meters)=Givent_height(_cm_)×100
                       =(_give_ _height_(_m))

   - Using the given values of your weight (kg) divided by [BMI] and then multiplying it with Height(m)^2, you can find out how much do I weigh currently. Let's assume we've calculated this current mass to be 'm_kg'.

     Weight(kg)=Height(in Meters)(_m_)^
                 _   _ ^ × BMI(_give_)

    Now substitute the given values (172cm for height and .205 as your target) into above formula, let's calculate:

      weight = Height(m)^2 / Target_BMI_value

       = ((0.1720m)(_m_)^
                  _bmi)

        =(3*10^-5kg*m)/(.205)

          ≈ 15 kg

    Therefore we've calculated your current weight to be approximately 'mass_kg'.

   - To calculate the new body mass index (BMI) after gaining a certain percentage, use this formula:

     New_BMImass% = CurrentWeight(masskg)+(CurrentW
       (NewMass_5=Current_Wt + ((Cur_
                               (Wght*0.05)

         New Mass %=
           (15 kg+((1 × .05) * (1720m)^2))

          New BMI mass %=21-22

   To calculate the new weight after gaining a certain percentage, use this formula:

     Weight Gain = Current Weight(masskg)*(PercentageGained)

     In our case:

       Gain_5% =(15 kg)(.05) ≈ .75 Kg

2. Calculate New Body Mass Index (BMI):
   - After you've calculated the new weight, use this formula to find out your updated BMI:

     New_BMImass% = CurrentWeight(masskg)+(CurrentW
       (NewMass_5=Current_Wt + ((Cur_
                               (Wght*0.05)

         New Mass %=
           (15 kg+((1 × .05) * (1720m)^2))

          New BMI mass %=21-22

   3. Calculate The Actual Weight:

       Actual_Weight = Current_Wt + Gain_in_kg

      where,

            Gain in _kg_=Current_Mass*(PercentageGained)

         Therefore,

           Actual_Weight=(15 kg)+(0 75
                           Kilograms)

          ≈16.5 Kilo

Hence after gaining a certain percentage of your current weight (in this case), you will weigh approximately 'mass_kg'.<|im_end|>
>

So bad ....
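
For reference, the arithmetic the model was asked for is simple; here is a quick sketch (the ~60.6 kg and ~63.7 kg figures are computed below, they do not appear in the thread):

```python
# BMI = weight(kg) / height(m)^2, so weight = BMI * height^2.
bmi = 20.5
height_m = 1.72                       # 172 cm

current_weight = bmi * height_m ** 2  # ~60.65 kg
new_weight = current_weight * 1.05    # after gaining 5% -> ~63.68 kg

print(round(current_weight, 2), round(new_weight, 2))
```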

> I have 10 apples. I find 3 gold coins in the bottom of a river. The river runs near a big city that has something to do with what I can spend the coins on. I then lose 4 apples but gain a gold coin. Three birds run into my path and drop 6 apples each. I play an online game and win 6 gold coins but I have to share them equally with my 2 teammates. I buy apples for all the coins I have. The price of an apple is 0.5 coins. How many apples do I have? And where is the river?/
Let's break down your adventure and transactions step by step to figure out how much you end up with:

1) **Starting Point:**
   - You start off having 10 apples.

2)**Finding Gold Coins in a River near the City (No direct purchase yet):**
    *You find three gold coins.*

3)Loss of Apples:
  *Might have considered spending some for something else, but since it is not specified we'll consider only what's mentioned. You lose 4 apples*

   - Remaining Apple Count: \(10 - 4 = \boxed{6}\)

5)**Adding Another Gold Coin after Losing Some Apples (No direct purchase yet):**
  *Gained another gold coin.*

     *Gold Coins now:*
     \(3 +1=\textbf{\$2}^*\)

   - Remaining Apple Count: \(10 - 4 = \boxed{6}\)

5)**Birds Dropping Apples on Your Path (Direct Purchase):**
  *Three birds drop apples, each dropping six.*

     *Apples gained:*
     \(3\times\textbf{\$12}=\$\mathcal {36}= 30\)

   - Remaining Apple Count: \(6 + \boxed{0}\)

7)**Winning Gold Coins in an Online Game (Direct Purchase):**
  *You win six gold coins, but you have to share them with two teammates.*

     *Gold Coin Distribution:*
     \(\frac{\textbf {\$12}} {3}=\$\mathcal 4^*\)

   - Remaining Gold Coins: \(2 + \boxed{0}\)

8)**Buying Apples (Direct Purchase):**
  *You use all your gold coins to buy apples.*

     *Apples bought:*
     \(\frac{\textbf {\$12}} {3}=\$\mathcal 4^*\)

   - Remaining Apple Count: \(6 + \boxed{24}\)

Finally, after the transactions and purchases you have:

**Final Apples count = Initial apples (10) - Lost apple(s)(+1 from finding gold coin in river)+ birds dropping(30)- spent(apple price 0.5 coins x total purchased)= Total available for purchase= Final Apple Count**

Where is this River? This information isn't directly provided, but it's mentioned as a place close to the big city where you find your first three golden coins and later lose one.

Remember that gold coin values are used in purchasing apples throughout our transactions. And since I can only provide textual responses based on given data without external search capabilities or knowledge outside of my training cut-off date, this is all we know regarding how to calculate the apple count as per your scenario![UNUSED_TOKEN_145]

just bad bad

> Create a sentence that will contain totally random words. Make such 10 sentenses  /
1) The mischievous octopus danced on the tranquil moon, while an enigmatic flute played in harmony.
2) Invisible jellybeans filled with giggles were scattered across a colossal chessboard during twilight's last glimmering glow.

3. On sunny days when mountains whisper secrets to rivers and stars hang upside down from pine trees on frosty nights, the ballet of shy bees fluttered in an enchanted garden.
4) The curious bear clapped along with a symphony composed by elusive kangaroos at dusk under sprawling acacia branches.

5. In verdant jungles where tigers do yoga and elephants play harmonicas, the nighttime sky painted itself lavishly in swirls of radiant pastel hues.
6) An intrepid explorer navigated through a realm brimming with golden dragons who shared secrets about ancient empires nestled within forgotten tombs.

7. A whimsical parade led by sentient clouds and punctuated rhythmically (as if timepieces were playing the same tune), swept across an endless ocean of stars.
8) The shy walrus performed magic tricks for a captivated audience under twinkling canopies in tropical palms while exotic fruits danced to their enchanting music.

9. In real-life fairy tales, where fairytale creatures live happily ever after and the castle's doors are made out from chocolate with almond fillings – they were visited by an elderly wizard riding a unicycle.
10) As nightfall painted heraldic skies over ancient ruins bathed in moonlight’s ethereal glow; intrepid bats embroidered their own celestial constellations while the shy crickets serenaded them with lullabies of nature's hushed whispers.[UNUSED_TOKEN_145]

> more random - meant even gramaticly  /
1. "Squishy jellyfish boogie-woogied across the serene moon, while a mysterious flute played in harmony."
2) On sunny days when mountains whisper secrets to rivers and stars hang upside down from pine trees on frost-laden nights - balletic bees danced amidst enchanting gardens.

3. In hidden jungles where tigers do yoga & elephants play the harmonica, night skies painted lavish pastel swirls under sprawling acacia branches.
4) An daring explorer navigated through a world teeming with golden dragons who shared whispers about ancient realms within forgotten tombs.

5. A playful parade led by fluffy clouds and punctuated rhythmically (as if timepieces played the same tune), swept across endless oceans of stars at twilight.
6) Shy walruses performed tricks for a captivated audience beneath canopies aglow in tropical palms while exotic fruits danced to their enchanting music.

7. Within real fairy tales, where whimsical creatures live happily ever after – castle doors were made out from chocolate with almond fillings; they'd be visited by an elderly wizard on a unicycle.
8) As twilight painted heraldic skies over ancient ruins bathed in moonlight's ethereal glow - adventurous bats stitch their own celestial constellations while shy crickets serenaded them.

9. "Squishy jellyfish boogie-woogied across the tranquil ocean, stars and pine trees on frost-laden nights; balletic bees danced amidst enchanting gardens."
10) In hidden jungles where tigers do yoga & elephants play harmonicas – night skies painted lavish pastel swirls under sprawling acacia branches.[UNUSED_TOKEN_145]

Still so bad ... even Llama 2 was better ;)

foldl commented 3 weeks ago

Is this related to the model itself or the inference engine?

RunningLeon commented 3 weeks ago

@mirek190 hi, something is wrong with the conversion. Could you try again, using this PR https://github.com/ggerganov/llama.cpp/pull/8321 to convert to GGUF? If you are using llama-cli, please refer to this example: https://huggingface.co/internlm/internlm2_5-7b-chat-gguf#inference

foldl commented 2 weeks ago

I have tested the apple problem with chatllm.cpp (Q8_0, greedy sampling).

InternLM 2.5 1M

Let's break down the problem step by step:

  1. You start with 10 apples.
  2. You find 3 gold coins, but they don't directly affect the number of apples you have.
  3. You lose 4 apples, so you now have 10 - 4 = 6 apples.
  4. You gain 1 more apple from finding a gold coin, so you now have 6 + 1 = 7 apples.
  5. Three birds drop 6 apples each, so you gain 3 * 6 = 18 apples. Now you have 7 + 18 = 25 apples.
  6. You win 6 gold coins from the online game, but you have to share them equally with your 2 teammates, so you keep 6 / 3 = 2 gold coins.
  7. You buy apples with all the coins you have. Since the price of an apple is 0.5 coins, you can buy 2 / 0.5 = 4 apples with your gold coins.
  8. Adding the apples you already had, you now have 25 + 4 = 29 apples.

As for the location of the river, the problem doesn't provide enough information to determine its exact location. It only mentions that the river runs near a big city that has something to do with what you can spend the coins on.

InternLM 2.5 (non-1M)

Let's break down the events and calculate the number of apples and coins you have at each step:

  1. Initial Apples and Coins:

    • You start with 10 apples.
    • You find 3 gold coins in the river.
  2. Loss and Gain:

    • You lose 4 apples, so you have 10 - 4 = 6 apples.
    • You gain 1 gold coin, so you have 3 + 1 = 4 gold coins.
  3. Apples from Birds:

    • Three birds drop 6 apples each, so you gain 3 * 6 = 18 apples.
    • Adding these to your current apples, you have 6 + 18 = 24 apples.
  4. Gold Coins from Online Game:

    • You win 6 gold coins in the online game.
    • Adding these to your current coins, you have 4 + 6 = 10 gold coins.
  5. Sharing Gold Coins:

    • You share the 10 gold coins equally with your 2 teammates, so each of you gets 10 / 3 = 3.33 coins (rounded to 3 coins each, as you can't have a fraction of a coin).
    • This means you keep 10 - (3 * 2) = 4 gold coins for yourself.
  6. Buying Apples:

    • The price of an apple is 0.5 coins.
    • With your 4 gold coins, you can buy 4 / 0.5 = 8 apples.
  7. Final Count:

    • Adding the apples you bought to your current apples, you have 24 + 8 = 32 apples.

As for the location of the river, it is near a big city, but the specific city is not mentioned in the scenario.

foldl commented 2 weeks ago

I have also tried Llama-3 8B, Gemma-2 9B, Qwen2 7B, Phi-3 medium 128k, and the Phi-3 mini June update. Both Llama-3 8B and Gemma-2 9B say there are 36 apples. As far as I can tell, this is correct. Qwen2 7B's and Phi-3's results are wrong.

So, I don't think InternLM 2.5 can answer this apple problem correctly. But the text generated by llama.cpp looks worse than what I get from chatllm.cpp. There might be something wrong with the RoPE scaling.
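
For the record, here is a step-by-step check of the apple riddle (a sketch following the problem statement; it arrives at the same 36 that Llama-3 8B and Gemma-2 9B report):

```python
apples = 10
coins = 0

coins += 3                    # find 3 gold coins in the river
apples -= 4                   # lose 4 apples
coins += 1                    # gain a gold coin
apples += 3 * 6               # three birds drop 6 apples each
coins += 6 // 3               # win 6 coins, split equally with 2 teammates -> keep 2
apples += int(coins / 0.5)    # spend all 6 coins on apples at 0.5 coins apiece
coins = 0

print(apples)                 # 36 (and the river's location is never specified)
```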

foldl commented 1 week ago

Clarification about Phi-3-medium: the 4k variant can solve it, while the 128k one can't (greedy sampling).