elixir-nx / bumblebee

Pre-trained Neural Network models in Axon (+ 🤗 Models integration)
Apache License 2.0

Unknown error serving Llama 2 derivative model #348

Closed · brainlid closed 7 months ago

brainlid commented 7 months ago

Using the latest versions (bumblebee 0.5.1 and nx 0.7.0, per the stack trace below).

Setting up a serving for the following Llama 2 model fails. I acknowledge my code could be wrong.

The following code:

    # https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-v3
    llama_2 = {:hf, "Trelis/Llama-2-7b-chat-hf-function-calling-v3"}

    {:ok, model_info} = Bumblebee.load_model(llama_2, type: :bf16, backend: EXLA.Backend)
    {:ok, tokenizer} = Bumblebee.load_tokenizer(llama_2)
    {:ok, generation_config} = Bumblebee.load_generation_config(llama_2)

    generation_config =
      Bumblebee.configure(generation_config,
        max_new_tokens: 1024,
        strategy: %{type: :multinomial_sampling, top_p: 0.6}
      )

    Bumblebee.Text.generation(model_info, tokenizer, generation_config,
      compile: [batch_size: 1, sequence_length: 4096],
      stream: true,
      # stream: false,
      defn_options: [compiler: EXLA, lazy_transfers: :never]
      # preallocate_params: true
    )

Results in the following error:

** (FunctionClauseError) no function clause matching in Unpickler.load_op/3
    (unpickler 0.1.0) lib/unpickler.ex:240: Unpickler.load_op(33, <<0, 0, 0, 0, 0, 0, 123, 34, 95, 95, 109, 101, 116, 97, 100, 97, 116, 97, 95, 95, 34, 58, 123, 34, 102, 111, 114, 109, 97, 116, 34, 58, 34, 112, 116, 34, 125, 44, 34, 108, 109, 95, 104, 101, 97, 100, 46, 119, 101, 105, ...>>, %{stack: [true], object_resolver: nil, persistent_id_resolver: nil, memo: %{}, metastack: [], refs: %{}})
    (bumblebee 0.5.1) lib/bumblebee/conversion/pytorch/loader.ex:192: Bumblebee.Conversion.PyTorch.Loader.load_legacy!/1
    (bumblebee 0.5.1) lib/bumblebee/conversion/pytorch.ex:48: anonymous fn/2 in Bumblebee.Conversion.PyTorch.load_params!/4
    (elixir 1.15.7) lib/enum.ex:1693: Enum."-map/2-lists^map/1-1-"/2
    (bumblebee 0.5.1) lib/bumblebee/conversion/pytorch.ex:47: anonymous fn/4 in Bumblebee.Conversion.PyTorch.load_params!/4
    (nx 0.7.0) lib/nx.ex:4447: Nx.with_default_backend/2
    (bumblebee 0.5.1) lib/bumblebee.ex:607: Bumblebee.load_params/5
    (bumblebee 0.5.1) lib/bumblebee.ex:568: Bumblebee.load_model/2

The error is cryptic enough that I don't know what's wrong. It might be something in the config of this Llama 2 repo. I chose it because it avoids the time-based token hassle of the official Meta Llama 2 repo, and I've successfully used it in the past.
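
One possibly relevant detail: the bytes in the error decode to printable JSON that looks like a safetensors header rather than pickle data. A quick decoding sketch (not a confirmed diagnosis):

    # Bytes from the Unpickler.load_op/3 error above
    data =
      <<0, 0, 0, 0, 0, 0, 123, 34, 95, 95, 109, 101, 116, 97, 100, 97, 116, 97,
        95, 95, 34, 58, 123, 34, 102, 111, 114, 109, 97, 116, 34, 58, 34, 112,
        116, 34, 125, 44, 34, 108, 109, 95, 104, 101, 97, 100, 46, 119, 101, 105>>

    # Skip the remaining zero bytes and print the tail as text
    <<_zeros::binary-size(6), text::binary>> = data
    IO.puts(text)
    #=> {"__metadata__":{"format":"pt"},"lm_head.wei

If that's right, the legacy PyTorch pickle loader may be getting handed a safetensors file.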

Also, similar Elixir code works with Mistral and Zephyr models.
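
For context, here is roughly how the serving above (bound to a variable) gets consumed once the model loads; the prompt is just a placeholder in the Llama 2 [INST] chat format:

    serving =
      Bumblebee.Text.generation(model_info, tokenizer, generation_config,
        compile: [batch_size: 1, sequence_length: 4096],
        stream: true,
        defn_options: [compiler: EXLA, lazy_transfers: :never]
      )

    # With stream: true, run/2 returns a stream of generated text chunks
    serving
    |> Nx.Serving.run("[INST] Hello, who are you? [/INST]")
    |> Enum.each(&IO.write/1)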

zblanco commented 7 months ago

Ran into a similar issue loading a fine-tuned Mistral model:

    {:ok, model_info} =
      Bumblebee.load_model({:hf, "NousResearch/Nous-Hermes-2-Mistral-7B-DPO"},
        backend: {EXLA.Backend, client: :host}
      )

Error:

** (FunctionClauseError) no function clause matching in Unpickler.load_op/3    

    The following arguments were given to Unpickler.load_op/3:

        # 1
        240

        # 2
        <<39, 0, 0, 0, 0, 0, 0, 123, 34, 95, 95, 109, 101, 116, 97, 100, 97, 116, 97, 95, 95, 34, 58, 123,
          34, 102, 111, 114, 109, 97, 116, 34, 58, 34, 112, 116, 34, 125, 44, 34, 108, 109, 95, 104, 101,
          97, 100, 46, 119, 101, ...>>

        # 3
        %{stack: [], persistent_id_resolver: nil, object_resolver: nil, refs: %{}, memo: %{}, metastack: []}

        ...

    (unpickler 0.1.0) lib/unpickler.ex:240: Unpickler.load_op/3
    (bumblebee 0.5.1) lib/bumblebee/conversion/pytorch/loader.ex:192: Bumblebee.Conversion.PyTorch.Loader.load_legacy!/1
    (bumblebee 0.5.1) lib/bumblebee/conversion/pytorch.ex:48: anonymous fn/2 in Bumblebee.Conversion.PyTorch.load_params!/4
    (elixir 1.15.6) lib/enum.ex:1693: Enum."-map/2-lists^map/1-1-"/2
    (bumblebee 0.5.1) lib/bumblebee/conversion/pytorch.ex:47: anonymous fn/4 in Bumblebee.Conversion.PyTorch.load_params!/4
    (nx 0.7.0) lib/nx.ex:4447: Nx.with_default_backend/2
    (bumblebee 0.5.1) lib/bumblebee.ex:607: Bumblebee.load_params/5

However, I managed to load a prior version of this model: teknium/OpenHermes-2-Mistral-7B.
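
For comparison, with the same options and only the repo swapped, this loads fine:

    # Same call shape as above; this older checkpoint loads without error
    {:ok, model_info} =
      Bumblebee.load_model({:hf, "teknium/OpenHermes-2-Mistral-7B"},
        backend: {EXLA.Backend, client: :host}
      )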

jonatanklosko commented 7 months ago

Fixed in 73715feb36e6a059611dbf098fe18ea039a39966. Released as v0.5.2.
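
Picking up the fix should only require bumping the dependency, something like this in mix.exs:

    defp deps do
      [
        # v0.5.2 includes the fix for this loader error
        {:bumblebee, "~> 0.5.2"},
        # EXLA version to match your Nx setup
        {:exla, "~> 0.7.0"}
      ]
    end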