elixir-nx / bumblebee

Pre-trained Neural Network models in Axon (+ 🤗 Models integration)
Apache License 2.0

Llama 2 derivative model errors expecting top_k to be provided #349

Closed: brainlid closed this issue 4 months ago

brainlid commented 4 months ago

Using the same code from this issue: https://github.com/elixir-nx/bumblebee/issues/348#issue-2151921198

And using Bumblebee 0.5.2, I get the following error:

2024-02-24T15:59:05Z app[4d89d19eb99587] ord [info]** (RuntimeError) conversion failed, expected "top_k" to be a number, got: nil
2024-02-24T15:59:05Z app[4d89d19eb99587] ord [info]    (bumblebee 0.5.2) lib/bumblebee/shared/converters.ex:20: anonymous fn/3 in Bumblebee.Shared.Converters.convert!/2
2024-02-24T15:59:05Z app[4d89d19eb99587] ord [info]    (elixir 1.15.7) lib/enum.ex:2510: Enum."-reduce/3-lists^foldl/2-0-"/3
2024-02-24T15:59:05Z app[4d89d19eb99587] ord [info]    (bumblebee 0.5.2) lib/bumblebee/shared/converters.ex:14: Bumblebee.Shared.Converters.convert!/2
2024-02-24T15:59:05Z app[4d89d19eb99587] ord [info]    (bumblebee 0.5.2) lib/bumblebee/text/generation_config.ex:307: Bumblebee.HuggingFace.Transformers.Config.Bumblebee.Text.GenerationConfig.load/2
2024-02-24T15:59:05Z app[4d89d19eb99587] ord [info]    (bumblebee 0.5.2) lib/bumblebee.ex:1039: Bumblebee.load_generation_config/2
2024-02-24T15:59:05Z app[4d89d19eb99587] ord [info]    (harness 0.1.0) lib/harness/llama_2_chat_with_functions.ex:15: Harness.Llama2ChatFunctions.serving/0
2024-02-24T15:59:05Z app[4d89d19eb99587] ord [info]    (harness 0.1.0) lib/harness/delayed_serving.ex:42: anonymous fn/2 in Harness.DelayedServing.init/1

The top_p setting in the code was taken from the Llama notebook.

If top_k is also expected, I just need to know which value to provide.

jonatanklosko commented 4 months ago

Fixed in 3bca4c5f4d307bb289de170dac489b1cb572741e. This time I double-checked that the whole serving runs on the GPU :)

jonatanklosko commented 4 months ago

Released in v0.5.3.
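
For anyone hitting the same error on 0.5.2, picking up the fix should only require raising the Bumblebee requirement to the release mentioned above. A minimal sketch of the relevant `mix.exs`, assuming the package is pulled from Hex; the app name and version come from the stack trace, while the module name and the exact constraint are assumptions:

```elixir
# mix.exs (sketch, not the reporter's actual file)
defmodule Harness.MixProject do
  use Mix.Project

  def project do
    [
      app: :harness,
      version: "0.1.0",
      deps: deps()
    ]
  end

  defp deps do
    [
      # "~> 0.5.3" accepts 0.5.3 and later 0.5.x releases, which excludes
      # 0.5.2, the version that raised on a nil "top_k"
      {:bumblebee, "~> 0.5.3"}
    ]
  end
end
```

After changing the requirement, `mix deps.update bumblebee` refreshes the locked version.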