-
Hi there!
The OpenAI and HF APIs seem to have diverged on the chat_completion `response_format`. That of the [OpenAI](https://github.com/openai/openai-python/blob/37f5615da1f4360710f6f45920dbb81387d1a4c5/sr…
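For anyone comparing, a rough sketch of the two payload shapes as I understand them (field names are taken from recent versions of each client and may change, so treat this as an assumption to verify against the linked sources):

```python
# OpenAI-style `response_format`: a bare {"type": ...} object, e.g. JSON mode.
openai_style = {"type": "json_object"}

# huggingface_hub-style (ChatCompletionInputGrammarType): a type plus a
# "value" field carrying the JSON schema (or regex) to constrain output.
hf_style = {
    "type": "json",
    "value": {
        "type": "object",
        "properties": {"answer": {"type": "string"}},
        "required": ["answer"],
    },
}

def which_shape(fmt: dict) -> str:
    """Tiny helper to tell the two shapes apart by their structure."""
    return "hf" if "value" in fmt else "openai"
```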
-
Hi,
I failed to run Llama-2-7b-chat-hf on the NPU; could you give me a hand?
1. I converted the model with the command below and got two models:
a) optimum-cli export openvino --task text-generation -m Meta-…
-
### Describe the issue
According to the [Local-LLMs](https://microsoft.github.io/autogen/blog/2023/07/14/Local-LLMs/) blog post, AutoGen supports multiple local LLMs.
My command for FastChat:
First,…
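For reference, a minimal sketch of the AutoGen config I would expect to point at a local FastChat server exposing its OpenAI-compatible API (the port, model name, and field names are assumptions; older AutoGen versions used `api_base` instead of `base_url`):

```python
# Hypothetical config pointing AutoGen at a local FastChat
# openai_api_server (default port 8000, routes under /v1).
config_list = [
    {
        "model": "chatglm2-6b",              # whatever model FastChat serves
        "base_url": "http://localhost:8000/v1",
        "api_key": "NULL",                   # FastChat does not check the key
    }
]

# Passed to AssistantAgent / UserProxyAgent via llm_config.
llm_config = {"config_list": config_list, "temperature": 0}
```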
-
### Prerequisite
- [X] I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expe…
-
### Describe the bug
I was granted access to Llama-3.2, but when I tried to load the model, I got a 401 error.
```
OSError: You are trying to access a gated repo.
Make sure to have access to it at …
-
Could you specify the exact model used for the EgoSchema eval? The paper states that the LLM backbone used for the EgoSchema eval is LLaMA-2, but the README states that Vicuna weights were used. If LLaMA-2 was in…
-
1. Download weights from [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf).
2. Combine weights into a single safetensors file.
3. Convert safetensors to llama2-7b…
-
I would like to use this project with some customization.
I'm wondering whether HF plans to release the Swift code too (like [swift-chat](https://github.com/huggingface/swift-chat)).
-
I used this command to quantize the llama2-7b-chat model, but the model size doesn't change.
CUDA_VISIBLE_DEVICES=0 python3 main.py \
--model /mnt/home/model/llama2-7b-chat-hf \
--epochs 20 --o…
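If it helps, the usual explanation is that many quantization scripts only *simulate* low-bit weights (fake quantization) and still save the checkpoint in fp16, so the file size is unchanged; the size only drops if the weights are actually packed into low-bit storage. A back-of-envelope sketch, assuming ~7B parameters:

```python
# Rough checkpoint-size arithmetic for a 7B-parameter model.
params = 7_000_000_000

fp16_bytes = params * 2          # fp16 storage: ~14 GB
int4_packed_bytes = params // 2  # truly packed 4-bit: ~3.5 GB

# A fake-quantized checkpoint still stores fp16 values (merely rounded
# to 4-bit levels), so its on-disk size stays ~14 GB.
print(fp16_bytes / 1e9, "GB vs", int4_packed_bytes / 1e9, "GB")
```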
-
I'm trying to run a 70B model on my Jetson AGX Orin (64 GB), but the run is interrupted as soon as I swap the 8B model for the 70B one. How can I get the 70B model to run?
When I run the command below, s…
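A likely cause is memory: at fp16, the 70B weights alone exceed the Orin's unified memory, so some form of quantization is required. A rough estimate, assuming the 64 GB AGX Orin (weights only; the KV cache and activations need additional memory on top):

```python
# Weights-only memory estimate for a 70B-parameter model.
params = 70_000_000_000
mem_gb = 64  # assumed: 64 GB AGX Orin unified memory

for name, bytes_per_param in [("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    need_gb = params * bytes_per_param / 1e9
    print(f"{name}: ~{need_gb:.0f} GB, fits in {mem_gb} GB: {need_gb < mem_gb}")
```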